summaryrefslogtreecommitdiffstats
path: root/drivers/perf
AgeCommit message (Collapse)AuthorFilesLines
2023-01-26Partially revert "perf/arm-cmn: Optimise DTC counter accesses"Robin Murphy1-1/+6
It turns out the optimisation implemented by commit 4f2c3872dde5 is totally broken, since all the places that consume hw->dtcs_used for events other than cycle count are still not expecting it to be sparsely populated, and fail to read all the relevant DTC counters correctly if so. If implemented correctly, the optimisation potentially saves up to 3 register reads per event update, which is reasonably significant for events targeting a single node, but still not worth a massive amount of additional code complexity overall. Getting it right within the current design looks a fair bit more involved than it was ever intended to be, so let's just make a functional revert which restores the old behaviour while still backporting easily. Fixes: 4f2c3872dde5 ("perf/arm-cmn: Optimise DTC counter accesses") Reported-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/b41bb4ed7283c3d8400ce5cf5e6ec94915e6750f.1674498637.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-12-14Merge tag 'riscv-for-linus-6.2-mw1' of ↵Linus Torvalds1-10/+24
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the T-Head PMU via the perf subsystem - ftrace support for rv32 - Support for non-volatile memory devices - Various fixes and cleanups * tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits) Documentation: RISC-V: patch-acceptance: s/implementor/implementer Documentation: RISC-V: Mention the UEFI Standards Documentation: RISC-V: Allow patches for non-standard behavior Documentation: RISC-V: Fix a typo in patch-acceptance riscv: Fixup compile error with !MMU riscv: Fix P4D_SHIFT definition for 3-level page table mode riscv: Apply a static assert to riscv_isa_ext_id RISC-V: Add some comments about the shadow and overflow stacks RISC-V: Align the shadow stack RISC-V: Ensure Zicbom has a valid block size RISC-V: Introduce riscv_isa_extension_check RISC-V: Improve use of isa2hwcap[] riscv: Don't duplicate _ALTERNATIVE_CFG* macros riscv: alternatives: Drop the underscores from the assembly macro names riscv: alternatives: Don't name unused macro parameters riscv: Don't duplicate __ALTERNATIVE_CFG in __ALTERNATIVE_CFG_2 riscv: mm: call best_map_size many times during linear-mapping riscv: Move cast inside kernel_mapping_[pv]a_to_[vp]a riscv: Fix crash during early errata patching riscv: boot: add zstd support ...
2022-12-12Merge tag 'perf-core-2022-12-12' of ↵Linus Torvalds1-9/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: - Thoroughly rewrite the data structures that implement perf task context handling, with the goal of fixing various quirks and unfeatures both in already merged, and in upcoming proposed code. The old data structure is the per task and per cpu perf_event_contexts: task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context ^ | ^ | ^ `---------------------------------' | `--> pmu ---' v ^ perf_event ------' In this new design this is replaced with a single task context and a single CPU context, plus intermediate data-structures: task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context ^ | ^ ^ `---------------------------' | | | | perf_cpu_pmu_context <--. | `----. ^ | | | | | | v v | | ,--> perf_event_pmu_context | | | | | | | v v | perf_event ---> pmu ----------------' [ See commit bd2756811766 for more details. ] This rewrite was developed by Peter Zijlstra and Ravi Bangoria. - Optimize perf_tp_event() - Update the Intel uncore PMU driver, extending it with UPI topology discovery on various hardware models. - Misc fixes & cleanups * tag 'perf-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) perf/x86/intel/uncore: Fix reference count leak in __uncore_imc_init_box() perf/x86/intel/uncore: Fix reference count leak in snr_uncore_mmio_map() perf/x86/intel/uncore: Fix reference count leak in hswep_has_limit_sbox() perf/x86/intel/uncore: Fix reference count leak in sad_cfg_iio_topology() perf/x86/intel/uncore: Make set_mapping() procedure void perf/x86/intel/uncore: Update sysfs-devices-mapping file perf/x86/intel/uncore: Enable UPI topology discovery for Sapphire Rapids perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server perf/x86/intel/uncore: Get UPI NodeID and GroupID perf/x86/intel/uncore: Enable UPI topology discovery for Skylake Server perf/x86/intel/uncore: Generalize get_topology() for SKX PMUs perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D perf/x86/intel/uncore: Clear attr_update properly perf/x86/intel/uncore: Introduce UPI topology type perf/x86/intel/uncore: Generalize IIO topology support perf/core: Don't allow grouping events from different hw pmus perf/amd/ibs: Make IBS a core pmu perf: Fix function pointer case perf/x86/amd: Remove the repeated declaration perf: Fix possible memleak in pmu_dev_alloc() ...
2022-12-12Merge tag 'irq-core-2022-12-10' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Updates for the interrupt core and driver subsystem: The bulk is the rework of the MSI subsystem to support per device MSI interrupt domains. This solves conceptual problems of the current PCI/MSI design which are in the way of providing support for PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device. IMS (Interrupt Message Store] is a new specification which allows device manufactures to provide implementation defined storage for MSI messages (as opposed to PCI/MSI and PCI/MSI-X that has a specified message store which is uniform accross all devices). The PCI/MSI[-X] uniformity allowed us to get away with "global" PCI/MSI domains. IMS not only allows to overcome the size limitations of the MSI-X table, but also gives the device manufacturer the freedom to store the message in arbitrary places, even in host memory which is shared with the device. There have been several attempts to glue this into the current MSI code, but after lengthy discussions it turned out that there is a fundamental design problem in the current PCI/MSI-X implementation. This needs some historical background. When PCI/MSI[-X] support was added around 2003, interrupt management was completely different from what we have today in the actively developed architectures. Interrupt management was completely architecture specific and while there were attempts to create common infrastructure the commonalities were rudimentary and just providing shared data structures and interfaces so that drivers could be written in an architecture agnostic way. The initial PCI/MSI[-X] support obviously plugged into this model which resulted in some basic shared infrastructure in the PCI core code for setting up MSI descriptors, which are a pure software construct for holding data relevant for a particular MSI interrupt, but the actual association to Linux interrupts was completely architecture specific. This model is still supported today to keep museum architectures and notorious stragglers alive. In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel, which was creating yet another architecture specific mechanism and resulted in an unholy mess on top of the existing horrors of x86 interrupt handling. The x86 interrupt management code was already an incomprehensible maze of indirections between the CPU vector management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X] implementation. At roughly the same time ARM struggled with the ever growing SoC specific extensions which were glued on top of the architected GIC interrupt controller. This resulted in a fundamental redesign of interrupt management and provided the today prevailing concept of hierarchical interrupt domains. This allowed to disentangle the interactions between x86 vector domain and interrupt remapping and also allowed ARM to handle the zoo of SoC specific interrupt components in a sane way. The concept of hierarchical interrupt domains aims to encapsulate the functionality of particular IP blocks which are involved in interrupt delivery so that they become extensible and pluggable. The X86 encapsulation looks like this: |--- device 1 [Vector]---[Remapping]---[PCI/MSI]--|... |--- device N where the remapping domain is an optional component and in case that it is not available the PCI/MSI[-X] domains have the vector domain as their parent. This reduced the required interaction between the domains pretty much to the initialization phase where it is obviously required to establish the proper parent relation ship in the components of the hierarchy. While in most cases the model is strictly representing the chain of IP blocks and abstracting them so they can be plugged together to form a hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the hardware it's clear that the actual PCI/MSI[-X] interrupt controller is not a global entity, but strict a per PCI device entity. Here we took a short cut on the hierarchical model and went for the easy solution of providing "global" PCI/MSI domains which was possible because the PCI/MSI[-X] handling is uniform across the devices. This also allowed to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in turn made it simple to keep the existing architecture specific management alive. A similar problem was created in the ARM world with support for IP block specific message storage. Instead of going all the way to stack a IP block specific domain on top of the generic MSI domain this ended in a construct which provides a "global" platform MSI domain which allows overriding the irq_write_msi_msg() callback per allocation. In course of the lengthy discussions we identified other abuse of the MSI infrastructure in wireless drivers, NTB etc. where support for implementation specific message storage was just mindlessly glued into the existing infrastructure. Some of this just works by chance on particular platforms but will fail in hard to diagnose ways when the driver is used on platforms where the underlying MSI interrupt management code does not expect the creative abuse. Another shortcoming of today's PCI/MSI-X support is the inability to allocate or free individual vectors after the initial enablement of MSI-X. This results in an works by chance implementation of VFIO (PCI pass-through) where interrupts on the host side are not set up upfront to avoid resource exhaustion. They are expanded at run-time when the guest actually tries to use them. The way how this is implemented is that the host disables MSI-X and then re-enables it with a larger number of vectors again. That works by chance because most device drivers set up all interrupts before the device actually will utilize them. But that's not universally true because some drivers allocate a large enough number of vectors but do not utilize them until it's actually required, e.g. for acceleration support. But at that point other interrupts of the device might be in active use and the MSI-X disable/enable dance can just result in losing interrupts and therefore hard to diagnose subtle problems. Last but not least the "global" PCI/MSI-X domain approach prevents to utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS is not longer providing a uniform storage and configuration model. The solution to this is to implement the missing step and switch from global PCI/MSI domains to per device PCI/MSI domains. The resulting hierarchy then looks like this: |--- [PCI/MSI] device 1 [Vector]---[Remapping]---|... |--- [PCI/MSI] device N which in turn allows to provide support for multiple domains per device: |--- [PCI/MSI] device 1 |--- [PCI/IMS] device 1 [Vector]---[Remapping]---|... |--- [PCI/MSI] device N |--- [PCI/IMS] device N This work converts the MSI and PCI/MSI core and the x86 interrupt domains to the new model, provides new interfaces for post-enable allocation/free of MSI-X interrupts and the base framework for PCI/IMS. PCI/IMS has been verified with the work in progress IDXD driver. There is work in progress to convert ARM over which will replace the platform MSI train-wreck. The cleanup of VFIO, NTB and other creative "solutions" are in the works as well. Drivers: - Updates for the LoongArch interrupt chip drivers - Support for MTK CIRQv2 - The usual small fixes and updates all over the place" * tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits) irqchip/ti-sci-inta: Fix kernel doc irqchip/gic-v2m: Mark a few functions __init irqchip/gic-v2m: Include arm-gic-common.h irqchip/irq-mvebu-icu: Fix works by chance pointer assignment iommu/amd: Enable PCI/IMS iommu/vt-d: Enable PCI/IMS x86/apic/msi: Enable PCI/IMS PCI/MSI: Provide pci_ims_alloc/free_irq() PCI/MSI: Provide IMS (Interrupt Message Store) support genirq/msi: Provide constants for PCI/IMS support x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X PCI/MSI: Provide prepare_desc() MSI domain op PCI/MSI: Split MSI-X descriptor setup genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN genirq/msi: Provide msi_domain_alloc_irq_at() genirq/msi: Provide msi_domain_ops:: Prepare_desc() genirq/msi: Provide msi_desc:: Msi_data genirq/msi: Provide struct msi_map x86/apic/msi: Remove arch_create_remap_msi_irq_domain() ...
2022-12-06Merge branch 'for-next/perf' into for-next/coreWill Deacon18-12/+2907
* for-next/perf: (21 commits) arm_pmu: Drop redundant armpmu->map_event() in armpmu_event_init() drivers/perf: hisi: Add TLP filter support Documentation: perf: Indent filter options list of hisi-pcie-pmu docs: perf: Fix PMU instance name of hisi-pcie-pmu drivers/perf: hisi: Fix some event id for hisi-pcie-pmu arm64/perf: Replace PMU version number '0' with ID_AA64DFR0_EL1_PMUVer_NI perf/amlogic: Remove unused header inclusions of <linux/version.h> perf/amlogic: Fix build error for x86_64 allmodconfig dt-binding: perf: Add Amlogic DDR PMU docs/perf: Add documentation for the Amlogic G12 DDR PMU perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver MAINTAINERS: Update HiSilicon PMU maintainers perf: arm_cspmu: Fix module cyclic dependency perf: arm_cspmu: Fix build failure on x86_64 perf: arm_cspmu: Fix modular builds due to missing MODULE_LICENSE()s perf: arm_cspmu: Add support for NVIDIA SCF and MCF attribute perf: arm_cspmu: Add support for ARM CoreSight PMU driver perf/smmuv3: Fix hotplug callback leak in arm_smmu_pmu_init() perf/arm_dmc620: Fix hotplug callback leak in dmc620_pmu_init() drivers: perf: marvell_cn10k: Fix hotplug callback leak in tad_pmu_init() ...
2022-12-02arm_pmu: Drop redundant armpmu->map_event() in armpmu_event_init()Anshuman Khandual1-3/+0
__hw_perf_event_init() already calls armpmu->map_event() callback, and also returns its error code including -ENOENT, along with a debug callout. Hence an additional armpmu->map_event() check for -ENOENT is redundant. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20221202015611.338499-1-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29drivers/perf: hisi: Add TLP filter supportYicong Yang1-1/+13
The PMU support to filter the TLP when counting the bandwidth with below options: - only count the TLP headers - only count the TLP payloads - count both TLP headers and payloads In the current driver it's default to count the TLP payloads only, which will have an implicity side effects that on the traffic only have header only TLPs, we'll get no data. Make this user configuration through "len_mode" parameter and make it default to count both TLP headers and payloads when user not specified. Also update the documentation for it. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20221117084136.53572-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29drivers/perf: hisi: Fix some event id for hisi-pcie-pmuYicong Yang1-4/+4
Some event id of hisi-pcie-pmu is incorrect, fix them. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20221117084136.53572-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29perf/amlogic: Remove unused header inclusions of <linux/version.h>Jiapeng Chong2-2/+0
According to the "Abaci Robot": | ./drivers/perf/amlogic/meson_g12_ddr_pmu.c:15 linux/version.h not needed. | ./drivers/perf/amlogic/meson_ddr_pmu_core.c: 19 linux/version.h not needed. So drop the unnecessary #include directives. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3280 Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3282 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20221129032108.119661-1-jiapeng.chong@linux.alibaba.com Link: https://lore.kernel.org/r/20221129032108.119661-2-jiapeng.chong@linux.alibaba.com [will: Squashed patches together, filled out commit message a bit more] Signed-off-by: Will Deacon <will@kernel.org>
2022-11-22perf/amlogic: Fix build error for x86_64 allmodconfigJiucheng Xu1-0/+1
The driver misses including <linux/io.h>, which causes a compilation error with x86_64 'allmodconfig': drivers/perf/amlogic/meson_g12_ddr_pmu.c: In function 'dmc_g12_get_freq_quick': drivers/perf/amlogic/meson_g12_ddr_pmu.c:135:15: error: implicit declaration of function 'readl' [-Werror=implicit-function-declaration] 135 | val = readl(info->pll_reg); | ^~~~~ drivers/perf/amlogic/meson_g12_ddr_pmu.c: In function 'dmc_g12_counter_enable': drivers/perf/amlogic/meson_g12_ddr_pmu.c:204:9: error: implicit declaration of function 'writel' [-Werror=implicit-function-declaration] 204 | writel(clock_count, info->ddr_reg[0] + DMC_MON_G12_TIMER); | ^~~~~~ Add the missing header to fix the build. Fixes: 2016e2113d35 ("perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jiucheng Xu <jiucheng.xu@amlogic.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20221122084028.572494-1-jiucheng.xu@amlogic.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-21perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driverJiucheng Xu6-0/+974
Add support for Amlogic Meson G12 Series SOC - DDR bandwidth PMU driver framework and interfaces. The PMU can not only monitor the total DDR bandwidth, but also individual IP module bandwidth. Signed-off-by: Jiucheng Xu <jiucheng.xu@amlogic.com> Tested-by: Chris Healy <healych@amazon.com> Link: https://lore.kernel.org/r/20221121021602.3306998-1-jiucheng.xu@amlogic.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18perf: arm_cspmu: Fix module cyclic dependencyBesar Wicaksono1-3/+2
Build on arm64 allmodconfig failed with: | depmod: ERROR: Cycle detected: arm_cspmu -> nvidia_cspmu -> arm_cspmu | depmod: ERROR: Found 2 modules in dependency cycles! The arm_cspmu.c provides standard functions to operate the PMU and the vendor code provides vendor specific attributes. Both need to be built as single kernel module. Update the makefile to compile sources under arm_cspmu into one module. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-and-Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20221116203952.34168-1-bwicaksono@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18perf: arm_cspmu: Fix build failure on x86_64Besar Wicaksono1-1/+1
Building on x86_64 allmodconfig failed: | drivers/perf/arm_cspmu/arm_cspmu.c:1114:29: error: implicit | declaration of function 'get_acpi_id_for_cpu' get_acpi_id_for_cpu is a helper function from ARM64. Fix by adding ARM64 dependency. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20221116190455.55651-1-bwicaksono@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-17genirq: Get rid of GENERIC_MSI_IRQ_DOMAINThomas Gleixner1-1/+1
Adjust to reality and remove another layer of pointless Kconfig indirection. CONFIG_GENERIC_MSI_IRQ is good enough to serve all purposes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20221111122014.524842979@linutronix.de
2022-11-15perf: arm_cspmu: Fix modular builds due to missing MODULE_LICENSE()sWill Deacon2-0/+4
Building an arm64 allmodconfig target results in the following failure from modpost: | ERROR: modpost: missing MODULE_LICENSE() in drivers/perf/arm_cspmu/arm_cspmu.o | ERROR: modpost: missing MODULE_LICENSE() in drivers/perf/arm_cspmu/nvidia_cspmu.o | make[1]: *** [scripts/Makefile.modpost:126: Module.symvers] Error 1 | make: *** [Makefile:1944: modpost] Error 2 Add the missing MODULE_LICENSE() macros, following the license of the source files and symbol exports. Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15perf: arm_cspmu: Add support for NVIDIA SCF and MCF attributeBesar Wicaksono4-1/+426
Add support for NVIDIA System Cache Fabric (SCF) and Memory Control Fabric (MCF) PMU attributes for CoreSight PMU implementation in NVIDIA devices. Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Link: https://lore.kernel.org/r/20221111222330.48602-3-bwicaksono@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15perf: arm_cspmu: Add support for ARM CoreSight PMU driverBesar Wicaksono6-0/+1465
Add support for ARM CoreSight PMU driver framework and interfaces. The driver provides generic implementation to operate uncore PMU based on ARM CoreSight PMU architecture. The driver also provides interface to get vendor/implementation specific information, for example event attributes and formating. The specification used in this implementation can be found below: * ACPI Arm Performance Monitoring Unit table: https://developer.arm.com/documentation/den0117/latest * ARM Coresight PMU architecture: https://developer.arm.com/documentation/ihi0091/latest Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Link: https://lore.kernel.org/r/20221111222330.48602-2-bwicaksono@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15perf/smmuv3: Fix hotplug callback leak in arm_smmu_pmu_init()Shang XiaoJing1-1/+7
arm_smmu_pmu_init() won't remove the callback added by cpuhp_setup_state_multi() when platform_driver_register() failed. Remove the callback by cpuhp_remove_multi_state() in fail path. Similar to the handling of arm_ccn_init() in commit 26242b330093 ("bus: arm-ccn: Prevent hotplug callback leak") Fixes: 7d839b4b9e00 ("perf/smmuv3: Add arm64 smmuv3 pmu driver") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Reviewed-by: Punit Agrawal <punit.agrawal@bytedance.com> Link: https://lore.kernel.org/r/20221115115540.6245-3-shangxiaojing@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15perf/arm_dmc620: Fix hotplug callback leak in dmc620_pmu_init()Shang XiaoJing1-1/+7
dmc620_pmu_init() won't remove the callback added by cpuhp_setup_state_multi() when platform_driver_register() failed. Remove the callback by cpuhp_remove_multi_state() in fail path. Similar to the handling of arm_ccn_init() in commit 26242b330093 ("bus: arm-ccn: Prevent hotplug callback leak") Fixes: 53c218da220c ("driver/perf: Add PMU driver for the ARM DMC-620 memory controller") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Reviewed-by: Punit Agrawal <punit.agrawal@bytedance.com> Link: https://lore.kernel.org/r/20221115115540.6245-2-shangxiaojing@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15drivers: perf: marvell_cn10k: Fix hotplug callback leak in tad_pmu_init()Yuan Can1-1/+5
tad_pmu_init() won't remove the callback added by cpuhp_setup_state_multi() when platform_driver_register() failed. Remove the callback by cpuhp_remove_multi_state() in fail path. Similar to the handling of arm_ccn_init() in commit 26242b330093 ("bus: arm-ccn: Prevent hotplug callback leak") Fixes: 036a7584bede ("drivers: perf: Add LLC-TAD perf counter support") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221115070207.32634-3-yuancan@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15perf: arm_dsu: Fix hotplug callback leak in dsu_pmu_init()Yuan Can1-1/+5
dsu_pmu_init() won't remove the callback added by cpuhp_setup_state_multi() when platform_driver_register() failed. Remove the callback by cpuhp_remove_multi_state() in fail path. Similar to the handling of arm_ccn_init() in commit 26242b330093 ("bus: arm-ccn: Prevent hotplug callback leak") Fixes: 7520fa99246d ("perf: ARM DynamIQ Shared Unit PMU support") Signed-off-by: Yuan Can <yuancan@huawei.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20221115070207.32634-2-yuancan@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-08arm_pmu: acpi: handle allocation failureMark Rutland1-0/+1
One of the failure paths in the arm_pmu ACPI code is missing an early return, permitting a NULL pointer dereference upon a memory allocation failure. Add the missing return. Fixes: fe40ffdb7656 ("arm_pmu: rework ACPI probing") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221108093725.1239563-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-07arm_pmu: rework ACPI probingMark Rutland2-60/+52
The current ACPI PMU probing logic tries to associate PMUs with CPUs when the CPU is first brought online, in order to handle late hotplug, though PMUs are only registered during early boot, and so for late hotplugged CPUs this can only associate the CPU with an existing PMU. We tried to be clever and the have the arm_pmu_acpi_cpu_starting() callback allocate a struct arm_pmu when no matching instance is found, in order to avoid duplication of logic. However, as above this doesn't do anything useful for late hotplugged CPUs, and this requires us to allocate memory in an atomic context, which is especially problematic for PREEMPT_RT, as reported by Valentin and Pierre. This patch reworks the probing to detect PMUs for all online CPUs in the arm_pmu_acpi_probe() function, which is more aligned with how DT probing works. The arm_pmu_acpi_cpu_starting() callback only tries to associate CPUs with an existing arm_pmu instance, avoiding the problem of allocating in atomic context. Note that as we didn't previously register PMUs for late-hotplugged CPUs, this change doesn't result in a loss of existing functionality, though we will now warn when we cannot associate a CPU with a PMU. This change allows us to pull the hotplug callback registration into the arm_pmu_acpi_probe() function, as we no longer need the callbacks to be invoked shortly after probing the boot CPUs, and can register it without invoking the calls. For the moment the arm_pmu_acpi_init() initcall remains to register the SPE PMU, though in future this should probably be moved elsewhere (e.g. the arm64 ACPI init code), since this doesn't need to be tied to the regular CPU PMU code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lore.kernel.org/r/20210810134127.1394269-2-valentin.schneider@arm.com/ Reported-by: Pierre Gondois <pierre.gondois@arm.com> Link: https://lore.kernel.org/linux-arm-kernel/20220912155105.1443303-1-pierre.gondois@arm.com/ Cc: Pierre Gondois <pierre.gondois@arm.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Will Deacon <will@kernel.org> Reviewed-and-tested-by: Pierre Gondois <pierre.gondois@arm.com> Link: https://lore.kernel.org/r/20220930111844.1522365-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-07arm_pmu: factor out PMU matchingMark Rutland1-2/+13
A subsequent patch will rework the ACPI probing of PMUs, and we'll need to match a CPU with a known cpuid in two separate paths. Factor out the matching logic into a helper function so that it can be reused. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Pierre Gondois <pierre.gondois@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-and-tested-by: Pierre Gondois <pierre.gondois@arm.com> Link: https://lore.kernel.org/r/20220930111844.1522365-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-07arm_pmu: acpi: factor out PMU<->CPU associationMark Rutland1-12/+17
A subsequent patch will rework the ACPI probing of PMUs, and we'll need to associate a CPU with a PMU in two separate paths. Factor out the association logic into a helper function so that it can be reused. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Pierre Gondois <pierre.gondois@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-and-tested-by: Pierre Gondois <pierre.gondois@arm.com> Link: https://lore.kernel.org/r/20220930111844.1522365-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-10-27drivers/perf: riscv_pmu_sbi: add support for PMU variant on T-Head C9xx coresHeiko Stuebner1-10/+24
With the T-HEAD C9XX cores being designed before or during the ratification to the SSCOFPMF extension, it implements functionality very similar but not equal to it. It implements overflow handling and also some privilege-mode filtering. While SSCOFPMF supports this for all modes, the C9XX only implements the filtering for M-mode and S-mode but not user-mode. So add some adaptions to allow the C9XX to still handle its PMU through the regular SBI PMU interface instead of defining new interfaces or drivers. To work properly, this requires a matching change in SBI, though the actual interface between kernel and SBI does not change. The main differences are a the overflow CSR and irq number. As the reading of the overflow-csr is in the hot-path during irq handling, use an errata and alternatives to not introduce new conditionals there. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/all/20221011231841.2951264-2-heiko@sntech.de/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-27perf: Rewrite core context handlingPeter Zijlstra1-9/+7
There have been various issues and limitations with the way perf uses (task) contexts to track events. Most notable is the single hardware PMU task context, which has resulted in a number of yucky things (both proposed and merged). Notably: - HW breakpoint PMU - ARM big.little PMU / Intel ADL PMU - Intel Branch Monitoring PMU - AMD IBS PMU - S390 cpum_cf PMU - PowerPC trace_imc PMU *Current design:* Currently we have a per task and per cpu perf_event_contexts: task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context ^ | ^ | ^ `---------------------------------' | `--> pmu ---' v ^ perf_event ------' Each task has an array of pointers to a perf_event_context. Each perf_event_context has a direct relation to a PMU and a group of events for that PMU. The task related perf_event_context's have a pointer back to that task. Each PMU has a per-cpu pointer to a per-cpu perf_cpu_context, which includes a perf_event_context, which again has a direct relation to that PMU, and a group of events for that PMU. The perf_cpu_context also tracks which task context is currently associated with that CPU and includes a few other things like the hrtimer for rotation etc. Each perf_event is then associated with its PMU and one perf_event_context. *Proposed design:* New design proposed by this patch reduce to a single task context and a single CPU context but adds some intermediate data-structures: task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context ^ | ^ ^ `---------------------------' | | | | perf_cpu_pmu_context <--. | `----. ^ | | | | | | v v | | ,--> perf_event_pmu_context | | | | | | | v v | perf_event ---> pmu ----------------' With the new design, perf_event_context will hold all events for all pmus in the (respective pinned/flexible) rbtrees. This can be achieved by adding pmu to rbtree key: {cpu, pmu, cgroup, group_index} Each perf_event_context carries a list of perf_event_pmu_context which is used to hold per-pmu-per-context state. For example, it keeps track of currently active events for that pmu, a pmu specific task_ctx_data, a flag to tell whether rotation is required or not etc. Additionally, perf_cpu_pmu_context is used to hold per-pmu-per-cpu state like hrtimer details to drive the event rotation, a pointer to perf_event_pmu_context of currently running task and some other ancillary information. Each perf_event is associated to it's pmu, perf_event_context and perf_event_pmu_context. Further optimizations to current implementation are possible. For example, ctx_resched() can be optimized to reschedule only single pmu events. Much thanks to Ravi for picking this up and pushing it towards completion. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221008062424.313-1-ravi.bangoria@amd.com
2022-10-14Merge tag 'arm64-fixes' of ↵Linus Torvalds2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Cortex-A55 errata workaround (repeat TLBI) - AMPERE1 added to the Spectre-BHB affected list - MTE fix to avoid setting PG_mte_tagged if no tags have been touched on a page - Fixed typo in the SCTLR_EL1.SPINTMASK bit naming (the commit log has other typos) - perf: return value check in ali_drw_pmu_probe(), ALIBABA_UNCORE_DRW_PMU dependency on ACPI * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Add AMPERE1 to the Spectre-BHB affected list arm64: mte: Avoid setting PG_mte_tagged if no tags cleared or restored MAINTAINERS: rectify file entry in ALIBABA PMU DRIVER drivers/perf: ALIBABA_UNCORE_DRW_PMU should depend on ACPI drivers/perf: fix return value check in ali_drw_pmu_probe() arm64: errata: Add Cortex-A55 to the repeat tlbi list arm64/sysreg: Fix typo in SCTR_EL1.SPINTMASK
2022-10-14Merge tag 'riscv-for-linus-6.1-mw2' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: - DT updates for the PolarFire SOC - a fix to correct the handling of write-only mappings - m{vetndor,arcd,imp}id is now in /proc/cpuinfo - the SiFive L2 cache controller support has been refactored to also support L3 caches - misc fixes, cleanups and improvements throughout the tree * tag 'riscv-for-linus-6.1-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits) MAINTAINERS: add RISC-V's patchwork RISC-V: Make port I/O string accessors actually work riscv: enable software resend of irqs RISC-V: Re-enable counter access from userspace riscv: vdso: fix NULL deference in vdso_join_timens() when vfork riscv: Add cache information in AUX vector soc: sifive: ccache: define the macro for the register shifts soc: sifive: ccache: use pr_fmt() to remove CCACHE: prefixes soc: sifive: ccache: reduce printing on init soc: sifive: ccache: determine the cache level from dts soc: sifive: ccache: Rename SiFive L2 cache to Composable cache. dt-bindings: sifive-ccache: change Sifive L2 cache to Composable cache riscv: check for kernel config option in t-head memory types errata riscv: use BIT() marco for cpufeature probing riscv: use BIT() macros in t-head errata init riscv: drop some idefs from CMO initialization riscv: cleanup svpbmt cpufeature probing riscv: Pass -mno-relax only on lld < 15.0.0 RISC-V: Avoid dereferening NULL regs in die() dt-bindings: riscv: add new riscv,isa strings for emulators ...
2022-10-13RISC-V: Re-enable counter access from userspacePalmer Dabbelt1-2/+5
These counters were part of the ISA when we froze the uABI, removing them breaks userspace. Link: https://lore.kernel.org/all/YxEhC%2FmDW1lFt36J@aurel32.net/ Fixes: e9991434596f ("RISC-V: Add perf platform driver based on SBI PMU extension") Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220928131807.30386-1-palmer@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-10Merge tag 'perf-core-2022-10-07' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: "PMU driver updates: - Add AMD Last Branch Record Extension Version 2 (LbrExtV2) feature support for Zen 4 processors. - Extend the perf ABI to provide branch speculation information, if available, and use this on CPUs that have it (eg. LbrExtV2). - Improve Intel PEBS TSC timestamp handling & integration. - Add Intel Raptor Lake S CPU support. - Add 'perf mem' and 'perf c2c' memory profiling support on AMD CPUs by utilizing IBS tagged load/store samples. - Clean up & optimize various x86 PMU details. HW breakpoints: - Big rework to optimize the code for systems with hundreds of CPUs and thousands of breakpoints: - Replace the nr_bp_mutex global mutex with the bp_cpuinfo_sem per-CPU rwsem that is read-locked during most of the key operations. - Improve the O(#cpus * #tasks) logic in toggle_bp_slot() and fetch_bp_busy_slots(). - Apply micro-optimizations & cleanups. - Misc cleanups & enhancements" * tag 'perf-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits) perf/hw_breakpoint: Annotate tsk->perf_event_mutex vs ctx->mutex perf: Fix pmu_filter_match() perf: Fix lockdep_assert_event_ctx() perf/x86/amd/lbr: Adjust LBR regardless of filtering perf/x86/utils: Fix uninitialized var in get_branch_type() perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file perf/x86/amd: Support PERF_SAMPLE_PHY_ADDR perf/x86/amd: Support PERF_SAMPLE_ADDR perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT} perf/x86/amd: Support PERF_SAMPLE_DATA_SRC perf/x86/amd: Add IBS OP_DATA2 DataSrc bit definitions perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO} perf/x86/uncore: Add new Raptor Lake S support perf/x86/cstate: Add new Raptor Lake S support perf/x86/msr: Add new Raptor Lake S support perf/x86: Add new Raptor Lake S support bpf: Check flags for branch stack in bpf_read_branch_records helper perf, hw_breakpoint: Fix use-after-free if perf_event_open() fails perf: Use sample_flags for raw_data perf: Use sample_flags for addr ...
2022-10-09Merge tag 'riscv-for-linus-6.1-mw1' of ↵Linus Torvalds2-13/+22
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Improvements to the CPU topology subsystem, which fix some issues where RISC-V would report bad topology information. - The default NR_CPUS has increased to XLEN, and the maximum configurable value is 512. - The CD-ROM filesystems have been enabled in the defconfig. - Support for THP_SWAP has been added for rv64 systems. There are also a handful of cleanups and fixes throughout the tree. * tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: enable THP_SWAP for RV64 RISC-V: Print SSTC in canonical order riscv: compat: s/failed/unsupported if compat mode isn't supported RISC-V: Increase range and default value of NR_CPUS cpuidle: riscv-sbi: Fix CPU_PM_CPU_IDLE_ENTER_xyz() macro usage perf: RISC-V: throttle perf events perf: RISC-V: exclude invalid pmu counters from SBI calls riscv: enable CD-ROM file systems in defconfig riscv: topology: fix default topology reporting arm64: topology: move store_cpu_topology() to shared code
2022-10-07drivers/perf: ALIBABA_UNCORE_DRW_PMU should depend on ACPIGeert Uytterhoeven1-1/+1
The Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver relies solely on ACPI for matching. Hence add a dependency on ACPI, to prevent asking the user about this driver when configuring a kernel without ACPI support. Fixes: cf7b61073e45 ("drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/2a4407bb598285660fa5e604e56823ddb12bb0aa.1664285774.git.geert+renesas@glider.be Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-10-07drivers/perf: fix return value check in ali_drw_pmu_probe()Sun Ke1-2/+2
In case of error, devm_ioremap_resource() returns ERR_PTR(), and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: cf7b61073e45 ("drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC") Signed-off-by: Sun Ke <sunke32@huawei.com> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20220924032127.313156-1-sunke32@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-10-06Merge tag 'arm64-upstream' of ↵Linus Torvalds5-4/+822
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE vector granule register added to the user regs together with SVE perf extensions documentation. - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation to match the actual kernel behaviour (zeroing the registers on syscall rather than "zeroed or preserved" previously). - More conversions to automatic system registers generation. - vDSO: use self-synchronising virtual counter access in gettimeofday() if the architecture supports it. - arm64 stacktrace cleanups and improvements. - arm64 atomics improvements: always inline assembly, remove LL/SC trampolines. - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception handling, better EL1 undefs reporting. - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect result. - arm64 defconfig updates: build CoreSight as a module, enable options necessary for docker, memory hotplug/hotremove, enable all PMUs provided by Arm. - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME extensions). - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove unused function. - kselftest updates for arm64: simple HWCAP validation, FP stress test improvements, validation of ZA regs in signal handlers, include larger SVE and SME vector lengths in signal tests, various cleanups. - arm64 alternatives (code patching) improvements to robustness and consistency: replace cpucap static branches with equivalent alternatives, associate callback alternatives with a cpucap. - Miscellaneous updates: optimise kprobe performance of patching single-step slots, simplify uaccess_mask_ptr(), move MTE registers initialisation to C, support huge vmalloc() mappings, run softirqs on the per-CPU IRQ stack, compat (arm32) misalignment fixups for multiword accesses. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits) arm64: alternatives: Use vdso/bits.h instead of linux/bits.h arm64/kprobe: Optimize the performance of patching single-step slot arm64: defconfig: Add Coresight as module kselftest/arm64: Handle EINTR while reading data from children kselftest/arm64: Flag fp-stress as exiting when we begin finishing up kselftest/arm64: Don't repeat termination handler for fp-stress ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs arm64/mm: fold check for KFENCE into can_set_direct_map() arm64: ftrace: fix module PLTs with mcount arm64: module: Remove unused plt_entry_is_initialized() arm64: module: Make plt_equals_entry() static arm64: fix the build with binutils 2.27 kselftest/arm64: Don't enable v8.5 for MTE selftest builds arm64: uaccess: simplify uaccess_mask_ptr() arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header kselftest/arm64: Fix typo in hwcap check arm64: mte: move register initialization to C arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate() arm64: dma: Drop cache invalidation from arch_dma_prep_coherent() arm64/sve: Add Perf extensions documentation ...
2022-10-03Merge tag 'acpi-6.1-rc1' of ↵Linus Torvalds3-8/+9
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "ACPI and PNP updates for 6.1-rc1. These rearrange the ACPI device object initialization code (to get rid of a redundant parent pointer from struct acpi_device among other things), unify the _UID handling, drop support for some _OSI strings that should not be necessary any more, add new IDs to support more hardware and some more quirks, fix a few issues and clean up code all over. Specifics: - Reimplement acpi_get_pci_dev() using the list of physical devices associated with the given ACPI device object (Rafael Wysocki) - Rename ACPI device object reference counting functions (Rafael Wysocki) - Rearrange ACPI device object initialization code (Rafael Wysocki) - Drop parent field from struct acpi_device (Rafael Wysocki) - Extend the the int3472-tps68470 driver to support multiple consumers of a single TPS68470 along with the requisite framework-level support (Daniel Scally) - Filter out non-memory resources in is_memory(), add a helper function to find all memory type resources of an ACPI device object and use that function in 3 places (Heikki Krogerus) - Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS model S5402ZA (Tamim Khan, Kellen Renshaw) - Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus) - Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario Limonciello) - Clean up ACPI platform devices support code (Andy Shevchenko, John Garry) - Clean up ACPI bus management code (Andy Shevchenko, ye xingchen) - Add support for multiple DMA windows with different offsets to the ACPI device enumeration code and use it on LoongArch (Jianmin Lv) - Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko) - Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario Limonciello) - Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT parsing code (Liu Shixin) - Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on invalid physical addresses (Hans de Goede) - Silence missing-declarations warning related to Apple device properties management (Lukas Wunner) - Disable frequency invariance in the CPPC library if registers used by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton) - Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan) - Fix Tx acknowledge in the PCC address space handler (Huisong Li) - Use wait_for_completion_timeout() for PCC mailbox operations (Huisong Li) - Release resources on PCC address space setup failure path (Rafael Mendonca) - Remove unneeded result variables from APEI code (ye xingchen) - Print total number of records found during BERT log parsing (Dmitry Monakhov) - Drop support for 3 _OSI strings that should not be necessary any more and update documentation on custom _OSI strings so that adding new ones is not encouraged any more (Mario Limonciello) - Drop unneeded result variable from ec_write() (ye xingchen) - Remove the leftover struct acpi_ac_bl from the ACPI AC driver (Hanjun Guo) - Reorder symbols to get rid of a few forward declarations in the ACPI fan driver (Uwe Kleine-König) - Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid Norlander) - Add ARM DMA-330 controller to the supported list in the ACPI AMBA driver (Vijayenthiran Subramaniam) - Drop references to non-functional 01.org/linux-acpi web site from MAINTAINERS and Kconfig help texts (Rafael Wysocki) - Replace strlcpy() with unused retval with strscpy() in the ACPI support code (Wolfram Sang) - Do not initialize ret in main() in the pfrut utility (Shi junming) - Drop useless ACPI DSDT override documentation (Rafael Wysocki) - Fix a few typos and wording mistakes in the ACPI device enumeration documentation (Jean Delvare) - Introduce acpi_dev_uid_to_integer() to convert a _UID string into an integer value (Andy Shevchenko) - Use acpi_dev_uid_to_integer() in several places to unify _UID handling (Andy Shevchenko) - Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng Cui)" * tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (79 commits) ACPI: LPSS: Deduplicate skipping device in acpi_lpss_create_device() ACPI: LPSS: Replace loop with first entry retrieval ACPI: x86: s2idle: Add another ID to s2idle_dmi_table ACPI: x86: s2idle: Fix a NULL pointer dereference MAINTAINERS: Drop records pointing to 01.org/linux-acpi ACPI: Kconfig: Drop link to https://01.org/linux-acpi ACPI: docs: Drop useless DSDT override documentation ACPI: DPTF: Drop stale link from Kconfig help ACPI: x86: s2idle: Add a quirk for ASUSTeK COMPUTER INC. ROG Flow X13 ACPI: x86: s2idle: Add a quirk for Lenovo Slim 7 Pro 14ARH7 ACPI: x86: s2idle: Add a quirk for ASUS ROG Zephyrus G14 ACPI: x86: s2idle: Add a quirk for ASUS TUF Gaming A17 FA707RE ACPI: x86: s2idle: Add module parameter to prefer Microsoft GUID ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt ACPI: x86: s2idle: Move _HID handling for AMD systems into structures platform/x86: int3472: Add board data for Surface Go2 IR camera platform/x86: int3472: Support multiple gpio lookups in board data platform/x86: int3472: Support multiple clock consumers ACPI: bus: Add iterator for dependent devices ACPI: scan: Add acpi_dev_get_next_consumer_dev() ...
2022-10-03Merge branch 'acpi-uid'Rafael J. Wysocki1-5/+5
Merge ACPI _UID handling unification changes for 6.1-rc1: - Introduce acpi_dev_uid_to_integer() to convert a _UID string into an integer value (Andy Shevchenko). - Use acpi_dev_uid_to_integer() in several places to unify _UID handling (Andy Shevchenko). * acpi-uid: efi/dev-path-parser: Refactor _UID handling to use acpi_dev_uid_to_integer() spi: pxa2xx: Refactor _UID handling to use acpi_dev_uid_to_integer() perf: qcom_l2_pmu: Refactor _UID handling to use acpi_dev_uid_to_integer() i2c: mlxbf: Refactor _UID handling to use acpi_dev_uid_to_integer() i2c: amd-mp2-plat: Refactor _UID handling to use acpi_dev_uid_to_integer() ACPI: x86: Refactor _UID handling to use acpi_dev_uid_to_integer() ACPI: LPSS: Refactor _UID handling to use acpi_dev_uid_to_integer() ACPI: utils: Add acpi_dev_uid_to_integer() helper to get _UID as integer
2022-09-30Merge branch 'acpi-dev'Rafael J. Wysocki2-3/+4
Merge changes regarding the management of ACPI device objects for 6.1-rc1: - Rename ACPI device object reference counting functions (Rafael Wysocki). - Rearrange ACPI device object initialization code (Rafael Wysocki). - Drop parent field from struct acpi_device (Rafael Wysocki). - Extend the the int3472-tps68470 driver to support multiple consumers of a single TPS68470 along with the requisite framework-level support (Daniel Scally). * acpi-dev: platform/x86: int3472: Add board data for Surface Go2 IR camera platform/x86: int3472: Support multiple gpio lookups in board data platform/x86: int3472: Support multiple clock consumers ACPI: bus: Add iterator for dependent devices ACPI: scan: Add acpi_dev_get_next_consumer_dev() ACPI: property: Use acpi_dev_parent() ACPI: Drop redundant acpi_dev_parent() header ACPI: PM: Fix NULL argument handling in acpi_device_get/set_power() ACPI: Drop parent field from struct acpi_device ACPI: scan: Eliminate __acpi_device_add() ACPI: scan: Rearrange initialization of ACPI device objects ACPI: scan: Rename acpi_bus_get_parent() and rearrange it ACPI: Rename acpi_bus_get/put_acpi_device()
2022-09-30Merge branches 'for-next/doc', 'for-next/sve', 'for-next/sysreg', ↵Catalin Marinas1-3/+3
'for-next/gettimeofday', 'for-next/stacktrace', 'for-next/atomics', 'for-next/el1-exceptions', 'for-next/a510-erratum-2658417', 'for-next/defconfig', 'for-next/tpidr2_el0' and 'for-next/ftrace', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header arm64/sve: Add Perf extensions documentation perf: arm64: Add SVE vector granule register to user regs MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC docs: perf: Add description for Alibaba's T-Head PMU driver * for-next/doc: : Documentation/arm64 updates arm64/sve: Document our actual ABI for clearing registers on syscall * for-next/sve: : SVE updates arm64/sysreg: Add hwcap for SVE EBF16 * for-next/sysreg: (35 commits) : arm64 system registers generation (more conversions) arm64/sysreg: Fix a few missed conversions arm64/sysreg: Convert ID_AA64AFRn_EL1 to automatic generation arm64/sysreg: Convert ID_AA64DFR1_EL1 to automatic generation arm64/sysreg: Convert ID_AA64FDR0_EL1 to automatic generation arm64/sysreg: Use feature numbering for PMU and SPE revisions arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture arm64/sysreg: Add defintion for ALLINT arm64/sysreg: Convert SCXTNUM_EL1 to automatic generation arm64/sysreg: Convert TIPDR_EL1 to automatic generation arm64/sysreg: Convert ID_AA64PFR1_EL1 to automatic generation arm64/sysreg: Convert ID_AA64PFR0_EL1 to automatic generation arm64/sysreg: Convert ID_AA64MMFR2_EL1 to automatic generation arm64/sysreg: Convert ID_AA64MMFR1_EL1 to automatic generation arm64/sysreg: Convert ID_AA64MMFR0_EL1 to automatic generation arm64/sysreg: Convert HCRX_EL2 to automatic generation arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 SME enumeration arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 BTI enumeration arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 fractional version fields arm64/sysreg: Standardise naming for MTE feature enumeration ... * for-next/gettimeofday: : Use self-synchronising counter access in gettimeofday() (if FEAT_ECV) arm64: vdso: use SYS_CNTVCTSS_EL0 for gettimeofday arm64: alternative: patch alternatives in the vDSO arm64: module: move find_section to header * for-next/stacktrace: : arm64 stacktrace cleanups and improvements arm64: stacktrace: track hyp stacks in unwinder's address space arm64: stacktrace: track all stack boundaries explicitly arm64: stacktrace: remove stack type from fp translator arm64: stacktrace: rework stack boundary discovery arm64: stacktrace: add stackinfo_on_stack() helper arm64: stacktrace: move SDEI stack helpers to stacktrace code arm64: stacktrace: rename unwind_next_common() -> unwind_next_frame_record() arm64: stacktrace: simplify unwind_next_common() arm64: stacktrace: fix kerneldoc comments * for-next/atomics: : arm64 atomics improvements arm64: atomic: always inline the assembly arm64: atomics: remove LL/SC trampolines * for-next/el1-exceptions: : Improve the reporting of EL1 exceptions arm64: rework BTI exception handling arm64: rework FPAC exception handling arm64: consistently pass ESR_ELx to die() arm64: die(): pass 'err' as long arm64: report EL1 UNDEFs better * for-next/a510-erratum-2658417: : Cortex-A510: 2658417: remove BF16 support due to incorrect result arm64: errata: remove BF16 HWCAP due to incorrect result on Cortex-A510 arm64: cpufeature: Expose get_arm64_ftr_reg() outside cpufeature.c arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space * for-next/defconfig: : arm64 defconfig updates arm64: defconfig: Add Coresight as module arm64: Enable docker support in defconfig arm64: defconfig: Enable memory hotplug and hotremove config arm64: configs: Enable all PMUs provided by Arm * for-next/tpidr2_el0: : arm64 ptrace() support for TPIDR2_EL0 kselftest/arm64: Add coverage of TPIDR2_EL0 ptrace interface arm64/ptrace: Support access to TPIDR2_EL0 arm64/ptrace: Document extension of NT_ARM_TLS to cover TPIDR2_EL0 kselftest/arm64: Add test coverage for NT_ARM_TLS * for-next/ftrace: : arm64 ftraces updates/fixes arm64: ftrace: fix module PLTs with mcount arm64: module: Remove unused plt_entry_is_initialized() arm64: module: Make plt_equals_entry() static
2022-09-29Merge branch 'v6.0-rc7'Peter Zijlstra3-3/+3
Merge upstream to get RAPTORLAKE_S Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-09-23Merge tag 'arm64-fixes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "These are all very simple and self-contained, although the CFI jump-table fix touches the generic linker script as that's where the problematic macro lives. - Fix false positive "sleeping while atomic" warning resulting from the kPTI rework taking a mutex too early. - Fix possible overflow in AMU frequency calculation - Fix incorrect shift in CMN PMU driver which causes problems with newer versions of the IP - Reduce alignment of the CFI jump table to avoid huge kernel images and link errors with !4KiB page size configurations" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: vmlinux.lds.h: CFI: Reduce alignment of jump-table to function alignment perf/arm-cmn: Add more bits to child node address offset field arm64: topology: fix possible overflow in amu_fie_setup() arm64: mm: don't acquire mutex when rewriting swapper
2022-09-22perf: arm64: Add SVE vector granule register to user regsJames Clark1-1/+1
Dwarf based unwinding in a function that pushes SVE registers onto the stack requires the unwinder to know the length of the SVE register to calculate the stack offsets correctly. This was added to the Arm specific Dwarf spec as the VG pseudo register[1]. Add the vector length at position 46 if it's requested by userspace and SVE is supported. If it's not supported then fail to open the event. The vector length must be on each sample because it can be changed at runtime via a prctl or ptrace call. Also by adding it as a register rather than a separate attribute, minimal changes will be required in an unwinder that already indexes into the register list. [1]: https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20220901132658.1024635-2-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-09-22perf/arm-cmn: Add more bits to child node address offset fieldIlkka Koskinen1-1/+1
CMN-600 uses bits [27:0] for child node address offset while bits [30:28] are required to be zero. For CMN-650, the child node address offset field has been increased to include bits [29:0] while leaving only bit 30 set to zero. Let's include the missing two bits and assume older implementations comply with the spec and set bits [29:28] to 0. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Fixes: 60d1504070c2 ("perf/arm-cmn: Support new IP features") Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20220808195455.79277-1-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
2022-09-22drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoCShuai Xue3-0/+818
Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4 DRAM and targets cloud computing and HPC. Each PMU is registered as a device in /sys/bus/event_source/devices, and users can select event to monitor in each sub-channel, independently. For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two sub-channels of the same channel in die 0. And the PMU device of die 1 is prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers successfully, add IRQF_SHARED flag. Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Co-developed-by: Hongbo Yao <yaohongbo@linux.alibaba.com> Signed-off-by: Hongbo Yao <yaohongbo@linux.alibaba.com> Co-developed-by: Neng Chen <nengchen@linux.alibaba.com> Signed-off-by: Neng Chen <nengchen@linux.alibaba.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Link: https://lore.kernel.org/r/20220818031822.38415-3-xueshuai@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>
2022-09-21arm64/sysreg: Fix a few missed conversionsNathan Chancellor1-3/+3
After the conversion to automatically generating the ID_AA64DFR0_EL1 definition names, the build fails in a few different places because some of the definitions were not changed to their new names along the way. Update the names to resolve the build errors. Fixes: c0357a73fa4a ("arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220919160928.3905780-1-nathan@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-19perf: qcom_l2_pmu: Refactor _UID handling to use acpi_dev_uid_to_integer()Andy Shevchenko1-5/+5
ACPI utils provide acpi_dev_uid_to_integer() helper to extract _UID as an integer. Use it instead of custom approach. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-09-09Merge tag 'riscv-for-linus-6.0-rc5' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A pair of device tree fixes for the Polarfire SOC - A fix to avoid overflowing the PMU counter array when firmware incorrectly reports the number of supported counters, which manifests on OpenSBI versions prior to 1.1 * tag 'riscv-for-linus-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: perf: RISC-V: fix access beyond allocated array riscv: dts: microchip: use an mpfs specific l2 compatible dt-bindings: riscv: sifive-l2: add a PolarFire SoC compatible
2022-09-08perf: RISC-V: fix access beyond allocated arraySergey Matyukevich1-1/+1
SBI firmware should report total number of firmware and hardware counters including unused ones or special ones. In this case the kernel doesn't need to make any assumptions about gaps in reported counters, e.g. excluded timer counter. That was fixed in OpenSBI v1.1 by commit 3f66465fb6bf ("lib: pmu: allow to use the highest available counter"). This kernel patch has no effect if SBI firmware behaves correctly. However it eliminates access beyond the allocated pmu_ctr_list if the kernel is used with OpenSBI older than v1.1. Fixes: e9991434596f ("RISC-V: Add perf platform driver based on SBI PMU extension") Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220830155306.301714-2-geomatsi@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-09-08perf: RISC-V: throttle perf eventsSergey Matyukevich1-0/+4
Call perf_sample_event_took() to report time spent in overflow interrupts. Perf core uses these measurements to throttle perf events properly. Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20220830155306.301714-4-geomatsi@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-09-08perf: RISC-V: exclude invalid pmu counters from SBI callsSergey Matyukevich2-13/+18
SBI firmware may not provide information for some counters in response to SBI_EXT_PMU_COUNTER_GET_INFO call. Exclude such counters from the subsequent SBI requests. For this purpose use global mask to keep track of fully specified counters. Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20220830155306.301714-3-geomatsi@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>