summaryrefslogtreecommitdiffstats
path: root/arch/x86/events
AgeCommit message (Collapse)AuthorFilesLines
2021-05-18perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on ICXAlexander Antonov1-0/+51
This patch enables I/O stacks to IIO PMON mapping on Icelake server. Mapping of IDs in SAD_CONTROL_CFG notation to IDs in PMON notation for Icelake server: Stack Name | CBDMA/DMI | PCIe_1 | PCIe_2 | PCIe_3 | PCIe_4 | PCIe_5 SAD_CONTROL_CFG ID | 0 | 1 | 2 | 3 | 4 | 5 PMON ID | 5 | 0 | 1 | 2 | 3 | 4 I/O stacks to IIO PMON mapping is exposed through attributes /sys/devices/uncore_iio_<pmu_idx>/dieX, where dieX is file which holds "Segment:Root Bus" for PCIe root port which can be monitored by that IIO PMON block. Example for 2-S Icelake server: ==> /sys/devices/uncore_iio_0/die0 <== 0000:16 ==> /sys/devices/uncore_iio_0/die1 <== 0000:97 ==> /sys/devices/uncore_iio_1/die0 <== 0000:30 ==> /sys/devices/uncore_iio_1/die1 <== 0000:b0 ==> /sys/devices/uncore_iio_3/die0 <== 0000:4a ==> /sys/devices/uncore_iio_3/die1 <== 0000:c9 ==> /sys/devices/uncore_iio_4/die0 <== 0000:64 ==> /sys/devices/uncore_iio_4/die1 <== 0000:e2 ==> /sys/devices/uncore_iio_5/die0 <== 0000:00 ==> /sys/devices/uncore_iio_5/die1 <== 0000:80 Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20210426131614.16205-4-alexander.antonov@linux.intel.com
2021-05-18perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on SNRAlexander Antonov1-0/+96
I/O stacks to PMON mapping on Skylake server relies on topology information from CPU_BUS_NO MSR but this approach is not applicable for SNR and ICX. Mapping on these platforms can be gotten by reading SAD_CONTROL_CFG CSR from Mesh2IIO device with 0x09a2 DID. SAD_CONTROL_CFG CSR contains stack IDs in its own notation which are statically mapped on IDs in PMON notation. The map for Snowridge: Stack Name | CBDMA/DMI | PCIe Gen 3 | DLB | NIS | QAT SAD_CONTROL_CFG ID | 0 | 1 | 2 | 3 | 4 PMON ID | 1 | 4 | 3 | 2 | 0 This patch enables I/O stacks to IIO PMON mapping on Snowridge. Mapping is exposed through attributes /sys/devices/uncore_iio_<pmu_idx>/dieX, where dieX is file which holds "Segment:Root Bus" for PCIe root port which can be monitored by that IIO PMON block. Example for Snowridge: ==> /sys/devices/uncore_iio_0/die0 <== 0000:f3 ==> /sys/devices/uncore_iio_1/die0 <== 0000:00 ==> /sys/devices/uncore_iio_2/die0 <== 0000:eb ==> /sys/devices/uncore_iio_3/die0 <== 0000:e3 ==> /sys/devices/uncore_iio_4/die0 <== 0000:14 Mapping for Icelake server will be enabled in the follow-up patch. Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20210426131614.16205-3-alexander.antonov@linux.intel.com
2021-05-18perf/x86/intel/uncore: Generalize I/O stacks to PMON mapping procedureAlexander Antonov2-6/+21
Currently I/O stacks to IIO PMON mapping is available on Skylake servers only and need to make code more general to easily enable further platforms. So, introduce get_topology() callback in struct intel_uncore_type which allows to move common code to separate function and make mapping procedure more general. Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20210426131614.16205-2-alexander.antonov@linux.intel.com
2021-05-12perf/x86/intel/uncore: Drop unnecessary NULL checks after container_of()Guenter Roeck1-4/+0
The parameter passed to the pmu_enable() and pmu_disable() functions can not be NULL because it is dereferenced by the caller. That means the result of container_of() on that parameter can also never be NULL. The existing NULL checks are therefore unnecessary and misleading. Remove them. This change was made automatically with the following Coccinelle script. @@ type t; identifier v; statement s; @@ <+... ( t v = container_of(...); | v = container_of(...); ) ... when != v - if (\( !v \| v == NULL \) ) s ...+> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210510224849.2349861-1-linux@roeck-us.net
2021-05-09Merge tag 'perf_urgent_for_v5.13_rc1' of ↵Linus Torvalds1-21/+26
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fix from Borislav Petkov: "Handle power-gating of AMD IOMMU perf counters properly when they are used" * tag 'perf_urgent_for_v5.13_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating
2021-05-06x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gatingSuravee Suthikulpanit1-21/+26
On certain AMD platforms, when the IOMMU performance counter source (csource) field is zero, power-gating for the counter is enabled, which prevents write access and returns zero for read access. This can cause invalid perf result especially when event multiplexing is needed (i.e. more number of events than available counters) since the current logic keeps track of the previously read counter value, and subsequently re-program the counter to continue counting the event. With power-gating enabled, we cannot gurantee successful re-programming of the counter. Workaround this issue by : 1. Modifying the ordering of setting/reading counters and enabing/ disabling csources to only access the counter when the csource is set to non-zero. 2. Since AMD IOMMU PMU does not support interrupt mode, the logic can be simplified to always start counting with value zero, and accumulate the counter value when stopping without the need to keep track and reprogram the counter with the previously read counter value. This has been tested on systems with and without power-gating. Fixes: 994d6608efe4 ("iommu/amd: Remove performance counter pre-initialization test") Suggested-by: Alexander Monakov <amonakov@ispras.ru> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210504065236.4415-1-suravee.suthikulpanit@amd.com
2021-05-01Merge tag 'iommu-updates-v5.13' of ↵Linus Torvalds2-19/+1
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - Big cleanup of almost unsused parts of the IOMMU API by Christoph Hellwig. This mostly affects the Freescale PAMU driver. - New IOMMU driver for Unisoc SOCs - ARM SMMU Updates from Will: - Drop vestigial PREFETCH_ADDR support (SMMUv3) - Elide TLB sync logic for empty gather (SMMUv3) - Fix "Service Failure Mode" handling (SMMUv3) - New Qualcomm compatible string (SMMUv2) - Removal of the AMD IOMMU performance counter writeable check on AMD. It caused long boot delays on some machines and is only needed to work around an errata on some older (possibly pre-production) chips. If someone is still hit by this hardware issue anyway the performance counters will just return 0. - Support for targeted invalidations in the AMD IOMMU driver. Before that the driver only invalidated a single 4k page or the whole IO/TLB for an address space. This has been extended now and is mostly useful for emulated AMD IOMMUs. - Several fixes for the Shared Virtual Memory support in the Intel VT-d driver - Mediatek drivers can now be built as modules - Re-introduction of the forcedac boot option which got lost when converting the Intel VT-d driver to the common dma-iommu implementation. - Extension of the IOMMU device registration interface and support iommu_ops to be const again when drivers are built as modules. * tag 'iommu-updates-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (84 commits) iommu: Streamline registration interface iommu: Statically set module owner iommu/mediatek-v1: Add error handle for mtk_iommu_probe iommu/mediatek-v1: Avoid build fail when build as module iommu/mediatek: Always enable the clk on resume iommu/fsl-pamu: Fix uninitialized variable warning iommu/vt-d: Force to flush iotlb before creating superpage iommu/amd: Put newline after closing bracket in warning iommu/vt-d: Fix an error handling path in 'intel_prepare_irq_remapping()' iommu/vt-d: Fix build error of pasid_enable_wpe() with !X86 iommu/amd: Remove performance counter pre-initialization test Revert "iommu/amd: Fix performance counter initialization" iommu/amd: Remove duplicate check of devid iommu/exynos: Remove unneeded local variable initialization iommu/amd: Page-specific invalidations for more than one page iommu/arm-smmu-v3: Remove the unused fields for PREFETCH_CONFIG command iommu/vt-d: Avoid unnecessary cache flush in pasid entry teardown iommu/vt-d: Invalidate PASID cache when root/context entry changed iommu/vt-d: Remove WO permissions on second-level paging entries iommu/vt-d: Report the right page fault address ...
2021-04-28Merge tag 'perf-core-2021-04-28' of ↵Linus Torvalds18-270/+2155
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event updates from Ingo Molnar: - Improve Intel uncore PMU support: - Parse uncore 'discovery tables' - a new hardware capability enumeration method introduced on the latest Intel platforms. This table is in a well-defined PCI namespace location and is read via MMIO. It is organized in an rbtree. These uncore tables will allow the discovery of standard counter blocks, but fancier counters still need to be enumerated explicitly. - Add Alder Lake support - Improve IIO stacks to PMON mapping support on Skylake servers - Add Intel Alder Lake PMU support - which requires the introduction of 'hybrid' CPUs and PMUs. Alder Lake is a mix of Golden Cove ('big') and Gracemont ('small' - Atom derived) cores. The CPU-side feature set is entirely symmetrical - but on the PMU side there's core type dependent PMU functionality. - Reduce data loss with CPU level hardware tracing on Intel PT / AUX profiling, by fixing the AUX allocation watermark logic. - Improve ring buffer allocation on NUMA systems - Put 'struct perf_event' into their separate kmem_cache pool - Add support for synchronous signals for select perf events. The immediate motivation is to support low-overhead sampling-based race detection for user-space code. The feature consists of the following main changes: - Add thread-only event inheritance via perf_event_attr::inherit_thread, which limits inheritance of events to CLONE_THREAD. - Add the ability for events to not leak through exec(), via perf_event_attr::remove_on_exec. - Allow the generation of SIGTRAP via perf_event_attr::sigtrap, extend siginfo with an u64 ::si_perf, and add the breakpoint information to ::si_addr and ::si_perf if the event is PERF_TYPE_BREAKPOINT. The siginfo support is adequate for breakpoints right now - but the new field can be used to introduce support for other types of metadata passed over siginfo as well. - Misc fixes, cleanups and smaller updates. * tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) signal, perf: Add missing TRAP_PERF case in siginfo_layout() signal, perf: Fix siginfo_t by avoiding u64 on 32-bit architectures perf/x86: Allow for 8<num_fixed_counters<16 perf/x86/rapl: Add support for Intel Alder Lake perf/x86/cstate: Add Alder Lake CPU support perf/x86/msr: Add Alder Lake CPU support perf/x86/intel/uncore: Add Alder Lake support perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE perf/x86/intel: Add Alder Lake Hybrid support perf/x86: Support filter_match callback perf/x86/intel: Add attr_update for Hybrid PMUs perf/x86: Add structures for the attributes of Hybrid PMUs perf/x86: Register hybrid PMUs perf/x86: Factor out x86_pmu_show_pmu_cap perf/x86: Remove temporary pmu assignment in event_init perf/x86/intel: Factor out intel_pmu_check_extra_regs perf/x86/intel: Factor out intel_pmu_check_event_constraints perf/x86/intel: Factor out intel_pmu_check_num_counters perf/x86: Hybrid PMU support for extra_regs perf/x86: Hybrid PMU support for event constraints ...
2021-04-27Merge tag 'x86_core_for_v5.13' of ↵Linus Torvalds2-11/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 updates from Borislav Petkov: - Turn the stack canary into a normal __percpu variable on 32-bit which gets rid of the LAZY_GS stuff and a lot of code. - Add an insn_decode() API which all users of the instruction decoder should preferrably use. Its goal is to keep the details of the instruction decoder away from its users and simplify and streamline how one decodes insns in the kernel. Convert its users to it. - kprobes improvements and fixes - Set the maximum DIE per package variable on Hygon - Rip out the dynamic NOP selection and simplify all the machinery around selecting NOPs. Use the simplified NOPs in objtool now too. - Add Xeon Sapphire Rapids to list of CPUs that support PPIN - Simplify the retpolines by folding the entire thing into an alternative now that objtool can handle alternatives with stack ops. Then, have objtool rewrite the call to the retpoline with the alternative which then will get patched at boot time. - Document Intel uarch per models in intel-family.h - Make Sub-NUMA Clustering topology the default and Cluster-on-Die the exception on Intel. * tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) x86, sched: Treat Intel SNC topology as default, COD as exception x86/cpu: Comment Skylake server stepping too x86/cpu: Resort and comment Intel models objtool/x86: Rewrite retpoline thunk calls objtool: Skip magical retpoline .altinstr_replacement objtool: Cache instruction relocs objtool: Keep track of retpoline call sites objtool: Add elf_create_undef_symbol() objtool: Extract elf_symbol_add() objtool: Extract elf_strtab_concat() objtool: Create reloc sections implicitly objtool: Add elf_create_reloc() helper objtool: Rework the elf_rebuild_reloc_section() logic objtool: Fix static_call list generation objtool: Handle per arch retpoline naming objtool: Correctly handle retpoline thunk calls x86/retpoline: Simplify retpolines x86/alternatives: Optimize optimize_nops() x86: Add insn_decode_kernel() x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration ...
2021-04-26Merge tag 'x86_cleanups_for_v5.13' of ↵Linus Torvalds11-24/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 cleanups from Borislav Petkov: "Trivial cleanups and fixes all over the place" * tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Remove me from IDE/ATAPI section x86/pat: Do not compile stubbed functions when X86_PAT is off x86/asm: Ensure asm/proto.h can be included stand-alone x86/platform/intel/quark: Fix incorrect kernel-doc comment syntax in files x86/msr: Make locally used functions static x86/cacheinfo: Remove unneeded dead-store initialization x86/process/64: Move cpu_current_top_of_stack out of TSS tools/turbostat: Unmark non-kernel-doc comment x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL() x86/fpu/math-emu: Fix function cast warning x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes x86: Fix various typos in comments, take #2 x86: Remove unusual Unicode characters from comments x86/kaslr: Return boolean values from a function returning bool x86: Fix various typos in comments x86/setup: Remove unused RESERVE_BRK_ARRAY() stacktrace: Move documentation for arch_stack_walk_reliable() to header x86: Remove duplicate TSC DEADLINE MSR definitions
2021-04-23perf/x86: Allow for 8<num_fixed_counters<16Colin Ian King1-1/+1
The 64 bit value read from MSR_ARCH_PERFMON_FIXED_CTR_CTRL is being bit-wise masked with the value (0x03 << i*4). However, the shifted value is evaluated using 32 bit arithmetic, so will UB when i > 8. Fix this by making 0x03 a ULL so that the shift is performed using 64 bit arithmetic. This makes the arithmetic internally consistent and preparers for the day when hardware provides 8<num_fixed_counters<16. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210420142907.382417-1-colin.king@canonical.com
2021-04-22perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]Jim Mattson1-1/+1
The only stepping of Broadwell Xeon parts is stepping 1. Fix the relevant isolation_ucodes[] entry, which previously enumerated stepping 2. Although the original commit was characterized as an optimization, it is also a workaround for a correctness issue. If a PMI arrives between kvm's call to perf_guest_get_msrs() and the subsequent VM-entry, a stale value for the IA32_PEBS_ENABLE MSR may be restored at the next VM-exit. This is because, unbeknownst to kvm, PMI throttling may clear bits in the IA32_PEBS_ENABLE MSR. CPUs with "PEBS isolation" don't suffer from this issue, because perf_guest_get_msrs() doesn't report the IA32_PEBS_ENABLE value. Fixes: 9b545c04abd4f ("perf/x86/kvm: Avoid unnecessary work in guest filtering") Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Peter Shier <pshier@google.com> Acked-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/20210422001834.1748319-1-jmattson@google.com
2021-04-21perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3Kan Liang1-35/+26
There may be a kernel panic on the Haswell server and the Broadwell server, if the snbep_pci2phy_map_init() return error. The uncore_extra_pci_dev[HSWEP_PCI_PCU_3] is used in the cpu_init() to detect the existence of the SBOX, which is a MSR type of PMON unit. The uncore_extra_pci_dev is allocated in the uncore_pci_init(). If the snbep_pci2phy_map_init() returns error, perf doesn't initialize the PCI type of the PMON units, so the uncore_extra_pci_dev will not be allocated. But perf may continue initializing the MSR type of PMON units. A null dereference kernel panic will be triggered. The sockets in a Haswell server or a Broadwell server are identical. Only need to detect the existence of the SBOX once. Current perf probes all available PCU devices and stores them into the uncore_extra_pci_dev. It's unnecessary. Use the pci_get_device() to replace the uncore_extra_pci_dev. Only detect the existence of the SBOX on the first available PCU device once. Factor out hswep_has_limit_sbox(), since the Haswell server and the Broadwell server uses the same way to detect the existence of the SBOX. Add some macros to replace the magic number. Fixes: 5306c31c5733 ("perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes") Reported-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/1618521764-100923-1-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86/rapl: Add support for Intel Alder LakeZhang Rui1-0/+2
Alder Lake RAPL support is the same as previous Sky Lake. Add Alder Lake model for RAPL. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-26-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86/cstate: Add Alder Lake CPU supportKan Liang1-10/+29
Compared with the Rocket Lake, the CORE C1 Residency Counter is added for Alder Lake, but the CORE C3 Residency Counter is removed. Other counters are the same. Create a new adl_cstates for Alder Lake. Update the comments accordingly. The External Design Specification (EDS) is not published yet. It comes from an authoritative internal source. The patch has been tested on real hardware. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-25-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86/msr: Add Alder Lake CPU supportKan Liang1-0/+2
PPERF and SMI_COUNT MSRs are also supported on Alder Lake. The External Design Specification (EDS) is not published yet. It comes from an authoritative internal source. The patch has been tested on real hardware. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-24-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86/intel/uncore: Add Alder Lake supportKan Liang3-0/+139
The uncore subsystem for Alder Lake is similar to the previous Tiger Lake. The difference includes: - New MSR addresses for global control, fixed counters, CBOX and ARB. Add a new adl_uncore_msr_ops for uncore operations. - Add a new threshold field for CBOX. - New PCIIDs for IMC devices. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-23-git-send-email-kan.liang@linux.intel.com
2021-04-19perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHEKan Liang1-0/+1
Current Hardware events and Hardware cache events have special perf types, PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. The two types don't pass the PMU type in the user interface. For a hybrid system, the perf subsystem doesn't know which PMU the events belong to. The first capable PMU will always be assigned to the events. The events never get a chance to run on the other capable PMUs. Extend the two types to become PMU aware types. The PMU type ID is stored at attr.config[63:32]. Add a new PMU capability, PERF_PMU_CAP_EXTENDED_HW_TYPE, to indicate a PMU which supports the extended PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. The PMU type is only required when searching a specific PMU. The PMU specific codes will only be interested in the 'real' config value, which is stored in the low 32 bit of the event->attr.config. Update the event->attr.config in the generic code, so the PMU specific codes don't need to calculate it separately. If a user specifies a PMU type, but the PMU doesn't support the extended type, error out. If an event cannot be initialized in a PMU specified by a user, error out immediately. Perf should not try to open it on other PMUs. The new PMU capability is only set for the X86 hybrid PMUs for now. Other architectures, e.g., ARM, may need it as well. The support on ARM may be implemented later separately. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-22-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86/intel: Add Alder Lake Hybrid supportKan Liang3-1/+268
Alder Lake Hybrid system has two different types of core, Golden Cove core and Gracemont core. The Golden Cove core is registered to "cpu_core" PMU. The Gracemont core is registered to "cpu_atom" PMU. The difference between the two PMUs include: - Number of GP and fixed counters - Events - The "cpu_core" PMU supports Topdown metrics. The "cpu_atom" PMU supports PEBS-via-PT. The "cpu_core" PMU is similar to the Sapphire Rapids PMU, but without PMEM. The "cpu_atom" PMU is similar to Tremont, but with different events, event_constraints, extra_regs and number of counters. The mem-loads AUX event workaround only applies to the Golden Cove core. Users may disable all CPUs of the same CPU type on the command line or in the BIOS. For this case, perf still register a PMU for the CPU type but the CPU mask is 0. Current caps/pmu_name is usually the microarch codename. Assign the "alderlake_hybrid" to the caps/pmu_name of both PMUs to indicate the hybrid Alder Lake microarchitecture. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-21-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Support filter_match callbackKan Liang2-0/+11
Implement filter_match callback for X86, which check whether an event is schedulable on the current CPU. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-20-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86/intel: Add attr_update for Hybrid PMUsKan Liang1-6/+114
The attribute_group for Hybrid PMUs should be different from the previous cpu PMU. For example, cpumask is required for a Hybrid PMU. The PMU type should be included in the event and format attribute. Add hybrid_attr_update for the Hybrid PMU. Check the PMU type in is_visible() function. Only display the event or format for the matched Hybrid PMU. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-19-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Add structures for the attributes of Hybrid PMUsKan Liang2-0/+62
Hybrid PMUs have different events and formats. In theory, Hybrid PMU specific attributes should be maintained in the dedicated struct x86_hybrid_pmu, but it wastes space because the events and formats are similar among Hybrid PMUs. To reduce duplication, all hybrid PMUs will share a group of attributes in the following patch. To distinguish an attribute from different Hybrid PMUs, a PMU aware attribute structure is introduced. A PMU type is required for the attribute structure. The type is internal usage. It is not visible in the sysfs API. Hybrid PMUs may support the same event name, but with different event encoding, e.g., the mem-loads event on an Atom PMU has different event encoding from a Core PMU. It brings issue if two attributes are created for them. Current sysfs_update_group finds an attribute by searching the attr name (aka event name). If two attributes have the same event name, the first attribute will be replaced. To address the issue, only one attribute is created for the event. The event_str is extended and stores event encodings from all Hybrid PMUs. Each event encoding is divided by ";". The order of the event encodings must follow the order of the hybrid PMU index. The event_str is internal usage as well. When a user wants to show the attribute of a Hybrid PMU, only the corresponding part of the string is displayed. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-18-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Register hybrid PMUsKan Liang3-21/+223
Different hybrid PMUs have different PMU capabilities and events. Perf should registers a dedicated PMU for each of them. To check the X86 event, perf has to go through all possible hybrid pmus. All the hybrid PMUs are registered at boot time. Before the registration, add intel_pmu_check_hybrid_pmus() to check and update the counters information, the event constraints, the extra registers and the unique capabilities for each hybrid PMUs. Postpone the display of the PMU information and HW check to CPU_STARTING, because the boot CPU is the only online CPU in the init_hw_perf_events(). Perf doesn't know the availability of the other PMUs. Perf should display the PMU information only if the counters of the PMU are available. One type of CPUs may be all offline. For this case, users can still observe the PMU in /sys/devices, but its CPU mask is 0. All hybrid PMUs have capability PERF_PMU_CAP_HETEROGENEOUS_CPUS. The PMU name for hybrid PMUs will be "cpu_XXX", which will be assigned later in a separated patch. The PMU type id for the core PMU is still PERF_TYPE_RAW. For the other hybrid PMUs, the PMU type id is not hard code. The event->cpu must be compatitable with the supported CPUs of the PMU. Add a check in the x86_pmu_event_init(). The events in a group must be from the same type of hybrid PMU. The fake cpuc used in the validation must be from the supported CPU of the event->pmu. Perf may not retrieve a valid core type from get_this_hybrid_cpu_type(). For example, ADL may have an alternative configuration. With that configuration, Perf cannot retrieve the core type from the CPUID leaf 0x1a. Add a platform specific get_hybrid_cpu_type(). If the generic way fails, invoke the platform specific get_hybrid_cpu_type(). Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-17-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Factor out x86_pmu_show_pmu_capKan Liang2-9/+19
The PMU capabilities are different among hybrid PMUs. Perf should dump the PMU capabilities information for each hybrid PMU. Factor out x86_pmu_show_pmu_cap() which shows the PMU capabilities information. The function will be reused later when registering a dedicated hybrid PMU. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-16-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Remove temporary pmu assignment in event_initKan Liang1-11/+0
The temporary pmu assignment in event_init is unnecessary. The assignment was introduced by commit 8113070d6639 ("perf_events: Add fast-path to the rescheduling code"). At that time, event->pmu is not assigned yet when initializing an event. The assignment is required. However, from commit 7e5b2a01d2ca ("perf: provide PMU when initing events"), the event->pmu is provided before event_init is invoked. The temporary pmu assignment in event_init should be removed. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-15-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86/intel: Factor out intel_pmu_check_extra_regsKan Liang1-14/+21
Each Hybrid PMU has to check and update its own extra registers before registration. The intel_pmu_check_extra_regs will be reused later to check the extra registers of each hybrid PMU. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-14-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86/intel: Factor out intel_pmu_check_event_constraintsKan Liang1-35/+47
Each Hybrid PMU has to check and update its own event constraints before registration. The intel_pmu_check_event_constraints will be reused later to check the event constraints of each hybrid PMU. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-13-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86/intel: Factor out intel_pmu_check_num_countersKan Liang1-14/+24
Each Hybrid PMU has to check its own number of counters and mask fixed counters before registration. The intel_pmu_check_num_counters will be reused later to check the number of the counters for each hybrid PMU. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-12-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Hybrid PMU support for extra_regsKan Liang3-8/+13
Different hybrid PMU may have different extra registers, e.g. Core PMU may have offcore registers, frontend register and ldlat register. Atom core may only have offcore registers and ldlat register. Each hybrid PMU should use its own extra_regs. An Intel Hybrid system should always have extra registers. Unconditionally allocate shared_regs for Intel Hybrid system. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-11-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Hybrid PMU support for event constraintsKan Liang4-5/+10
The events are different among hybrid PMUs. Each hybrid PMU should use its own event constraints. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-10-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Hybrid PMU support for hardware cache eventKan Liang2-3/+11
The hardware cache events are different among hybrid PMUs. Each hybrid PMU should have its own hw cache event table. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-9-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Hybrid PMU support for unconstrainedKan Liang2-1/+12
The unconstrained value depends on the number of GP and fixed counters. Each hybrid PMU should use its own unconstrained. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-8-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Hybrid PMU support for countersKan Liang4-25/+56
The number of GP and fixed counters are different among hybrid PMUs. Each hybrid PMU should use its own counter related information. When handling a certain hybrid PMU, apply the number of counters from the corresponding hybrid PMU. When reserving the counters in the initialization of a new event, reserve all possible counters. The number of counter recored in the global x86_pmu is for the architecture counters which are available for all hybrid PMUs. KVM doesn't support the hybrid PMU yet. Return the number of the architecture counters for now. For the functions only available for the old platforms, e.g., intel_pmu_drain_pebs_nhm(), nothing is changed. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-7-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Hybrid PMU support for intel_ctrlKan Liang3-14/+24
The intel_ctrl is the counter mask of a PMU. The PMU counter information may be different among hybrid PMUs, each hybrid PMU should use its own intel_ctrl to check and access the counters. When handling a certain hybrid PMU, apply the intel_ctrl from the corresponding hybrid PMU. When checking the HW existence, apply the PMU and number of counters from the corresponding hybrid PMU as well. Perf will check the HW existence for each Hybrid PMU before registration. Expose the check_hw_exists() for a later patch. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-6-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86/intel: Hybrid PMU support for perf capabilitiesKan Liang4-7/+57
Some platforms, e.g. Alder Lake, have hybrid architecture. Although most PMU capabilities are the same, there are still some unique PMU capabilities for different hybrid PMUs. Perf should register a dedicated pmu for each hybrid PMU. Add a new struct x86_hybrid_pmu, which saves the dedicated pmu and capabilities for each hybrid PMU. The architecture MSR, MSR_IA32_PERF_CAPABILITIES, only indicates the architecture features which are available on all hybrid PMUs. The architecture features are stored in the global x86_pmu.intel_cap. For Alder Lake, the model-specific features are perf metrics and PEBS-via-PT. The corresponding bits of the global x86_pmu.intel_cap should be 0 for these two features. Perf should not use the global intel_cap to check the features on a hybrid system. Add a dedicated intel_cap in the x86_hybrid_pmu to store the model-specific capabilities. Use the dedicated intel_cap to replace the global intel_cap for thse two features. The dedicated intel_cap will be set in the following "Add Alder Lake Hybrid support" patch. Add is_hybrid() to distinguish a hybrid system. ADL may have an alternative configuration. With that configuration, the X86_FEATURE_HYBRID_CPU is not set. Perf cannot rely on the feature bit. Add a new static_key_false, perf_is_hybrid, to indicate a hybrid system. It will be assigned in the following "Add Alder Lake Hybrid support" patch as well. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-5-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Track pmu in per-CPU cpu_hw_eventsKan Liang5-12/+24
Some platforms, e.g. Alder Lake, have hybrid architecture. In the same package, there may be more than one type of CPU. The PMU capabilities are different among different types of CPU. Perf will register a dedicated PMU for each type of CPU. Add a 'pmu' variable in the struct cpu_hw_events to track the dedicated PMU of the current CPU. Current x86_get_pmu() use the global 'pmu', which will be broken on a hybrid platform. Modify it to apply the 'pmu' of the specific CPU. Initialize the per-CPU 'pmu' variable with the global 'pmu'. There is nothing changed for the non-hybrid platforms. The is_x86_event() will be updated in the later patch ("perf/x86: Register hybrid PMUs") for hybrid platforms. For the non-hybrid platforms, nothing is changed here. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-4-git-send-email-kan.liang@linux.intel.com
2021-04-16perf/amd/uncore: Fix sysfs type mismatchNathan Chancellor1-3/+3
dev_attr_show() calls the __uncore_*_show() functions via an indirect call but their type does not currently match the type of the show() member in 'struct device_attribute', resulting in a Control Flow Integrity violation. $ cat /sys/devices/amd_l3/format/umask config:8-15 $ dmesg | grep "CFI failure" [ 1258.174653] CFI failure (target: __uncore_umask_show...): Update the type in the DEFINE_UNCORE_FORMAT_ATTR macro to match 'struct device_attribute' so that there is no more CFI violation. Fixes: 06f2c24584f3 ("perf/amd/uncore: Prepare to scale for more attributes that vary per family") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210415001112.3024673-2-nathan@kernel.org
2021-04-16x86/events/amd/iommu: Fix sysfs type mismatchNathan Chancellor1-3/+3
dev_attr_show() calls _iommu_event_show() via an indirect call but _iommu_event_show()'s type does not currently match the type of the show() member in 'struct device_attribute', resulting in a Control Flow Integrity violation. $ cat /sys/devices/amd_iommu_1/events/mem_dte_hit csource=0x0a $ dmesg | grep "CFI failure" [ 3526.735140] CFI failure (target: _iommu_event_show...): Change _iommu_event_show() and 'struct amd_iommu_event_desc' to 'struct device_attribute' so that there is no more CFI violation. Fixes: 7be6296fdd75 ("perf/x86/amd: AMD IOMMU Performance Counter PERF uncore PMU implementation") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210415001112.3024673-1-nathan@kernel.org
2021-04-16Merge branches 'iommu/fixes', 'arm/mediatek', 'arm/smmu', 'arm/exynos', ↵Joerg Roedel2-19/+1
'unisoc', 'x86/vt-d', 'x86/amd' and 'core' into next
2021-04-16perf/x86: Move cpuc->running into P4 specific codeKan Liang3-5/+13
The 'running' variable is only used in the P4 PMU. Current perf sets the variable in the critical function x86_pmu_start(), which wastes cycles for everybody not running on P4. Move cpuc->running into the P4 specific p4_pmu_enable_event(). Add a static per-CPU 'p4_running' variable to replace the 'running' variable in the struct cpu_hw_events. Saves space for the generic structure. The p4_pmu_enable_all() also invokes the p4_pmu_enable_event(), but it should not set cpuc->running. Factor out __p4_pmu_enable_event() for p4_pmu_enable_all(). Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618410990-21383-1-git-send-email-kan.liang@linux.intel.com
2021-04-07iommu/amd: Move a few prototypes to include/linux/amd-iommu.hChristoph Hellwig2-19/+1
A few functions that were intentended for the perf events support are currently declared in arch/x86/events/amd/iommu.h, which mens they are not in scope for the actual function definition. Also amdkfd has started using a few of them using externs in a .c file. End that misery by moving the prototypes to the proper header. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210402143312.372386-5-hch@lst.de Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-02Merge tag 'v5.12-rc5' into WIP.x86/core, to pick up recent NOP related changesIngo Molnar2-1/+4
In particular we want to have this upstream commit: b90829704780: ("bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG") ... before merging in x86/cpu changes and the removal of the NOP optimizations, and applying PeterZ's !retpoline objtool series. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-04-02perf/x86/intel/uncore: Enable IIO stacks to PMON mapping for multi-segment SKXAlexander Antonov3-34/+47
IIO stacks to PMON mapping on Skylake servers is exposed through introduced early attributes /sys/devices/uncore_iio_<pmu_idx>/dieX, where dieX is a file which holds "Segment:Root Bus" for PCIe root port which can be monitored by that IIO PMON block. These sysfs attributes are disabled for multiple segment topologies except VMD domains which start at 0x10000. This patch removes the limitation and enables IIO stacks to PMON mapping for multi-segment Skylake servers by introducing segment-aware intel_uncore_topology structure and attributing the topology configuration to the segment in skx_iio_get_topology() function. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Tested-by: Kyle Meyer <kyle.meyer@hpe.com> Link: https://lkml.kernel.org/r/20210323150507.2013-1-alexander.antonov@linux.intel.com
2021-04-02perf/x86/intel/uncore: Generic support for the MMIO type of uncore blocksKan Liang4-0/+101
The discovery table provides the generic uncore block information for the MMIO type of uncore blocks, which is good enough to provide basic uncore support. The box control field is composed of the BAR address and box control offset. When initializing the uncore blocks, perf should ioremap the address from the box control field. Implement the generic support for the MMIO type of uncore block. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1616003977-90612-6-git-send-email-kan.liang@linux.intel.com
2021-04-02perf/x86/intel/uncore: Generic support for the PCI type of uncore blocksKan Liang4-7/+177
The discovery table provides the generic uncore block information for the PCI type of uncore blocks, which is good enough to provide basic uncore support. The PCI BUS and DEVFN information can be retrieved from the box control field. Introduce the uncore_pci_pmus_register() to register all the PCICFG type of uncore blocks. The old PCI probe/remove way is dropped. The PCI BUS and DEVFN information are different among dies. Add box_ctls to store the box control field of each die. Add a new BUS notifier for the PCI type of uncore block to support the hotplug. If the device is "hot remove", the corresponding registered PMU has to be unregistered. Perf cannot locate the PMU by searching a const pci_device_id table, because the discovery tables don't provide such information. Introduce uncore_pci_find_dev_pmu_from_types() to search the whole uncore_pci_uncores for the PMU. Implement generic support for the PCI type of uncore block. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1616003977-90612-5-git-send-email-kan.liang@linux.intel.com
2021-04-02perf/x86/intel/uncore: Rename uncore_notifier to uncore_pci_sub_notifierKan Liang1-6/+14
Perf will use a similar method to the PCI sub driver to register the PMUs for the PCI type of uncore blocks. The method requires a BUS notifier to support hotplug. The current BUS notifier cannot be reused, because it searches a const id_table for the corresponding registered PMU. The PCI type of uncore blocks in the discovery tables doesn't provide an id_table. Factor out uncore_bus_notify() and add the pointer of an id_table as a parameter. The uncore_bus_notify() will be reused in the following patch. The current BUS notifier is only used by the PCI sub driver. Its name is too generic. Rename it to uncore_pci_sub_notifier, which is specific for the PCI sub driver. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1616003977-90612-4-git-send-email-kan.liang@linux.intel.com
2021-04-02perf/x86/intel/uncore: Generic support for the MSR type of uncore blocksKan Liang4-10/+182
The discovery table provides the generic uncore block information for the MSR type of uncore blocks, e.g., the counter width, the number of counters, the location of control/counter registers, which is good enough to provide basic uncore support. It can be used as a fallback solution when the kernel doesn't support a platform. The name of the uncore box cannot be retrieved from the discovery table. uncore_type_&typeID_&boxID will be used as its name. Save the type ID and the box ID information in the struct intel_uncore_type. Factor out uncore_get_pmu_name() to handle different naming methods. Implement generic support for the MSR type of uncore block. Some advanced features, such as filters and constraints, cannot be retrieved from discovery tables. Features that rely on that information are not be supported here. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1616003977-90612-3-git-send-email-kan.liang@linux.intel.com
2021-04-02perf/x86/intel/uncore: Parse uncore discovery tablesKan Liang4-8/+448
A self-describing mechanism for the uncore PerfMon hardware has been introduced with the latest Intel platforms. By reading through an MMIO page worth of information, perf can 'discover' all the standard uncore PerfMon registers in a machine. The discovery mechanism relies on BIOS's support. With a proper BIOS, a PCI device with the unique capability ID 0x23 can be found on each die. Perf can retrieve the information of all available uncore PerfMons from the device via MMIO. The information is composed of one global discovery table and several unit discovery tables. - The global discovery table includes global uncore information of the die, e.g., the address of the global control register, the offset of the global status register, the number of uncore units, the offset of unit discovery tables, etc. - The unit discovery table includes generic uncore unit information, e.g., the access type, the counter width, the address of counters, the address of the counter control, the unit ID, the unit type, etc. The unit is also called "box" in the code. Perf can provide basic uncore support based on this information with the following patches. To locate the PCI device with the discovery tables, check the generic PCI ID first. If it doesn't match, go through the entire PCI device tree and locate the device with the unique capability ID. The uncore information is similar among dies. To save parsing time and space, only completely parse and store the discovery tables on the first die and the first box of each die. The parsed information is stored in an RB tree structure, intel_uncore_discovery_type. The size of the stored discovery tables varies among platforms. It's around 4KB for a Sapphire Rapids server. If a BIOS doesn't support the 'discovery' mechanism, the uncore driver will exit with -ENODEV. There is nothing changed. Add a module parameter to disable the discovery feature. If a BIOS gets the discovery tables wrong, users can have an option to disable the feature. For the current patchset, the uncore driver will exit with -ENODEV. In the future, it may fall back to the hardcode uncore driver on a known platform. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1616003977-90612-2-git-send-email-kan.liang@linux.intel.com
2021-03-21x86: Fix various typos in comments, take #2Ingo Molnar3-3/+3
Fix another ~42 single-word typos in arch/x86/ code comments, missed a few in the first pass, in particular in .S files. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-kernel@vger.kernel.org
2021-03-21x86: Remove unusual Unicode characters from commentsIngo Molnar1-6/+6
We've accumulated a few unusual Unicode characters in arch/x86/ over the years, substitute them with their proper ASCII equivalents. A few of them were a whitespace equivalent: ' ' - the use was harmless. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org