summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2022-11-23dt-bindings: PCI: dwc: Add Baikal-T1 PCIe Root Port bindingsSerge Semin1-0/+168
Baikal-T1 SoC is equipped with DWC PCIe v4.60a Root Port controller, which link can be trained to work on up to Gen.3 speed over up to x4 lanes. The controller is supposed to be fed up with four clock sources: DBI peripheral clock, AXI application Tx/Rx clocks and external PHY/core reference clock generating the 100MHz signal. In addition to that the platform provide a way to reset each part of the controller: sticky/non-sticky bits, host controller core, PIPE interface, PCS/PHY and Hot/Power reset signal. The Root Port controller is equipped with multiple IRQ lines like MSI, system AER, PME, HP, Bandwidth change, Link equalization request and eDMA ones. The registers space is accessed over the DBI interface. There can be no more than four inbound or outbound iATU windows configured. Link: https://lore.kernel.org/r/20221113191301.5526-15-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Apply common schema to Rockchip DW PCIe nodesSerge Semin1-2/+2
As the DT-bindings description states the Rockchip PCIe controller is based on the DW PCIe RP IP-core thus its DT-nodes are supposed to be compatible with the common DW PCIe controller schema. Let's make sure they are evaluated against it by referring to the snps,dw-pcie.yaml schema in the allOf sub-schemas composition. Link: https://lore.kernel.org/r/20221113191301.5526-14-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Add dma-coherent propertySerge Semin1-0/+2
DW PCIe EP/RP AXI- and TRGT1-master interfaces are responsible for the application memory access. They are used by the RP/EP PCIe buses (MWr/MWr TLPs emitted by the peripheral PCIe devices) and the eDMA block. Since all of them mainly involve the system memory and basically mean DMA we can expect the corresponding platforms can be designed in a way to make sure the transactions are cache-coherent. As such the DW PCIe DT-nodes can have the 'dma-coherent' property specified. Let's permit it in the DT-bindings then. Link: https://lore.kernel.org/r/20221113191301.5526-13-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Add clocks/resets common propertiesSerge Semin3-2/+126
DW PCIe RP/EP reference manuals explicit define all the clocks and reset requirements in [1] and [2]. Seeing the DW PCIe vendor-specific DT-bindings have already started assigning random names to the same set of the clocks and resets lines, let's define a generic names sets and add them to the DW PCIe common DT-schema. Note since there are DW PCI-based vendor-specific DT-bindings with the custom names assigned to the same clocks and resets resources we have no much choice but to add them to the generic DT-schemas in order to have the schemas being applicable for such devices. These names are marked as vendor-specific and should be avoided being used in new bindings in favor of the generic names. [1] Synopsys DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port, Version 5.40a, March 2019, p.55 - 78. [2] Synopsys DesignWare Cores PCI Express Controller Databook - DWC PCIe Endpoint, Version 5.40a, March 2019, p.58 - 81. Link: https://lore.kernel.org/r/20221113191301.5526-12-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Add reg/reg-names common propertiesSerge Semin3-13/+169
Even though there is a more-or-less limited set of the CSR spaces can be defined for each DW PCIe controller the generic DT-schema currently doesn't specify much limitations on the reg-space names used for one or another range. In order to prevent the vendor-specific controller schemas further deviation from the generic interface let's fix that by introducing the reg-names definition in the common DW PCIe DT-schemas and preserving the generic "reg" and "reg-names" properties in there. New DW PCIe device DT-bindings are encouraged to use the generic set of the CSR spaces defined in the generic DW PCIe RP/EP DT-bindings, while the already available vendor-specific DT-bindings can still apple the common DT-schemas. Note the number of reg/reg-names items need to be changed in the DW PCIe EP DT-schema since aside with the "dbi" CSRs space these arrays can have "dbi2", "addr_space", "atu", etc ranges. Also note since there are DW PCIe-based vendor-specific DT-bindings with the custom names assigned to the same CSR resources we have no much choice but to add them to the generic DT-schemas in order to have the schemas being applicable for such devices. These names are marked as vendor-specific and should be avoided being used in new bindings in favor of the generic names. Link: https://lore.kernel.org/r/20221113191301.5526-11-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Add interrupts/interrupt-names common propertiesSerge Semin3-3/+158
Currently the 'interrupts' and 'interrupt-names' properties are defined being too generic to really describe any actual IRQ interface. Moreover the DW PCIe End-point devices are left with no IRQ signals. All of that can be fixed by adding the IRQ-related properties to the common DW PCIe DT-schemas in accordance with the hardware reference manual. The DW PCIe common DT-schema will contain the generic properties definitions with just a number of entries per property, while the DW PCIe RP/EP-specific schemas will have the particular number of items and the generic resource names listed. Note since there are DW PCI-based vendor-specific DT-bindings with the custom names assigned to the same IRQ resources we have no much choice but to add them to the generic DT-schemas in order to have the schemas being applicable for such devices. These names are marked as vendor-specific and should be avoided being used in new bindings in favor of the generic names. Link: https://lore.kernel.org/r/20221113191301.5526-10-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Add max-functions EP propertySerge Semin1-0/+4
In accordance with [1] the CX_NFUNC IP-core synthesize parameter is responsible for the number of physical functions to support in the EP mode. Its upper limit is 32. Let's use it to constrain the number of PCIe functions the DW PCIe EP DT-nodes can advertise. [1] Synopsys DesignWare Cores PCI Express Controller Databook - DWC PCIe Endpoint, Version 5.40a, March 2019, p. 887. Link: https://lore.kernel.org/r/20221113191301.5526-9-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Apply generic schema for generic device onlySerge Semin2-12/+20
Having the generic compatible strings constraints with the 'any'+'generic string' semantic implicitly encourages either to add new DW PCIe-based DT-bindings with the generic compatible string attached or just forget about adding new DT-bindings since the corresponding DT-node will be evaluated anyway. Moreover having that semantic implemented in the generic DT-schema causes the DT-validation tool to apply the schema twice: first by implicit compatible-string-based selection and second by means of the 'allOf: [ $ref ]' statement. Let's fix all of that by dropping the compatible property constraints and selecting the generic DT-schema only for the purely generic DW PCIe DT-nodes. The later is required since there is a driver for such devices. (Though there are no such DT-nodes currently defined in the kernel DT sources.) Link: https://lore.kernel.org/r/20221113191301.5526-8-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Add max-link-speed common propertySerge Semin3-0/+6
In accordance with [1] DW PCIe controllers support up to Gen5 link speed. Let's add the max-link-speed property upper bound to 5 then. The DT bindings of the particular devices are expected to setup more strict constraint on that parameter. [1] Synopsys DesignWare Cores PCI Express Controller Databook, Version 5.40a, March 2019, p. 27 Link: https://lore.kernel.org/r/20221113191301.5526-7-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Add phys/phy-names common propertiesSerge Semin3-0/+30
It's normal to have the DW PCIe RP/EP DT-nodes equipped with the explicit PHY phandle references. There can be up to 16 PHYs attach in accordance with the maximum number of supported PCIe lanes. Let's extend the common DW PCIe controller schema with the 'phys' and 'phy-names' properties definition. There two types PHY names are defined: preferred generic names '^pcie[0-9]+$' and non-preferred vendor-specific names '^pcie([0-9]+|-?phy[0-9]*)?$' so to match the names currently supported by the DW PCIe platform drivers ("pcie": meson; "pciephy": qcom, imx6; "pcie-phy": uniphier, rockchip, spear13xx; "pcie": intel-gw; "pcie-phy%d": keystone, dra7xx; "pcie": histb, etc). Link: https://lore.kernel.org/r/20221113191301.5526-6-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Remove bus node from the examplesSerge Semin2-27/+24
It's absolutely redundant seeing by default each node is embedded into its own example-X node with address and size cells set to 1. Link: https://lore.kernel.org/r/20221113191301.5526-5-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: PCI: dwc: Detach common RP/EP DT bindingsSerge Semin3-62/+78
Currently both DW PCIe Root Port and End-point DT bindings are defined as separate schemas. Carefully looking at them, at the hardware reference manuals and seeing there is a generic part of the driver used by the both RP and EP drivers we can greatly simplify the DW PCIe controller bindings by moving some of the properties into the common DT schema. It concerns the PERST GPIO control, number of lanes, number of iATU windows and CDM check properties. They will be defined in the snps,dw-pcie-common.yaml schema which will be referenced in the DW PCIe Root Port and End-point DT bindings in order to evaluate the common for both of these controllers properties. The rest of properties like reg{,-names}, clock{s,-names}, reset{s,-names}, etc will be consolidate there in one of the next commits. Link: https://lore.kernel.org/r/20221113191301.5526-4-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2022-11-23dt-bindings: visconti-pcie: Fix interrupts array max constraintsSerge Semin1-3/+4
In accordance with the way the device DT-node is actually defined in arch/arm64/boot/dts/toshiba/tmpv7708.dtsi and the way the device is probed by the DW PCIe driver there are two IRQs it actually has. It's MSI IRQ the DT-bindings lack. Let's extend the interrupts property constraints then and fix the schema example so one would be acceptable by the actual device DT-bindings. Link: https://lore.kernel.org/r/20221113191301.5526-3-Sergey.Semin@baikalelectronics.ru Fixes: 17c1b16340f0 ("dt-bindings: pci: Add DT binding for Toshiba Visconti PCIe controller") Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
2022-11-23dt-bindings: imx6q-pcie: Fix clock names for imx6sx and imx8mqSerge Semin1-4/+42
Originally as it was defined the legacy bindings the pcie_inbound_axi and pcie_aux clock names were supposed to be used in the fsl,imx6sx-pcie and fsl,imx8mq-pcie devices respectively. But the bindings conversion has been incorrectly so now the fourth clock name is defined as "pcie_inbound_axi for imx6sx-pcie, pcie_aux for imx8mq-pcie", which is completely wrong. Let's fix that by conditionally apply the clock-names constraints based on the compatible string content. Link: https://lore.kernel.org/r/20221113191301.5526-2-Sergey.Semin@baikalelectronics.ru Fixes: 751ca492f131 ("dt-bindings: PCI: imx6: convert the imx pcie controller to dtschema") Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Alexander Stein <alexander.stein@ew.tq-group.com>
2022-11-14PCI: histb: Switch to using gpiod APIDmitry Torokhov1-20/+19
This patch switches the driver away from legacy gpio/of_gpio API to gpiod API, and removes use of of_get_named_gpio_flags() which I want to make private to gpiolib. Link: https://lore.kernel.org/r/20220906204301.3736813-1-dmitry.torokhov@gmail.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2022-11-11PCI: imx6: Initialize PHY before deasserting core resetSascha Hauer1-6/+7
When the PHY is the reference clock provider then it must be initialized and powered on before the reset on the client is deasserted, otherwise the link will never come up. The order was changed in cf236e0c0d59. Restore the correct order to make the driver work again on boards where the PHY provides the reference clock. This also changes the order for boards where the Soc is the PHY reference clock divider, but this shouldn't do any harm. Link: https://lore.kernel.org/r/20221101095714.440001-1-s.hauer@pengutronix.de Fixes: cf236e0c0d59 ("PCI: imx6: Do not hide PHY driver callbacks and refine the error handling") Tested-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2022-11-10PCI: dwc: Use dev_info for PCIe link down event loggingVidya Sagar1-1/+1
Some of the platforms (like Tegra194 and Tegra234) have open slots and not having an endpoint connected to the slot is not an error. So, changing the macro from dev_err to dev_info to log the event. Link: https://lore.kernel.org/r/20220913101237.4337-1-vidyas@nvidia.com Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2022-11-10PCI: qcom: Fix error message for reset_control_assert()Manivannan Sadhasivam1-1/+1
Fix the error message to mention "assert" instead of "deassert". Link: https://lore.kernel.org/r/20221109094039.25753-1-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Vinod Koul <vkoul@kernel.org>
2022-10-27PCI: designware-ep: Disable PTM capabilities for EP modeVidya Sagar1-1/+18
Dual mode DesignWare PCIe IP has PTM capability enabled (if supported) even in the EP mode. The PCIe compliance for the EP mode expects PTM capabilities (ROOT_CAPABLE, RES_CAPABLE, CLK_GRAN) be disabled. Hence disable PTM for the EP mode. Link: https://lore.kernel.org/r/20220919143340.4527-3-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Jingoo Han <jingoohan1@gmail.com>
2022-10-27PCI: Add PCI_PTM_CAP_RES macroVidya Sagar1-0/+1
Add macro defining Responder capable bit in Precision Time Measurement capability register. Link: https://lore.kernel.org/r/20220919143340.4527-2-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
2022-10-27PCI: dwc: Fix n_fts[] array overrunVidya Sagar1-1/+1
commit aeaa0bfe89654 ("PCI: dwc: Move N_FTS setup to common setup") incorrectly uses pci->link_gen in deriving the index to the n_fts[] array also introducing the issue of accessing beyond the boundaries of array for greater than Gen-2 speeds. This change fixes that issue. Link: https://lore.kernel.org/r/20220926111923.22487-1-vidyas@nvidia.com Fixes: aeaa0bfe8965 ("PCI: dwc: Move N_FTS setup to common setup") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Jingoo Han <jingoohan1@gmail.com>
2022-10-16Linux 6.1-rc1v6.1-rc1Linus Torvalds1-2/+2
2022-10-16Merge tag 'random-6.1-rc1-for-linus' of ↵Linus Torvalds185-421/+378
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull more random number generator updates from Jason Donenfeld: "This time with some large scale treewide cleanups. The intent of this pull is to clean up the way callers fetch random integers. The current rules for doing this right are: - If you want a secure or an insecure random u64, use get_random_u64() - If you want a secure or an insecure random u32, use get_random_u32() The old function prandom_u32() has been deprecated for a while now and is just a wrapper around get_random_u32(). Same for get_random_int(). - If you want a secure or an insecure random u16, use get_random_u16() - If you want a secure or an insecure random u8, use get_random_u8() - If you want secure or insecure random bytes, use get_random_bytes(). The old function prandom_bytes() has been deprecated for a while now and has long been a wrapper around get_random_bytes() - If you want a non-uniform random u32, u16, or u8 bounded by a certain open interval maximum, use prandom_u32_max() I say "non-uniform", because it doesn't do any rejection sampling or divisions. Hence, it stays within the prandom_*() namespace, not the get_random_*() namespace. I'm currently investigating a "uniform" function for 6.2. We'll see what comes of that. By applying these rules uniformly, we get several benefits: - By using prandom_u32_max() with an upper-bound that the compiler can prove at compile-time is ≤65536 or ≤256, internally get_random_u16() or get_random_u8() is used, which wastes fewer batched random bytes, and hence has higher throughput. - By using prandom_u32_max() instead of %, when the upper-bound is not a constant, division is still avoided, because prandom_u32_max() uses a faster multiplication-based trick instead. - By using get_random_u16() or get_random_u8() in cases where the return value is intended to indeed be a u16 or a u8, we waste fewer batched random bytes, and hence have higher throughput. This series was originally done by hand while I was on an airplane without Internet. Later, Kees and I worked on retroactively figuring out what could be done with Coccinelle and what had to be done manually, and then we split things up based on that. So while this touches a lot of files, the actual amount of code that's hand fiddled is comfortably small" * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: prandom: remove unused functions treewide: use get_random_bytes() when possible treewide: use get_random_u32() when possible treewide: use get_random_{u8,u16}() when possible, part 2 treewide: use get_random_{u8,u16}() when possible, part 1 treewide: use prandom_u32_max() when possible, part 2 treewide: use prandom_u32_max() when possible, part 1
2022-10-16Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of ↵Linus Torvalds36-71/+1265
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools updates from Arnaldo Carvalho de Melo: - Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels when using bperf (perf BPF based counters) with cgroups. - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that monitors bandwidth, latency, bus utilization and buffer occupancy. Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst. - User space tasks can migrate between CPUs, so when tracing selected CPUs, system-wide sideband is still needed, fix it in the setup of Intel PT on hybrid systems. - Fix metricgroups title message in 'perf list', it should state that the metrics groups are to be used with the '-M' option, not '-e'. - Sync the msr-index.h copy with the kernel sources, adding support for using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as well as decoding it when printing the MSR tracepoint arguments. - Fix program header size and alignment when generating a JIT ELF in 'perf inject'. - Add multiple new Intel PT 'perf test' entries, including a jitdump one. - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when running on PowerPC due to an invalid topology number in that arch. - Fix the 'perf test' for arm_coresight failures on the ARM Juno system. - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this option to the or expression expected in the intercepted perf_event_open() syscall. - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the 'perf annotate' asm parser. - Fix 'perf mem record -C' option processing, it was being chopped up when preparing the underlying 'perf record -e mem-events' and thus being ignored, requiring using '-- -C CPUs' as a workaround. - Improvements and tidy ups for 'perf test' shell infra. - Fix Intel PT information printing segfault in uClibc, where a NULL format was being passed to fprintf. * tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (23 commits) tools arch x86: Sync the msr-index.h copy with the kernel sources perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver perf auxtrace arm: Refactor event list iteration in auxtrace_record__init() perf tests stat+json_output: Include sanity check for topology perf tests stat+csv_output: Include sanity check for topology perf intel-pt: Fix system_wide dummy event for hybrid perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc perf test: Fix attr tests for PERF_FORMAT_LOST perf test: test_intel_pt.sh: Add 9 tests perf inject: Fix GEN_ELF_TEXT_OFFSET for jit perf test: test_intel_pt.sh: Add jitdump test perf test: test_intel_pt.sh: Tidy some alignment perf test: test_intel_pt.sh: Print a message when skipping kernel tracing perf test: test_intel_pt.sh: Tidy some perf record options perf test: test_intel_pt.sh: Fix return checking again perf: Skip and warn on unknown format 'configN' attrs perf list: Fix metricgroups title message perf mem: Fix -C option behavior for perf mem record perf annotate: Add missing condition flags for arm64 ...
2022-10-16Merge tag 'kbuild-fixes-v6.1' of ↵Linus Torvalds6-11/+18
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y compile error for the combination of Clang >= 14 and GAS <= 2.35. - Drop vmlinux.bz2 from the rpm package as it just annoyingly increased the package size. - Fix modpost error under build environments using musl. - Make *.ll files keep value names for easier debugging - Fix single directory build - Prevent RISC-V from selecting the broken DWARF5 support when Clang and GAS are used together. * tag 'kbuild-fixes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5 kbuild: fix single directory build kbuild: add -fno-discard-value-names to cmd_cc_ll_c scripts/clang-tools: Convert clang-tidy args to list modpost: put modpost options before argument kbuild: Stop including vmlinux.bz2 in the rpm's Kconfig.debug: add toolchain checks for DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT Kconfig.debug: simplify the dependency of DEBUG_INFO_DWARF4/5
2022-10-16Merge tag 'clk-for-linus' of ↵Linus Torvalds18-129/+1831
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull more clk updates from Stephen Boyd: "This is the final part of the clk patches for this merge window. The clk rate range series needed another week to fully bake. Maxime fixed the bug that broke clk notifiers and prevented this from being included in the first pull request. He also added a unit test on top to make sure it doesn't break so easily again. The majority of the series fixes up how the clk_set_rate_*() APIs work, particularly around when the rate constraints are dropped and how they move around when reparenting clks. Overall it's a much needed improvement to the clk rate range APIs that used to be pretty broken if you looked sideways. Beyond the core changes there are a few driver fixes for a compilation issue or improper data causing clks to fail to register or have the wrong parents. These are good to get in before the first -rc so that the system actually boots on the affected devices" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (31 commits) clk: tegra: Fix Tegra PWM parent clock clk: at91: fix the build with binutils 2.27 clk: qcom: gcc-msm8660: Drop hardcoded fixed board clocks clk: mediatek: clk-mux: Add .determine_rate() callback clk: tests: Add tests for notifiers clk: Update req_rate on __clk_recalc_rates() clk: tests: Add missing test case for ranges clk: qcom: clk-rcg2: Take clock boundaries into consideration for gfx3d clk: Introduce the clk_hw_get_rate_range function clk: Zero the clk_rate_request structure clk: Stop forwarding clk_rate_requests to the parent clk: Constify clk_has_parent() clk: Introduce clk_core_has_parent() clk: Switch from __clk_determine_rate to clk_core_round_rate_nolock clk: Add our request boundaries in clk_core_init_rate_req clk: Introduce clk_hw_init_rate_request() clk: Move clk_core_init_rate_req() from clk_core_round_rate_nolock() to its caller clk: Change clk_core_init_rate_req prototype clk: Set req_rate on reparenting clk: Take into account uncached clocks in clk_set_rate_range() ...
2022-10-16Merge tag '6.1-rc-smb3-client-fixes-part2' of ↵Linus Torvalds23-726/+922
git://git.samba.org/sfrench/cifs-2.6 Pull more cifs updates from Steve French: - fix a regression in guest mounts to old servers - improvements to directory leasing (caching directory entries safely beyond the root directory) - symlink improvement (reducing roundtrips needed to process symlinks) - an lseek fix (to problem where some dir entries could be skipped) - improved ioctl for returning more detailed information on directory change notifications - clarify multichannel interface query warning - cleanup fix (for better aligning buffers using ALIGN and round_up) - a compounding fix - fix some uninitialized variable bugs found by Coverity and the kernel test robot * tag '6.1-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: smb3: improve SMB3 change notification support cifs: lease key is uninitialized in two additional functions when smb1 cifs: lease key is uninitialized in smb1 paths smb3: must initialize two ACL struct fields to zero cifs: fix double-fault crash during ntlmssp cifs: fix static checker warning cifs: use ALIGN() and round_up() macros cifs: find and use the dentry for cached non-root directories also cifs: enable caching of directories for which a lease is held cifs: prevent copying past input buffer boundaries cifs: fix uninitialised var in smb2_compound_op() cifs: improve symlink handling for smb2+ smb3: clarify multichannel warning cifs: fix regression in very old smb1 mounts cifs: fix skipping to incorrect offset in emit_cached_dirents
2022-10-16Revert "cpumask: fix checking valid cpu range".Tetsuo Handa1-8/+11
This reverts commit 78e5a3399421 ("cpumask: fix checking valid cpu range"). syzbot is hitting WARN_ON_ONCE(cpu >= nr_cpumask_bits) warning at cpu_max_bits_warn() [1], for commit 78e5a3399421 ("cpumask: fix checking valid cpu range") is broken. Obviously that patch hits WARN_ON_ONCE() when e.g. reading /proc/cpuinfo because passing "cpu + 1" instead of "cpu" will trivially hit cpu == nr_cpumask_bits condition. Although syzbot found this problem in linux-next.git on 2022/09/27 [2], this problem was not fixed immediately. As a result, that patch was sent to linux.git before the patch author recognizes this problem, and syzbot started failing to test changes in linux.git since 2022/10/10 [3]. Andrew Jones proposed a fix for x86 and riscv architectures [4]. But [2] and [5] indicate that affected locations are not limited to arch code. More delay before we find and fix affected locations, less tested kernel (and more difficult to bisect and fix) before release. We should have inspected and fixed basically all cpumask users before applying that patch. We should not crash kernels in order to ask existing cpumask users to update their code, even if limited to CONFIG_DEBUG_PER_CPU_MAPS=y case. Link: https://syzkaller.appspot.com/bug?extid=d0fd2bf0dd6da72496dd [1] Link: https://syzkaller.appspot.com/bug?extid=21da700f3c9f0bc40150 [2] Link: https://syzkaller.appspot.com/bug?extid=51a652e2d24d53e75734 [3] Link: https://lkml.kernel.org/r/20221014155845.1986223-1-ajones@ventanamicro.com [4] Link: https://syzkaller.appspot.com/bug?extid=4d46c43d81c3bd155060 [5] Reported-by: Andrew Jones <ajones@ventanamicro.com> Reported-by: syzbot+d0fd2bf0dd6da72496dd@syzkaller.appspotmail.com Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Yury Norov <yury.norov@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-17lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5Nathan Chancellor1-2/+7
When building with a RISC-V kernel with DWARF5 debug info using clang and the GNU assembler, several instances of the following error appear: /tmp/vgettimeofday-48aa35.s:2963: Error: non-constant .uleb128 is not supported Dumping the .s file reveals these .uleb128 directives come from .debug_loc and .debug_ranges: .Ldebug_loc0: .byte 4 # DW_LLE_offset_pair .uleb128 .Lfunc_begin0-.Lfunc_begin0 # starting offset .uleb128 .Ltmp1-.Lfunc_begin0 # ending offset .byte 1 # Loc expr size .byte 90 # DW_OP_reg10 .byte 0 # DW_LLE_end_of_list .Ldebug_ranges0: .byte 4 # DW_RLE_offset_pair .uleb128 .Ltmp6-.Lfunc_begin0 # starting offset .uleb128 .Ltmp27-.Lfunc_begin0 # ending offset .byte 4 # DW_RLE_offset_pair .uleb128 .Ltmp28-.Lfunc_begin0 # starting offset .uleb128 .Ltmp30-.Lfunc_begin0 # ending offset .byte 0 # DW_RLE_end_of_list There is an outstanding binutils issue to support a non-constant operand to .sleb128 and .uleb128 in GAS for RISC-V but there does not appear to be any movement on it, due to concerns over how it would work with linker relaxation. To avoid these build errors, prevent DWARF5 from being selected when using clang and an assembler that does not have support for these symbol deltas, which can be easily checked in Kconfig with as-instr plus the small test program from the dwz test suite from the binutils issue. Link: https://sourceware.org/bugzilla/show_bug.cgi?id=27215 Link: https://github.com/ClangBuiltLinux/linux/issues/1719 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-10-17kbuild: fix single directory buildMasahiro Yamada1-0/+2
Commit f110e5a250e3 ("kbuild: refactor single builds of *.ko") was wrong. KBUILD_MODULES _is_ needed for single builds. Otherwise, "make foo/bar/baz/" does not build module objects at all. Fixes: f110e5a250e3 ("kbuild: refactor single builds of *.ko") Reported-by: David Sterba <dsterba@suse.cz> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: David Sterba <dsterba@suse.com>
2022-10-15Merge tag 'slab-for-6.1-rc1-hotfix' of ↵Linus Torvalds2-19/+19
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab hotfix from Vlastimil Babka: "A single fix for the common-kmalloc series, for warnings on mips and sparc64 reported by Guenter Roeck" * tag 'slab-for-6.1-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/slab: use kmalloc_node() for off slab freelist_idx_t array allocation
2022-10-15Merge tag 'for-linus' of https://github.com/openrisc/linuxLinus Torvalds2-9/+9
Pull OpenRISC updates from Stafford Horne: "I have relocated to London so not much work from me while I get settled. Still, OpenRISC picked up two patches in this window: - Fix for kernel page table walking from Jann Horn - MAINTAINER entry cleanup from Palmer Dabbelt" * tag 'for-linus' of https://github.com/openrisc/linux: MAINTAINERS: git://github -> https://github.com for openrisc openrisc: Fix pagewalk usage in arch_dma_{clear, set}_uncached
2022-10-15Merge tag 'pci-v6.1-fixes-1' of ↵Linus Torvalds1-61/+1
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fix from Bjorn Helgaas: "Revert the attempt to distribute spare resources to unconfigured hotplug bridges at boot time. This fixed some dock hot-add scenarios, but Jonathan Cameron reported that it broke a topology with a multi-function device where one function was a Switch Upstream Port and the other was an Endpoint" * tag 'pci-v6.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "PCI: Distribute available resources for root buses, too"
2022-10-15mm/slab: use kmalloc_node() for off slab freelist_idx_t array allocationHyeonggon Yoo2-19/+19
After commit d6a71648dbc0 ("mm/slab: kmalloc: pass requests larger than order-1 page to page allocator"), SLAB passes large ( > PAGE_SIZE * 2) requests to buddy like SLUB does. SLAB has been using kmalloc caches to allocate freelist_idx_t array for off slab caches. But after the commit, freelist_size can be bigger than KMALLOC_MAX_CACHE_SIZE. Instead of using pointer to kmalloc cache, use kmalloc_node() and only check if the kmalloc cache is off slab during calculate_slab_order(). If freelist_size > KMALLOC_MAX_CACHE_SIZE, no looping condition happens as it allocates freelist_idx_t array directly from buddy. Link: https://lore.kernel.org/all/20221014205818.GA1428667@roeck-us.net/ Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Fixes: d6a71648dbc0 ("mm/slab: kmalloc: pass requests larger than order-1 page to page allocator") Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-10-15MAINTAINERS: git://github -> https://github.com for openriscPalmer Dabbelt1-1/+1
Github deprecated the git:// links about a year ago, so let's move to the https:// URLs instead. Reported-by: Conor Dooley <conor.dooley@microchip.com> Link: https://github.blog/2021-09-01-improving-git-protocol-security-github/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Stafford Horne <shorne@gmail.com>
2022-10-15smb3: improve SMB3 change notification supportSteve French6-13/+90
Change notification is a commonly supported feature by most servers, but the current ioctl to request notification when a directory is changed does not return the information about what changed (even though it is returned by the server in the SMB3 change notify response), it simply returns when there is a change. This ioctl improves upon CIFS_IOC_NOTIFY by returning the notify information structure which includes the name of the file(s) that changed and why. See MS-SMB2 2.2.35 for details on the individual filter flags and the file_notify_information structure returned. To use this simply pass in the following (with enough space to fit at least one file_notify_information structure) struct __attribute__((__packed__)) smb3_notify { uint32_t completion_filter; bool watch_tree; uint32_t data_len; uint8_t data[]; } __packed; using CIFS_IOC_NOTIFY_INFO 0xc009cf0b or equivalently _IOWR(CIFS_IOCTL_MAGIC, 11, struct smb3_notify_info) The ioctl will block until the server detects a change to that directory or its subdirectories (if watch_tree is set). Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15cifs: lease key is uninitialized in two additional functions when smb1Steve French1-2/+2
cifs_open and _cifsFileInfo_put also end up with lease_key uninitialized in smb1 mounts. It is cleaner to set lease key to zero in these places where leases are not supported (smb1 can not return lease keys so the field was uninitialized). Addresses-Coverity: 1514207 ("Uninitialized scalar variable") Addresses-Coverity: 1514331 ("Uninitialized scalar variable") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15cifs: lease key is uninitialized in smb1 pathsSteve French1-1/+1
It is cleaner to set lease key to zero in the places where leases are not supported (smb1 can not return lease keys so the field was uninitialized). Addresses-Coverity: 1513994 ("Uninitialized scalar variable") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15smb3: must initialize two ACL struct fields to zeroSteve French1-1/+2
Coverity spotted that we were not initalizing Stbz1 and Stbz2 to zero in create_sd_buf. Addresses-Coverity: 1513848 ("Uninitialized scalar variable") Cc: <stable@vger.kernel.org> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15cifs: fix double-fault crash during ntlmsspPaulo Alcantara1-7/+9
The crash occurred because we were calling memzero_explicit() on an already freed sess_data::iov[1] (ntlmsspblob) in sess_free_buffer(). Fix this by not calling memzero_explicit() on sess_data::iov[1] as it's already by handled by callers. Fixes: a4e430c8c8ba ("cifs: replace kfree() with kfree_sensitive() for sensitive data") Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo1-0/+18
To pick up the changes in: b8d1d163604bd1e6 ("x86/apic: Don't disable x2APIC if locked") ca5b7c0d9621702e ("perf/x86/amd/lbr: Add LbrExtV2 branch record support") Addressing these tools/perf build warnings: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' That makes the beautification scripts to pick some new entries: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after --- before 2022-10-14 18:06:34.294561729 -0300 +++ after 2022-10-14 18:06:41.285744044 -0300 @@ -264,6 +264,7 @@ [0xc0000102 - x86_64_specific_MSRs_offset] = "KERNEL_GS_BASE", [0xc0000103 - x86_64_specific_MSRs_offset] = "TSC_AUX", [0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO", + [0xc000010e - x86_64_specific_MSRs_offset] = "AMD64_LBR_SELECT", [0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG", [0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS", [0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL", $ Now one can trace systemwide asking to see backtraces to where that MSR is being read/written, see this example with a previous update: # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB" ^C# If we use -v (verbose mode) we can see what it does behind the scenes: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB" Using CPUID AuthenticAMD-25-21-0 0x6a0 0x6a8 New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313) 0x6a0 0x6a8 New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313) mmap size 528384B ^C# Example with a frequent msr: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2 Using CPUID AuthenticAMD-25-21-0 0x48 New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) 0x48 New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) mmap size 528384B Looking at the vmlinux_path (8 entries long) symsrc__init: build id mismatch for vmlinux. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols 0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule ([kernel.kallsyms]) futex_wait_queue_me ([kernel.kallsyms]) futex_wait ([kernel.kallsyms]) do_futex ([kernel.kallsyms]) __x64_sys_futex ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so) 0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule_idle ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) cpu_startup_entry ([kernel.kallsyms]) secondary_startup_64_no_verify ([kernel.kallsyms]) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Link: https://lore.kernel.org/lkml/Y0nQkz2TUJxwfXJd@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packetQi Liu7-0/+396
Add support for using 'perf report --dump-raw-trace' to parse PTT packet. Example usage: Output will contain raw PTT data and its textual representation, such as (8DW format): 0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x400000 offset: 0 ref: 0xa5d50c725 idx: 0 tid: -1 cpu: 0 . . ... HISI PTT data: size 4194304 bytes . 00000000: 00 00 00 00 Prefix . 00000004: 08 20 00 60 Header DW0 . 00000008: ff 02 00 01 Header DW1 . 0000000c: 20 08 00 00 Header DW2 . 00000010: 10 e7 44 ab Header DW3 . 00000014: 2a a8 1e 01 Time . 00000020: 00 00 00 00 Prefix . 00000024: 01 00 00 60 Header DW0 . 00000028: 0f 1e 00 01 Header DW1 . 0000002c: 04 00 00 00 Header DW2 . 00000030: 40 00 81 02 Header DW3 . 00000034: ee 02 00 00 Time .... This patch only add basic parsing support according to the definition of the PTT packet described in Documentation/trace/hisi-ptt.rst. And the fields of each packet can be further decoded following the PCIe Spec's definition of TLP packet. Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Liu <liuqi6124@gmail.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zeng Prime <prime.zeng@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20220927081400.14364-4-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driverQi Liu7-1/+273
HiSilicon PCIe tune and trace device (PTT) could dynamically tune the PCIe link's events, and trace the TLP headers). This patch add support for PTT device in perf tool, so users could use 'perf record' to get TLP headers trace data. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Liu <liuqi6124@gmail.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zeng Prime <prime.zeng@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20220927081400.14364-3-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()Qi Liu1-19/+34
Add find_pmu_for_event() and use to simplify logic in auxtrace_record_init(). find_pmu_for_event() will be reused in subsequent patches. Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Liu <liuqi6124@gmail.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zeng Prime <prime.zeng@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20220927081400.14364-2-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15perf tests stat+json_output: Include sanity check for topologyAthira Rajeev1-4/+39
Testcase stat+json_output.sh fails in powerpc: 86: perf stat JSON output linter : FAILED! The testcase "stat+json_output.sh" verifies perf stat JSON output. The test covers aggregation modes like per-socket, per-core, per-die, -A (no_aggr mode) along with few other tests. It counts expected fields for various commands. For example say -A (i.e, AGGR_NONE mode), expects 7 fields in the output having "CPU" as first field. Same way, for per-socket, it expects the first field in result to point to socket id. The testcases compares the result with expected count. The values for socket, die, core and cpu are fetched from topology directory: /sys/devices/system/cpu/cpu*/topology. For example, socket value is fetched from "physical_package_id" file of topology directory. (cpu__get_topology_int() in util/cpumap.c) If a platform fails to fetch the topology information, values will be set to -1. For example, incase of pSeries platform of powerpc, value for "physical_package_id" is restricted and not exposed. So, -1 will be assigned. Perf code has a checks for valid cpu id in "aggr_printout" (stat-display.c), which displays the fields. So, in cases where topology values not exposed, first field of the output displaying will be empty. This cause the testcase to fail, as it counts number of fields in the output. Incase of -A (AGGR_NONE mode,), testcase expects 7 fields in the output, becos of -1 value obtained from topology files for some, only 6 fields are printed. Hence a testcase failure reported due to mismatch in number of fields in the output. Patch here adds a sanity check in the testcase for topology. Check will help to skip the test if -1 value found. Fixes: 0c343af2a2f82844 ("perf test: JSON format checking") Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Suggested-by: Ian Rogers <irogers@google.com> Suggested-by: James Clark <james.clark@arm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Claire Jensen <cjense@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Link: https://lore.kernel.org/r/20221006155149.67205-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15perf tests stat+csv_output: Include sanity check for topologyAthira Rajeev1-4/+39
Testcase stat+csv_output.sh fails in powerpc: 84: perf stat CSV output linter: FAILED! The testcase "stat+csv_output.sh" verifies perf stat CSV output. The test covers aggregation modes like per-socket, per-core, per-die, -A (no_aggr mode) along with few other tests. It counts expected fields for various commands. For example say -A (i.e, AGGR_NONE mode), expects 7 fields in the output having "CPU" as first field. Same way, for per-socket, it expects the first field in result to point to socket id. The testcases compares the result with expected count. The values for socket, die, core and cpu are fetched from topology directory: /sys/devices/system/cpu/cpu*/topology. For example, socket value is fetched from "physical_package_id" file of topology directory. (cpu__get_topology_int() in util/cpumap.c) If a platform fails to fetch the topology information, values will be set to -1. For example, incase of pSeries platform of powerpc, value for "physical_package_id" is restricted and not exposed. So, -1 will be assigned. Perf code has a checks for valid cpu id in "aggr_printout" (stat-display.c), which displays the fields. So, in cases where topology values not exposed, first field of the output displaying will be empty. This cause the testcase to fail, as it counts number of fields in the output. Incase of -A (AGGR_NONE mode,), testcase expects 7 fields in the output, becos of -1 value obtained from topology files for some, only 6 fields are printed. Hence a testcase failure reported due to mismatch in number of fields in the output. Patch here adds a sanity check in the testcase for topology. Check will help to skip the test if -1 value found. Fixes: 7473ee56dbc91c98 ("perf test: Add checking for perf stat CSV output.") Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Suggested-by: Ian Rogers <irogers@google.com> Suggested-by: James Clark <james.clark@arm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Claire Jensen <cjense@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Link: https://lore.kernel.org/r/20221006155149.67205-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15perf intel-pt: Fix system_wide dummy event for hybridAdrian Hunter1-1/+1
User space tasks can migrate between CPUs, so when tracing selected CPUs, system-wide sideband is still needed, however evlist->core.has_user_cpus is not set in the hybrid case, so check the target cpu_list instead. Fixes: 7d189cadbeebc778 ("perf intel-pt: Track sideband system-wide when needed") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221012082259.22394-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15perf intel-pt: Fix segfault in intel_pt_print_info() with uClibcAdrian Hunter1-2/+7
uClibc segfaulted because NULL was passed as the format to fprintf(). That happened because one of the format strings was missing and intel_pt_print_info() didn't check that before calling fprintf(). Add the missing format string, and check format is not NULL before calling fprintf(). Fixes: 11fa7cb86b56d361 ("perf tools: Pass Intel PT information for decoding MTC and CYC") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221012082259.22394-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15perf test: Fix attr tests for PERF_FORMAT_LOSTJames Clark6-11/+11
Since PERF_FORMAT_LOST was added, the default read format has that bit set, so add it to the tests. Keep the old value as well so that the test still passes on older kernels. This fixes the following failure: expected read_format=0|4, got 20 FAILED './tests/attr/test-record-C0' - match failure Fixes: 85b425f31c8866e0 ("perf record: Set PERF_FORMAT_LOST by default") Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221012094633.21669-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15perf test: test_intel_pt.sh: Add 9 testsAmmy Yi1-1/+194
Add tests: Test with MTC and TSC disabled Test with branches disabled Test with/without CYC Test recording with sample mode Test with kernel trace Test virtual LBR Test power events Test with TNT packets disabled Test with event_trace These tests mostly check that perf record works with the corresponding Intel PT config terms, sometimes also checking that certain packets do or do not appear in the resulting trace as appropriate. The "Test virtual LBR" is slightly trickier, using a Python script to check that branch stacks are actually synthesized. Signed-off-by: Ammy Yi <ammy.yi@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20221014170905.64069-8-adrian.hunter@intel.com Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>