summaryrefslogtreecommitdiffstats
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2022-06-07libbpf: Add enum64 parsing and new enum64 public APIYonghong Song3-0/+117
Add enum64 parsing support and two new enum64 public APIs: btf__add_enum64 btf__add_enum64_value Also add support of signedness for BTF_KIND_ENUM. The BTF_KIND_ENUM API signatures are not changed. The signedness will be changed from unsigned to signed if btf__add_enum_value() finds any negative values. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062621.3719391-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07libbpf: Refactor btf__add_enum() for future code sharingYonghong Song1-15/+21
Refactor btf__add_enum() function to create a separate function btf_add_enum_common() so later the common function can be used to add enum64 btf type. There is no functionality change for this patch. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062615.3718063-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07libbpf: Fix an error in 64bit relocation value computationYonghong Song1-1/+1
Currently, the 64bit relocation value in the instruction is computed as follows: __u64 imm = insn[0].imm + ((__u64)insn[1].imm << 32) Suppose insn[0].imm = -1 (0xffffffff) and insn[1].imm = 1. With the above computation, insn[0].imm will first sign-extend to 64bit -1 (0xffffffffFFFFFFFF) and then add 0x1FFFFFFFF, producing incorrect value 0xFFFFFFFF. The correct value should be 0x1FFFFFFFF. Changing insn[0].imm to __u32 first will prevent 64bit sign extension and fix the issue. Merging high and low 32bit values also changed from '+' to '|' to be consistent with other similar occurences in kernel and libbpf. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062610.3717378-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07libbpf: Permit 64bit relocation valueYonghong Song2-24/+29
Currently, the libbpf limits the relocation value to be 32bit since all current relocations have such a limit. But with BTF_KIND_ENUM64 support, the enum value could be 64bit. So let us permit 64bit relocation value in libbpf. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062605.3716779-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07bpf: Add btf enum64 supportYonghong Song1-3/+14
Currently, BTF only supports upto 32bit enum value with BTF_KIND_ENUM. But in kernel, some enum indeed has 64bit values, e.g., in uapi bpf.h, we have enum { BPF_F_INDEX_MASK = 0xffffffffULL, BPF_F_CURRENT_CPU = BPF_F_INDEX_MASK, BPF_F_CTXLEN_MASK = (0xfffffULL << 32), }; In this case, BTF_KIND_ENUM will encode the value of BPF_F_CTXLEN_MASK as 0, which certainly is incorrect. This patch added a new btf kind, BTF_KIND_ENUM64, which permits 64bit value to cover the above use case. The BTF_KIND_ENUM64 has the following three fields followed by the common type: struct bpf_enum64 { __u32 nume_off; __u32 val_lo32; __u32 val_hi32; }; Currently, btf type section has an alignment of 4 as all element types are u32. Representing the value with __u64 will introduce a pad for bpf_enum64 and may also introduce misalignment for the 64bit value. Hence, two members of val_hi32 and val_lo32 are chosen to avoid these issues. The kflag is also introduced for BTF_KIND_ENUM and BTF_KIND_ENUM64 to indicate whether the value is signed or unsigned. The kflag intends to provide consistent output of BTF C fortmat with the original source code. For example, the original BTF_KIND_ENUM bit value is 0xffffffff. The format C has two choices, printing out 0xffffffff or -1 and current libbpf prints out as unsigned value. But if the signedness is preserved in btf, the value can be printed the same as the original source code. The kflag value 0 means unsigned values, which is consistent to the default by libbpf and should also cover most cases as well. The new BTF_KIND_ENUM64 is intended to support the enum value represented as 64bit value. But it can represent all BTF_KIND_ENUM values as well. The compiler ([1]) and pahole will generate BTF_KIND_ENUM64 only if the value has to be represented with 64 bits. In addition, a static inline function btf_kind_core_compat() is introduced which will be used later when libbpf relo_core.c changed. Here the kernel shares the same relo_core.c with libbpf. [1] https://reviews.llvm.org/D124641 Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062600.3716578-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-03selftests/bpf: Add drv mode testing for xdpingHangbin Liu1-0/+4
As subject, we only test SKB mode for xdping at present. Now add DRV mode for xdping. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220602032507.464453-1-liuhangbin@gmail.com
2022-06-03libbpf: Fix is_pow_of_2Yuze Chi3-10/+5
Move the correct definition from linker.c into libbpf_internal.h. Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary") Reported-by: Yuze Chi <chiyuze@google.com> Signed-off-by: Yuze Chi <chiyuze@google.com> Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220603055156.2830463-1-irogers@google.com
2022-06-03selftests/bpf: Fix tc_redirect_dtimeMartin KaFai Lau2-6/+55
tc_redirect_dtime was reported flaky from time to time. It always fails at the udp test and complains about the bpf@tc-ingress got a skb->tstamp when handling udp packet. It is unexpected because the skb->tstamp should have been cleared when crossing different netns. The most likely cause is that the skb is actually a tcp packet from the earlier tcp test. It could be the final TCP_FIN handling. This patch tightens the skb->tstamp check in the bpf prog. It ensures the skb is the current testing traffic. First, it checks that skb matches the IPPROTO of the running test (i.e. tcp vs udp). Second, it checks the server port (dst_ns_port). The server port is unique for each test (50000 + test_enum). Also fixed a typo in test_udp_dtime(): s/P100/P101/ Fixes: c803475fd8dd ("bpf: selftests: test skb->tstamp in redirect_neigh") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220601234050.2572671-1-kafai@fb.com
2022-06-03libbpf: Fix a couple of typosDaniel Müller3-6/+6
This change fixes a couple of typos that were encountered while studying the source code. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220601154025.3295035-1-deso@posteo.net
2022-06-02libbpf: Fix determine_ptr_size() guessingDouglas Raillard1-6/+20
One strategy employed by libbpf to guess the pointer size is by finding the size of "unsigned long" type. This is achieved by looking for a type of with the expected name and checking its size. Unfortunately, the C syntax is friendlier to humans than to computers as there is some variety in how such a type can be named. Specifically, gcc and clang do not use the same names for integer types in debug info: - clang uses "unsigned long" - gcc uses "long unsigned int" Lookup all the names for such a type so that libbpf can hope to find the information it wants. Signed-off-by: Douglas Raillard <douglas.raillard@arm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220524094447.332186-1-douglas.raillard@arm.com
2022-06-02bpftool: Check for NULL ptr of btf in codegen_assertsMichael Mullin1-0/+3
bpf_object__btf() can return a NULL value. If bpf_object__btf returns null, do not progress through codegen_asserts(). This avoids a null ptr dereference at the call btf__type_cnt() in the function find_type_for_map() Signed-off-by: Michael Mullin <masmullin@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220523194917.igkgorco42537arb@jup
2022-06-02selftests/bpf: Fix test_run logic in fexit_stress.cYuntao Wang1-28/+4
In the commit da00d2f117a0 ("bpf: Add test ops for BPF_PROG_TYPE_TRACING"), the bpf_fentry_test1 function was moved into bpf_prog_test_run_tracing(), which is the test_run function of the tracing BPF programs. Thus calling 'bpf_prog_test_run_opts(filter_fd, &topts)' will not trigger bpf_fentry_test1 function as filter_fd is a sk_filter BPF program. Fix it by replacing filter_fd with fexit_fd in the bpf_prog_test_run_opts() function. Fixes: da00d2f117a0 ("bpf: Add test ops for BPF_PROG_TYPE_TRACING") Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220521151329.648013-1-ytcoode@gmail.com
2022-06-02bpftool: Use libbpf_bpf_link_type_strDaniel Müller1-17/+10
This change switches bpftool over to using the recently introduced libbpf_bpf_link_type_str function instead of maintaining its own string representation for the bpf_link_type enum. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-13-deso@posteo.net
2022-06-02selftests/bpf: Add test for libbpf_bpf_link_type_strDaniel Müller1-0/+48
This change adds a test for libbpf_bpf_link_type_str. The test retrieves all variants of the bpf_link_type enumeration using BTF and makes sure that the function under test works as expected for them. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-12-deso@posteo.net
2022-06-02libbpf: Introduce libbpf_bpf_link_type_strDaniel Müller3-0/+31
This change introduces a new function, libbpf_bpf_link_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_link_type enum variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-11-deso@posteo.net
2022-06-02bpftool: Use libbpf_bpf_attach_type_strDaniel Müller9-150/+182
This change switches bpftool over to using the recently introduced libbpf_bpf_attach_type_str function instead of maintaining its own string representation for the bpf_attach_type enum. Note that contrary to other enum types, the variant names that bpftool maps bpf_attach_type to do not adhere a simple to follow rule. With bpf_prog_type, for example, the textual representation can easily be inferred by stripping the BPF_PROG_TYPE_ prefix and lowercasing the remaining string. bpf_attach_type violates this rule for various variants. We decided to fix up this deficiency with this change, meaning that bpftool uses the same textual representations as libbpf. Supporting tests, completion scripts, and man pages have been adjusted accordingly. However, we did add support for accepting (the now undocumented) original attach type names when they are provided by users. For the test (test_bpftool_synctypes.py), I have removed the enum representation checks, because we no longer mirror the various enum variant names in bpftool source code. For the man page, help text, and completion script checks we are now using enum definitions from uapi/linux/bpf.h as the source of truth directly. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-10-deso@posteo.net
2022-06-02selftests/bpf: Add test for libbpf_bpf_attach_type_strDaniel Müller1-0/+48
This change adds a test for libbpf_bpf_attach_type_str. The test retrieves all variants of the bpf_attach_type enumeration using BTF and makes sure that the function under test works as expected for them. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-9-deso@posteo.net
2022-06-02libbpf: Introduce libbpf_bpf_attach_type_strDaniel Müller3-0/+64
This change introduces a new function, libbpf_bpf_attach_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_attach_type variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-8-deso@posteo.net
2022-06-02bpftool: Use libbpf_bpf_map_type_strDaniel Müller4-79/+71
This change switches bpftool over to using the recently introduced libbpf_bpf_map_type_str function instead of maintaining its own string representation for the bpf_map_type enum. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-7-deso@posteo.net
2022-06-02selftests/bpf: Add test for libbpf_bpf_map_type_strDaniel Müller1-1/+55
This change adds a test for libbpf_bpf_map_type_str. The test retrieves all variants of the bpf_map_type enumeration using BTF and makes sure that the function under test works as expected for them. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-6-deso@posteo.net
2022-06-02libbpf: Introduce libbpf_bpf_map_type_strDaniel Müller3-0/+52
This change introduces a new function, libbpf_bpf_map_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_map_type enum variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-5-deso@posteo.net
2022-06-02bpftool: Use libbpf_bpf_prog_type_strDaniel Müller6-92/+65
This change switches bpftool over to using the recently introduced libbpf_bpf_prog_type_str function instead of maintaining its own string representation for the bpf_prog_type enum. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-4-deso@posteo.net
2022-06-02selftests/bpf: Add test for libbpf_bpf_prog_type_strDaniel Müller1-0/+57
This change adds a test for libbpf_bpf_prog_type_str. The test retrieves all variants of the bpf_prog_type enumeration using BTF and makes sure that the function under test works as expected for them. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-3-deso@posteo.net
2022-06-02libbpf: Introduce libbpf_bpf_prog_type_strDaniel Müller3-0/+55
This change introduces a new function, libbpf_bpf_prog_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_prog_type variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-2-deso@posteo.net
2022-06-02Merge tag 'net-5.19-rc1' of ↵Linus Torvalds3-13/+14
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. Current release - new code bugs: - af_packet: make sure to pull the MAC header, avoid skb panic in GSO - ptp_clockmatrix: fix inverted logic in is_single_shot() - netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag - dt-bindings: net: adin: fix adi,phy-output-clock description syntax - wifi: iwlwifi: pcie: rename CAUSE macro, avoid MIPS build warning Previous releases - regressions: - Revert "net: af_key: add check for pfkey_broadcast in function pfkey_process" - tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd - nf_tables: disallow non-stateful expression in sets earlier - nft_limit: clone packet limits' cost value - nf_tables: double hook unregistration in netns path - ping6: fix ping -6 with interface name Previous releases - always broken: - sched: fix memory barriers to prevent skbs from getting stuck in lockless qdiscs - neigh: set lower cap for neigh_managed_work rearming, avoid constantly scheduling the probe work - bpf: fix probe read error on big endian in ___bpf_prog_run() - amt: memory leak and error handling fixes Misc: - ipv6: expand & rename accept_unsolicited_na to accept_untracked_na" * tag 'net-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits) net/af_packet: make sure to pull mac header net: add debug info to __skb_pull() net: CONFIG_DEBUG_NET depends on CONFIG_NET stmmac: intel: Add RPL-P PCI ID net: stmmac: use dev_err_probe() for reporting mdio bus registration failure tipc: check attribute length for bearer name ice: fix access-beyond-end in the switch code nfp: remove padding in nfp_nfdk_tx_desc ax25: Fix ax25 session cleanup problems net: usb: qmi_wwan: Add support for Cinterion MV31 with new baseline sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels sfc/siena: fix considering that all channels have TX queues socket: Don't use u8 type in uapi socket.h net/sched: act_api: fix error code in tcf_ct_flow_table_fill_tuple_ipv6() net: ping6: Fix ping -6 with interface name macsec: fix UAF bug for real_dev octeontx2-af: fix error code in is_valid_offset() wifi: mac80211: fix use-after-free in chanctx code bonding: guard ns_targets by CONFIG_IPV6 tcp: tcp_rtx_synack() can be called from process context ...
2022-06-02Merge tag 'memblock-v5.19-rc1' of ↵Linus Torvalds2-117/+278
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock test suite updates from Mike Rapoport: "Comment updates for memblock test suite Update comments in the memblock tests so that they will have consistent style" * tag 'memblock-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock tests: remove completed TODO item memblock tests: update style of comments for memblock_free_*() functions memblock tests: update style of comments for memblock_remove_*() functions memblock tests: update style of comments for memblock_reserve_*() functions memblock tests: update style of comments for memblock_add_*() functions
2022-05-31Merge tag 'riscv-for-linus-5.19-mw0' of ↵Linus Torvalds2-16/+9
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the Svpbmt extension, which allows memory attributes to be encoded in pages - Support for the Allwinner D1's implementation of page-based memory attributes - Support for running rv32 binaries on rv64 systems, via the compat subsystem - Support for kexec_file() - Support for the new generic ticket-based spinlocks, which allows us to also move to qrwlock. These should have already gone in through the asm-geneic tree as well - A handful of cleanups and fixes, include some larger ones around atomics and XIP * tag 'riscv-for-linus-5.19-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits) RISC-V: Prepare dropping week attribute from arch_kexec_apply_relocations[_add] riscv: compat: Using seperated vdso_maps for compat_vdso_info RISC-V: Fix the XIP build RISC-V: Split out the XIP fixups into their own file RISC-V: ignore xipImage RISC-V: Avoid empty create_*_mapping definitions riscv: Don't output a bogus mmu-type on a no MMU kernel riscv: atomic: Add custom conditional atomic operation implementation riscv: atomic: Optimize dec_if_positive functions riscv: atomic: Cleanup unnecessary definition RISC-V: Load purgatory in kexec_file RISC-V: Add purgatory RISC-V: Support for kexec_file on panic RISC-V: Add kexec_file support RISC-V: use memcpy for kexec_file mode kexec_file: Fix kexec_file.c build error for riscv platform riscv: compat: Add COMPAT Kbuild skeletal support riscv: compat: ptrace: Add compat_arch_ptrace implement riscv: compat: signal: Add rt_frame implementation riscv: add memory-type errata for T-Head ...
2022-05-31net/ipv6: Expand and rename accept_unsolicited_na to accept_untracked_naArun Ajith S1-12/+11
RFC 9131 changes default behaviour of handling RX of NA messages when the corresponding entry is absent in the neighbour cache. The current implementation is limited to accept just unsolicited NAs. However, the RFC is more generic where it also accepts solicited NAs. Both types should result in adding a STALE entry for this case. Expand accept_untracked_na behaviour to also accept solicited NAs to be compliant with the RFC and rename the sysctl knob to accept_untracked_na. Fixes: f9a2fb73318e ("net/ipv6: Introduce accept_unsolicited_na knob to implement router-side changes for RFC9131") Signed-off-by: Arun Ajith S <aajith@arista.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220530101414.65439-1-aajith@arista.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-30Merge tag 'mips_5.19' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: "Cleanups and fixes" * tag 'mips_5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (38 commits) MIPS: RALINK: Define pci_remap_iospace under CONFIG_PCI_DRIVERS_GENERIC MIPS: Use memblock_add_node() in early_parse_mem() under CONFIG_NUMA MIPS: Return -EINVAL if mem parameter is empty in early_parse_mem() MIPS: Kconfig: Fix indentation and add endif comment MIPS: bmips: Fix compiler warning observed on W=1 build MIPS: Rewrite `csum_tcpudp_nofold' in plain C mips: setup: use strscpy to replace strlcpy MIPS: Octeon: add SNIC10E board MIPS: Ingenic: Refresh defconfig for CU1000-Neo and CU1830-Neo. MIPS: Ingenic: Refresh device tree for Ingenic SoCs and boards. MIPS: Ingenic: Add PWM nodes for X1830. MIPS: Octeon: fix typo in comment MIPS: loongson32: Kconfig: Remove extra space MIPS: Sibyte: remove unnecessary return variable MIPS: Use NOKPROBE_SYMBOL() instead of __kprobes annotation selftests/ftrace: Save kprobe_events to test log MIPS: tools: no need to initialise statics to 0 MIPS: Loongson: Use hwmon_device_register_with_groups() to register hwmon MIPS: VR41xx: Drop redundant spinlock initialization MIPS: smp: optimization for flush_tlb_mm when exiting ...
2022-05-29Merge tag 'trace-tools-v5.19' of ↵Linus Torvalds8-49/+151
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing tool updates from Steven Rostedt: - Various clean ups and fixes to rtla (Real Time Linux Analysis) * tag 'trace-tools-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: rtla: Remove procps-ng dependency rtla: Fix __set_sched_attr error message rtla: Minor grammar fix for rtla README rtla: Don't overwrite existing directory mode rtla: Avoid record NULL pointer dereference rtla/Makefile: Properly handle dependencies
2022-05-29Merge tag 'perf-tools-for-v5.19-2022-05-28' of ↵Linus Torvalds44-197/+1594
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools updates from Arnaldo Carvalho de Melo: - Add BPF based off-CPU profiling - Improvements for system wide recording, specially for Intel PT - Improve DWARF unwinding on arm64 - Support Arm CoreSight trace data disassembly in 'perf script' python - Fix build with new libbpf version, related to supporting older versions of distro released libbpf packages - Fix event syntax error caused by ExtSel in the JSON events infra - Use stdio interface if slang is not supported in 'perf c2c' - Add 'perf test' checking for perf stat CSV output - Sync the msr-index.h copy with the kernel sources * tag 'perf-tools-for-v5.19-2022-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (38 commits) tools arch x86: Sync the msr-index.h copy with the kernel sources perf scripts python: Support Arm CoreSight trace data disassembly perf scripting python: Expose dso and map information perf jevents: Fix event syntax error caused by ExtSel perf tools arm64: Add support for VG register perf unwind arm64: Decouple Libunwind register names from Perf perf unwind: Use dynamic register set for DWARF unwind perf tools arm64: Copy perf_regs.h from the kernel perf unwind arm64: Use perf's copy of kernel headers perf c2c: Use stdio interface if slang is not supported perf test: Add a basic offcpu profiling test perf record: Add cgroup support for off-cpu profiling perf record: Handle argument change in sched_switch perf record: Implement basic filtering for off-cpu perf record: Enable off-cpu analysis with BPF perf report: Do not extend sample type of bpf-output event perf test: Add checking for perf stat CSV output. perf tools: Allow system-wide events to keep their own threads perf tools: Allow system-wide events to keep their own CPUs libperf evsel: Add comments for booleans ...
2022-05-28Merge tag 'powerpc-5.19-1' of ↵Linus Torvalds10-59/+269
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT) - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later) - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later - Drop support for system call instruction emulation - Many other small features and fixes Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng. * tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits) powerpc/64: Include cache.h directly in paca.h powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set powerpc/xics: Include missing header powerpc/powernv/pci: Drop VF MPS fixup powerpc/fsl_book3e: Don't set rodata RO too early powerpc/microwatt: Add mmu bits to device tree powerpc/powernv/flash: Check OPAL flash calls exist before using powerpc/powermac: constify device_node in of_irq_parse_oldworld() powerpc/powermac: add missing g5_phy_disable_cpu1() declaration selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch" powerpc: Enable the DAWR on POWER9 DD2.3 and above powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask powerpc: Fix all occurences of "the the" selftests/powerpc/pmu/ebb: remove fixed_instruction.S powerpc/platforms/83xx: Use of_device_get_match_data() powerpc/eeh: Drop redundant spinlock initialization powerpc/iommu: Add missing of_node_put in iommu_init_early_dart powerpc/pseries/vas: Call misc_deregister if sysfs init fails powerpc/papr_scm: Fix leaking nvdimm_events_map elements ...
2022-05-27Merge tag 'cxl-for-5.19' of ↵Linus Torvalds4-28/+31
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl updates from Dan Williams: "Compute Express Link (CXL) updates for this cycle. The highlight is new driver-core infrastructure and CXL subsystem changes for allowing lockdep to validate device_lock() usage. Thanks to PeterZ for setting me straight on the current capabilities of the lockdep API, and Greg acked it as well. On the CXL ACPI side this update adds support for CXL _OSC so that platform firmware knows that it is safe to still grant Linux native control of PCIe hotplug and error handling in the presence of CXL devices. A circular dependency problem was discovered between suspend and CXL memory for cases where the suspend image might be stored in CXL memory where that image also contains the PCI register state to restore to re-enable the device. Disable suspend for now until an architecture is defined to clarify that conflict. Lastly a collection of reworks, fixes, and cleanups to the CXL subsystem where support for snooping mailbox commands and properly handling the "mem_enable" flow are the highlights. Summary: - Add driver-core infrastructure for lockdep validation of device_lock(), and fixup a deadlock report that was previously hidden behind the 'lockdep no validate' policy. - Add CXL _OSC support for claiming native control of CXL hotplug and error handling. - Disable suspend in the presence of CXL memory unless and until a protocol is identified for restoring PCI device context from memory hosted on CXL PCI devices. - Add support for snooping CXL mailbox commands to protect against inopportune changes, like set-partition with the 'immediate' flag set. - Rework how the driver detects legacy CXL 1.1 configurations (CXL DVSEC / 'mem_enable') before enabling new CXL 2.0 decode configurations (CXL HDM Capability). - Miscellaneous cleanups and fixes from -next exposure" * tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (47 commits) cxl/port: Enable HDM Capability after validating DVSEC Ranges cxl/port: Reuse 'struct cxl_hdm' context for hdm init cxl/port: Move endpoint HDM Decoder Capability init to port driver cxl/pci: Drop @info argument to cxl_hdm_decode_init() cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init() cxl/mem: Skip range enumeration if mem_enable clear cxl/mem: Consolidate CXL DVSEC Range enumeration in the core cxl/pci: Move cxl_await_media_ready() to the core cxl/mem: Validate port connectivity before dvsec ranges cxl/mem: Fix cxl_mem_probe() error exit cxl/pci: Drop wait_for_valid() from cxl_await_media_ready() cxl/pci: Consolidate wait_for_media() and wait_for_media_ready() cxl/mem: Drop mem_enabled check from wait_for_media() nvdimm: Fix firmware activation deadlock scenarios device-core: Kill the lockdep_mutex nvdimm: Drop nd_device_lock() ACPI: NFIT: Drop nfit_device_lock() nvdimm: Replace lockdep_mutex with local lock classes cxl: Drop cxl_device_lock() cxl/acpi: Add root device lockdep validation ...
2022-05-27Merge tag 'v5.19-p1' of ↵Linus Torvalds3-0/+418
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Test in-place en/decryption with two sglists in testmgr - Fix process vs softirq race in cryptd Algorithms: - Add arm64 acceleration for sm4 - Add s390 acceleration for chacha20 Drivers: - Add polarfire soc hwrng support in mpsf - Add support for TI SoC AM62x in sa2ul - Add support for ATSHA204 cryptochip in atmel-sha204a - Add support for PRNG in caam - Restore support for storage encryption in qat - Restore support for storage encryption in hisilicon/sec" * tag 'v5.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits) hwrng: omap3-rom - fix using wrong clk_disable() in omap_rom_rng_runtime_resume() crypto: hisilicon/sec - delete the flag CRYPTO_ALG_ALLOCATES_MEMORY crypto: qat - add support for 401xx devices crypto: qat - re-enable registration of algorithms crypto: qat - honor CRYPTO_TFM_REQ_MAY_SLEEP flag crypto: qat - add param check for DH crypto: qat - add param check for RSA crypto: qat - remove dma_free_coherent() for DH crypto: qat - remove dma_free_coherent() for RSA crypto: qat - fix memory leak in RSA crypto: qat - add backlog mechanism crypto: qat - refactor submission logic crypto: qat - use pre-allocated buffers in datapath crypto: qat - set to zero DH parameters before free crypto: s390 - add crypto library interface for ChaCha20 crypto: talitos - Uniform coding style with defined variable crypto: octeontx2 - simplify the return expression of otx2_cpt_aead_cbc_aes_sha_setkey() crypto: cryptd - Protect per-CPU resource by disabling BH. crypto: sun8i-ce - do not fallback if cryptlen is less than sg length crypto: sun8i-ce - rework debugging ...
2022-05-27Merge tag 'libnvdimm-for-5.19' of ↵Linus Torvalds3-14/+11
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm and DAX updates from Dan Williams: "New support for clearing memory errors when a file is in DAX mode, alongside with some other fixes and cleanups. Previously it was only possible to clear these errors using a truncate or hole-punch operation to trigger the filesystem to reallocate the block, now, any page aligned write can opportunistically clear errors as well. This change spans x86/mm, nvdimm, and fs/dax, and has received the appropriate sign-offs. Thanks to Jane for her work on this. Summary: - Add support for clearing memory error via pwrite(2) on DAX - Fix 'security overwrite' support in the presence of media errors - Miscellaneous cleanups and fixes for nfit_test (nvdimm unit tests)" * tag 'libnvdimm-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: pmem: implement pmem_recovery_write() pmem: refactor pmem_clear_poison() dax: add .recovery_write dax_operation dax: introduce DAX_RECOVERY_WRITE dax access mode mce: fix set_mce_nospec to always unmap the whole page x86/mce: relocate set{clear}_mce_nospec() functions acpi/nfit: rely on mce->misc to determine poison granularity testing: nvdimm: asm/mce.h is not needed in nfit.c testing: nvdimm: iomap: make __nfit_test_ioremap a macro nvdimm: Allow overwrite in the presence of disabled dimms tools/testing/nvdimm: remove unneeded flush_workqueue
2022-05-27Merge tag 'mm-stable-2022-05-27' of ↵Linus Torvalds2-185/+151
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - Two follow-on fixes for the post-5.19 series "Use pageblock_order for cma and alloc_contig_range alignment", from Zi Yan. - A series of z3fold cleanups and fixes from Miaohe Lin. - Some memcg selftests work from Michal Koutný <mkoutny@suse.com> - Some swap fixes and cleanups from Miaohe Lin - Several individual minor fixups * tag 'mm-stable-2022-05-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits) mm/shmem.c: suppress shift warning mm: Kconfig: reorganize misplaced mm options mm: kasan: fix input of vmalloc_to_page() mm: fix is_pinnable_page against a cma page mm: filter out swapin error entry in shmem mapping mm/shmem: fix infinite loop when swap in shmem error at swapoff time mm/madvise: free hwpoison and swapin error entry in madvise_free_pte_range mm/swapfile: fix lost swap bits in unuse_pte() mm/swapfile: unuse_pte can map random data if swap read fails selftests: memcg: factor out common parts of memory.{low,min} tests selftests: memcg: remove protection from top level memcg selftests: memcg: adjust expected reclaim values of protected cgroups selftests: memcg: expect no low events in unprotected sibling selftests: memcg: fix compilation mm/z3fold: fix z3fold_page_migrate races with z3fold_map mm/z3fold: fix z3fold_reclaim_page races with z3fold_free mm/z3fold: always clear PAGE_CLAIMED under z3fold page lock mm/z3fold: put z3fold page back into unbuddied list when reclaim or migration fails revert "mm/z3fold.c: allow __GFP_HIGHMEM in z3fold_alloc" mm/z3fold: throw warning on failure of trylock_page in z3fold_alloc ...
2022-05-27Merge tag 'mm-nonmm-stable-2022-05-26' of ↵Linus Torvalds3-1/+419
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc updates from Andrew Morton: "The non-MM patch queue for this merge window. Not a lot of material this cycle. Many singleton patches against various subsystems. Most notably some maintenance work in ocfs2 and initramfs" * tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (65 commits) kcov: update pos before writing pc in trace function ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock ocfs2: dlmfs: don't clear USER_LOCK_ATTACHED when destroying lock fs/ntfs: remove redundant variable idx fat: remove time truncations in vfat_create/vfat_mkdir fat: report creation time in statx fat: ignore ctime updates, and keep ctime identical to mtime in memory fat: split fat_truncate_time() into separate functions MAINTAINERS: add Muchun as a memcg reviewer proc/sysctl: make protected_* world readable ia64: mca: drop redundant spinlock initialization tty: fix deadlock caused by calling printk() under tty_port->lock relay: remove redundant assignment to pointer buf fs/ntfs3: validate BOOT sectors_per_clusters lib/string_helpers: fix not adding strarray to device's resource list kernel/crash_core.c: remove redundant check of ck_cmdline ELF, uapi: fixup ELF_ST_TYPE definition ipc/mqueue: use get_tree_nodev() in mqueue_get_tree() ipc: update semtimedop() to use hrtimer ipc/sem: remove redundant assignments ...
2022-05-27selftests: memcg: factor out common parts of memory.{low,min} testsMichal Koutný1-163/+36
The memory protection test setup and runtime is almost equal for memory.low and memory.min cases. It makes modification of the common parts prone to mistakes, since the protections are similar not only in setup but also in principle, factor the common part out. Past exceptions between the tests: - missing memory.min is fine (kept), - test_memcg_low protected orphaned pagecache (adapted like test_memcg_min and we keep the processes of protected memory running). The evaluation in two tests is different (OOM of allocator vs low events of protégés), this is kept different. Link: https://lkml.kernel.org/r/20220518161859.21565-6-mkoutny@suse.com Signed-off-by: Michal Koutný <mkoutny@suse.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> CC: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Richard Palethorpe <rpalethorpe@suse.de> Cc: David Vernet <void@manifault.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-27selftests: memcg: remove protection from top level memcgMichal Koutný1-7/+3
The reclaim is triggered by memory limit in a subtree, therefore the testcase does not need configured protection against external reclaim. Also, correct respective comments. Link: https://lkml.kernel.org/r/20220518161859.21565-5-mkoutny@suse.com Signed-off-by: Michal Koutný <mkoutny@suse.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: David Vernet <void@manifault.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Richard Palethorpe <rpalethorpe@suse.de> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-27selftests: memcg: adjust expected reclaim values of protected cgroupsMichal Koutný2-12/+106
The numbers are not easy to derive in a closed form (certainly mere protections ratios do not apply), therefore use a simulation to obtain expected numbers. Link: https://lkml.kernel.org/r/20220518161859.21565-4-mkoutny@suse.com Signed-off-by: Michal Koutný <mkoutny@suse.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: David Vernet <void@manifault.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Richard Palethorpe <rpalethorpe@suse.de> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-27selftests: memcg: expect no low events in unprotected siblingMichal Koutný1-1/+1
This is effectively a revert of commit cdc69458a5f3 ("cgroup: account for memory_recursiveprot in test_memcg_low()"). The case test_memcg_low will fail with memory_recursiveprot until resolved in reclaim code. However, this patch preserves the existing helpers and variables for later uses. Link: https://lkml.kernel.org/r/20220518161859.21565-3-mkoutny@suse.com Signed-off-by: Michal Koutný <mkoutny@suse.com> Reviewed-by: David Vernet <void@manifault.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Richard Palethorpe <rpalethorpe@suse.de> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-27selftests: memcg: fix compilationMichal Koutný1-11/+14
Patch series "memcontrol selftests fixups", v2. Flushing the patches to make memcontrol selftests check the events behavior we had consensus about (test_memcg_low fails). (test_memcg_reclaim, test_memcg_swap_max fail for me now but it's present even before the refactoring.) The two bigger changes are: - adjustment of the protected values to make tests succeed with the given tolerance, - both test_memcg_low and test_memcg_min check protection of memory in populated cgroups (actually as per Documentation/admin-guide/cgroup-v2.rst memory.min should not apply to empty cgroups, which is not the case currently. Therefore I unified tests with the populated case in order to to bring more broken tests). This patch (of 5): This fixes mis-applied changes from commit 72b1e03aa725 ("cgroup: account for memory_localevents in test_memcg_oom_group_leaf_events()"). Link: https://lkml.kernel.org/r/20220518161859.21565-1-mkoutny@suse.com Link: https://lkml.kernel.org/r/20220518161859.21565-2-mkoutny@suse.com Signed-off-by: Michal Koutný <mkoutny@suse.com> Reviewed-by: David Vernet <void@manifault.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Richard Palethorpe <rpalethorpe@suse.de> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-27tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo1-0/+19
To pick up the changes in: db1af12929c99d15 ("x86/msr-index: Define INTEGRITY_CAPABILITIES MSR") 089be16d5992dd0b ("x86/msr: Add PerfCntrGlobal* registers") f52ba93190457aa2 ("tools/power turbostat: Add Power Limit4 support") Addressing these tools/perf build warnings: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' That makes the beautification scripts to pick some new entries: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after --- before 2022-05-26 12:50:01.228612839 -0300 +++ after 2022-05-26 12:50:07.699776166 -0300 @@ -116,6 +116,7 @@ [0x0000026f] = "MTRRfix4K_F8000", [0x00000277] = "IA32_CR_PAT", [0x00000280] = "IA32_MC0_CTL2", + [0x000002d9] = "INTEGRITY_CAPS", [0x000002ff] = "MTRRdefType", [0x00000309] = "CORE_PERF_FIXED_CTR0", [0x0000030a] = "CORE_PERF_FIXED_CTR1", @@ -176,6 +177,7 @@ [0x00000586] = "IA32_RTIT_ADDR3_A", [0x00000587] = "IA32_RTIT_ADDR3_B", [0x00000600] = "IA32_DS_AREA", + [0x00000601] = "VR_CURRENT_CONFIG", [0x00000606] = "RAPL_POWER_UNIT", [0x0000060a] = "PKGC3_IRTL", [0x0000060b] = "PKGC6_IRTL", @@ -260,6 +262,10 @@ [0xc0000102 - x86_64_specific_MSRs_offset] = "KERNEL_GS_BASE", [0xc0000103 - x86_64_specific_MSRs_offset] = "TSC_AUX", [0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO", + [0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG", + [0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS", + [0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL", + [0xc0000302 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS_CLR", }; #define x86_AMD_V_KVM_MSRs_offset 0xc0010000 @@ -318,4 +324,5 @@ [0xc00102b4 - x86_AMD_V_KVM_MSRs_offset] = "AMD_CPPC_STATUS", [0xc00102f0 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN_CTL", [0xc00102f1 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN", + [0xc0010300 - x86_AMD_V_KVM_MSRs_offset] = "AMD_SAMP_BR_FROM", }; $ Now one can trace systemwide asking to see backtraces to where those MSRs are being read/written, see this example with a previous update: # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB" ^C# If we use -v (verbose mode) we can see what it does behind the scenes: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB" Using CPUID AuthenticAMD-25-21-0 0x6a0 0x6a8 New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313) 0x6a0 0x6a8 New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313) mmap size 528384B ^C# Example with a frequent msr: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2 Using CPUID AuthenticAMD-25-21-0 0x48 New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) 0x48 New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) mmap size 528384B Looking at the vmlinux_path (8 entries long) symsrc__init: build id mismatch for vmlinux. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols 0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule ([kernel.kallsyms]) futex_wait_queue_me ([kernel.kallsyms]) futex_wait ([kernel.kallsyms]) do_futex ([kernel.kallsyms]) __x64_sys_futex ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so) 0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule_idle ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) cpu_startup_entry ([kernel.kallsyms]) secondary_startup_64_no_verify ([kernel.kallsyms]) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Cc: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/lkml/Yo+i%252Fj5+UtE9dcix@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-27perf scripts python: Support Arm CoreSight trace data disassemblyLeo Yan1-0/+272
This commit adds python script to parse CoreSight tracing event and print out source line and disassembly, it generates readable program execution flow for easier humans inspecting. The script receives CoreSight tracing packet with below format: +------------+------------+------------+ packet(n): | addr | ip | cpu | +------------+------------+------------+ packet(n+1): | addr | ip | cpu | +------------+------------+------------+ packet::addr presents the start address of the coming branch sample, and packet::ip is the last address of the branch smple. Therefore, a code section between branches starts from packet(n)::addr and it stops at packet(n+1)::ip. As results we combines the two continuous packets to generate the address range for instructions: [ sample(n)::addr .. sample(n+1)::ip ] The script supports both objdump or llvm-objdump for disassembly with specifying option '-d'. If doesn't specify option '-d', the script simply outputs source lines and symbols. Below shows usages with llvm-objdump or objdump to output disassembly. # perf script -s scripts/python/arm-cs-trace-disasm.py -- -d llvm-objdump-11 -k ./vmlinux ARM CoreSight Trace Data Assembler Dump ffff800008eb3198 <etm4_enable_hw>: ffff800008eb3310: c0 38 00 35 cbnz w0, 0xffff800008eb3a28 <etm4_enable_hw+0x890> ffff800008eb3314: 9f 3f 03 d5 dsb sy ffff800008eb3318: df 3f 03 d5 isb ffff800008eb331c: f5 5b 42 a9 ldp x21, x22, [sp, #32] ffff800008eb3320: fb 73 45 a9 ldp x27, x28, [sp, #80] ffff800008eb3324: e0 82 40 39 ldrb w0, [x23, #32] ffff800008eb3328: 60 00 00 34 cbz w0, 0xffff800008eb3334 <etm4_enable_hw+0x19c> ffff800008eb332c: e0 03 19 aa mov x0, x25 ffff800008eb3330: 8c fe ff 97 bl 0xffff800008eb2d60 <etm4_cs_lock.isra.0.part.0> main 6728/6728 [0004] 0.000000000 etm4_enable_hw+0x198 [kernel.kallsyms] ffff800008eb2d60 <etm4_cs_lock.isra.0.part.0>: ffff800008eb2d60: 1f 20 03 d5 nop ffff800008eb2d64: 1f 20 03 d5 nop ffff800008eb2d68: 3f 23 03 d5 hint #25 ffff800008eb2d6c: 00 00 40 f9 ldr x0, [x0] ffff800008eb2d70: 9f 3f 03 d5 dsb sy ffff800008eb2d74: 00 c0 3e 91 add x0, x0, #4016 ffff800008eb2d78: 1f 00 00 b9 str wzr, [x0] ffff800008eb2d7c: bf 23 03 d5 hint #29 ffff800008eb2d80: c0 03 5f d6 ret main 6728/6728 [0004] 0.000000000 etm4_cs_lock.isra.0.part.0+0x20 # perf script -s scripts/python/arm-cs-trace-disasm.py -- -d objdump -k ./vmlinux ARM CoreSight Trace Data Assembler Dump ffff800008eb3310 <etm4_enable_hw+0x178>: ffff800008eb3310: 350038c0 cbnz w0, ffff800008eb3a28 <etm4_enable_hw+0x890> ffff800008eb3314: d5033f9f dsb sy ffff800008eb3318: d5033fdf isb ffff800008eb331c: a9425bf5 ldp x21, x22, [sp, #32] ffff800008eb3320: a94573fb ldp x27, x28, [sp, #80] ffff800008eb3324: 394082e0 ldrb w0, [x23, #32] ffff800008eb3328: 34000060 cbz w0, ffff800008eb3334 <etm4_enable_hw+0x19c> ffff800008eb332c: aa1903e0 mov x0, x25 ffff800008eb3330: 97fffe8c bl ffff800008eb2d60 <etm4_cs_lock.isra.0.part.0> main 6728/6728 [0004] 0.000000000 etm4_enable_hw+0x198 [kernel.kallsyms] ffff800008eb2d60 <etm4_cs_lock.isra.0.part.0>: ffff800008eb2d60: d503201f nop ffff800008eb2d64: d503201f nop ffff800008eb2d68: d503233f paciasp ffff800008eb2d6c: f9400000 ldr x0, [x0] ffff800008eb2d70: d5033f9f dsb sy ffff800008eb2d74: 913ec000 add x0, x0, #0xfb0 ffff800008eb2d78: b900001f str wzr, [x0] ffff800008eb2d7c: d50323bf autiasp ffff800008eb2d80: d65f03c0 ret main 6728/6728 [0004] 0.000000000 etm4_cs_lock.isra.0.part.0+0x20 Signed-off-by: Leo Yan <leo.yan@linaro.org> Co-authored-by: Al Grant <al.grant@arm.com> Co-authored-by: Mathieu Poirier <mathieu.poirier@linaro.org> Co-authored-by: Tor Jeremiassen <tor@ti.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eelco Chaudron <echaudro@redhat.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Tanmay Jagdale <tanmay@marvell.com> Cc: coresight@lists.linaro.org Cc: zengshun . wu <zengshun.wu@outlook.com> Link: https://lore.kernel.org/r/20220521130446.4163597-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-27perf scripting python: Expose dso and map informationLeo Yan1-4/+17
This change adds dso build_id and corresponding map's start and end address. The info of dso build_id can be used to find dso file path, and we can validate if a branch address falls into the range of map's start and end addresses. In addition, the map's start address can be used as an offset for disassembly. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Grant <al.grant@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eelco Chaudron <echaudro@redhat.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Tanmay Jagdale <tanmay@marvell.com> Cc: coresight@lists.linaro.org Cc: zengshun . wu <zengshun.wu@outlook.com> Link: https://lore.kernel.org/r/20220521130446.4163597-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-27perf jevents: Fix event syntax error caused by ExtSelZhengjun Xing1-1/+1
In the origin code, when "ExtSel" is 1, the eventcode will change to "eventcode |= 1 << 21”. For event “UNC_Q_RxL_CREDITS_CONSUMED_VN0.DRS", its "ExtSel" is "1", its eventcode will change from 0x1E to 0x20001E, but in fact the eventcode should <=0x1FF, so this will cause the parse fail: # perf stat -e "UNC_Q_RxL_CREDITS_CONSUMED_VN0.DRS" -a sleep 0.1 event syntax error: '.._RxL_CREDITS_CONSUMED_VN0.DRS' \___ value too big for format, maximum is 511 On the perf kernel side, the kernel assumes the valid bits are continuous. It will adjust the 0x100 (bit 8 for perf tool) to bit 21 in HW. DEFINE_UNCORE_FORMAT_ATTR(event_ext, event, "config:0-7,21"); So the perf tool follows the kernel side and just set bit8 other than bit21. Fixes: fedb2b518239cbc0 ("perf jevents: Add support for parsing uncore json files") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220525140410.1706851-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-27perf tools arm64: Add support for VG registerJames Clark2-0/+40
Add the name of the VG register so it can be used in --user-regs The event will fail to open if the register is requested but not available so only add it to the mask if the kernel supports sve and also if it supports that specific register. Committer notes: Add conditional definition of HWCAP_SVE, as suggested by Leo Yan, to build on older systems where this is not available in the system headers. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Brown <broonie@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220525154114.718321-6-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-27selftests/bpf: fix stacktrace_build_id with missing kprobe/urandom_readSong Liu1-1/+1
Kernel function urandom_read is replaced with urandom_read_iter. Therefore, kprobe on urandom_read is not working any more: [root@eth50-1 bpf]# ./test_progs -n 161 test_stacktrace_build_id:PASS:skel_open_and_load 0 nsec libbpf: kprobe perf_event_open() failed: No such file or directory libbpf: prog 'oncpu': failed to create kprobe 'urandom_read+0x0' \ perf event: No such file or directory libbpf: prog 'oncpu': failed to auto-attach: -2 test_stacktrace_build_id:FAIL:attach_tp err -2 161 stacktrace_build_id:FAIL Fix this by replacing urandom_read with urandom_read_iter in the test. Fixes: 1b388e7765f2 ("random: convert to using fops->read_iter()") Reported-by: Mykola Lysenko <mykolal@fb.com> Signed-off-by: Song Liu <song@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20220526191608.2364049-1-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds23-174/+1840
Pull kvm updates from Paolo Bonzini: "S390: - ultravisor communication device driver - fix TEID on terminating storage key ops RISC-V: - Added Sv57x4 support for G-stage page table - Added range based local HFENCE functions - Added remote HFENCE functions based on VCPU requests - Added ISA extension registers in ONE_REG interface - Updated KVM RISC-V maintainers entry to cover selftests support ARM: - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes x86: - New ioctls to get/set TSC frequency for a whole VM - Allow userspace to opt out of hypercall patching - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES - V_TSC_AUX support Nested virtualization improvements for AMD: - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) - Allow AVIC to co-exist with a nested guest running - Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support - PAUSE filtering for nested hypervisors Guest support: - Decoupling of vcpu_is_preempted from PV spinlocks" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits) KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest KVM: selftests: x86: Sync the new name of the test case to .gitignore Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND x86, kvm: use correct GFP flags for preemption disabled KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer x86/kvm: Alloc dummy async #PF token outside of raw spinlock KVM: x86: avoid calling x86 emulator without a decoded instruction KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) s390/uv_uapi: depend on CONFIG_S390 KVM: selftests: x86: Fix test failure on arch lbr capable platforms KVM: LAPIC: Trace LAPIC timer expiration on every vmentry KVM: s390: selftest: Test suppression indication on key prot exception KVM: s390: Don't indicate suppression on dirtying, failing memop selftests: drivers/s390x: Add uvdevice tests drivers/s390/char: Add Ultravisor io device MAINTAINERS: Update KVM RISC-V entry to cover selftests support RISC-V: KVM: Introduce ISA extension register RISC-V: KVM: Cleanup stale TLB entries when host CPU changes RISC-V: KVM: Add remote HFENCE functions based on VCPU requests ...
2022-05-26Merge tag 'mm-stable-2022-05-25' of ↵Linus Torvalds23-685/+1333
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Almost all of MM here. A few things are still getting finished off, reviewed, etc. - Yang Shi has improved the behaviour of khugepaged collapsing of readonly file-backed transparent hugepages. - Johannes Weiner has arranged for zswap memory use to be tracked and managed on a per-cgroup basis. - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime enablement of the recent huge page vmemmap optimization feature. - Baolin Wang contributes a series to fix some issues around hugetlb pagetable invalidation. - Zhenwei Pi has fixed some interactions between hwpoisoned pages and virtualization. - Tong Tiangen has enabled the use of the presently x86-only page_table_check debugging feature on arm64 and riscv. - David Vernet has done some fixup work on the memcg selftests. - Peter Xu has taught userfaultfd to handle write protection faults against shmem- and hugetlbfs-backed files. - More DAMON development from SeongJae Park - adding online tuning of the feature and support for monitoring of fixed virtual address ranges. Also easier discovery of which monitoring operations are available. - Nadav Amit has done some optimization of TLB flushing during mprotect(). - Neil Brown continues to labor away at improving our swap-over-NFS support. - David Hildenbrand has some fixes to anon page COWing versus get_user_pages(). - Peng Liu fixed some errors in the core hugetlb code. - Joao Martins has reduced the amount of memory consumed by device-dax's compound devmaps. - Some cleanups of the arch-specific pagemap code from Anshuman Khandual. - Muchun Song has found and fixed some errors in the TLB flushing of transparent hugepages. - Roman Gushchin has done more work on the memcg selftests. ... and, of course, many smaller fixes and cleanups. Notably, the customary million cleanup serieses from Miaohe Lin" * tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits) mm: kfence: use PAGE_ALIGNED helper selftests: vm: add the "settings" file with timeout variable selftests: vm: add "test_hmm.sh" to TEST_FILES selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests selftests: vm: add migration to the .gitignore selftests/vm/pkeys: fix typo in comment ksm: fix typo in comment selftests: vm: add process_mrelease tests Revert "mm/vmscan: never demote for memcg reclaim" mm/kfence: print disabling or re-enabling message include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace" include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion" mm: fix a potential infinite loop in start_isolate_page_range() MAINTAINERS: add Muchun as co-maintainer for HugeTLB zram: fix Kconfig dependency warning mm/shmem: fix shmem folio swapoff hang cgroup: fix an error handling path in alloc_pagecache_max_30M() mm: damon: use HPAGE_PMD_SIZE tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate nodemask.h: fix compilation error with GCC12 ...