linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2022-08-10	bpf: Disallow bpf programs call prog_run command.	Alexei Starovoitov	1	-2/+2
	The verifier cannot perform sufficient validation of bpf_attr->test.ctx_in pointer, therefore bpf programs should not be allowed to call BPF_PROG_RUN command from within the program. To fix this issue split bpf_sys_bpf() bpf helper into normal kern_sys_bpf() kernel function that can only be used by the kernel light skeleton directly. Reported-by: YiFei Zhu <zhuyifei@google.com> Fixes: b1d18a7574d0 ("bpf: Extend sys_bpf commands for bpf_syscall programs.") Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-29	libbpf: Add bpf_obj_get_opts()	Joe Burton	3	-0/+21
	Add an extensible variant of bpf_obj_get() capable of setting the `file_flags` parameter. This parameter is needed to enable unprivileged access to BPF maps. Without a method like this, users must manually make the syscall. Signed-off-by: Joe Burton <jevburton@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220729202727.3311806-1-jevburton.kernel@gmail.com
2022-07-28	libbpf: Support PPC in arch_specific_syscall_pfx	Daniel Müller	1	-1/+10
	Commit 708ac5bea0ce ("libbpf: add ksyscall/kretsyscall sections support for syscall kprobes") added the arch_specific_syscall_pfx() function, which returns a string representing the architecture in use. As it turns out this function is currently not aware of Power PC, where NULL is returned. That's being flagged by the libbpf CI system, which builds for ppc64le and the compiler sees a NULL pointer being passed in to a %s format string. With this change we add representations for two more architectures, for Power PC and Power PC 64, and also adjust the string format logic to handle NULL pointers gracefully, in an attempt to prevent similar issues with other architectures in the future. Fixes: 708ac5bea0ce ("libbpf: add ksyscall/kretsyscall sections support for syscall kprobes") Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220728222345.3125975-1-deso@posteo.net
2022-07-26	libbpf: Extend BPF_KSYSCALL documentation	Ilya Leoshkevich	1	-4/+11
	Explicitly list known quirks. Mention that socket-related syscalls can be invoked via socketcall(). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20220726134008.256968-2-iii@linux.ibm.com
2022-07-21	libbpf: Fix str_has_sfx()'s return value	Dan Carpenter	1	-3/+3
	The return from strcmp() is inverted so it wrongly returns true instead of false and vice versa. Fixes: a1c9d61b19cb ("libbpf: Improve library identification for uprobe binary path resolution") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/YtZ+/dAA195d99ak@kili
2022-07-21	libbpf: Fix sign expansion bug in btf_dump_get_enum_value()	Dan Carpenter	1	-1/+1
	The code here is supposed to take a signed int and store it in a signed long long. Unfortunately, the way that the type promotion works with this conditional statement is that it takes a signed int, type promotes it to a __u32, and then stores that as a signed long long. The result is never negative. This is from static analysis, but I made a little test program just to test it before I sent the patch: #include <stdio.h> int main(void) { unsigned long long src = -1ULL; signed long long dst1, dst2; int is_signed = 1; dst1 = is_signed ? (int )&src : (unsigned int )0; dst2 = is_signed ? (signed long long)(int )&src : (unsigned int )0; printf("%lld\n", dst1); printf("%lld\n", dst2); return 0; } Fixes: d90ec262b35b ("libbpf: Add enum64 support for btf_dump") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/YtZ+LpgPADm7BeEd@kili
2022-07-19	libbpf: fix an snprintf() overflow check	Dan Carpenter	1	-1/+1
	The snprintf() function returns the number of bytes it would have copied if there were enough space. So it can return > the sizeof(gen->attach_target). Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/YtZ+oAySqIhFl6/J@kili Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19	libbpf: make RINGBUF map size adjustments more eagerly	Andrii Nakryiko	1	-35/+42
	Make libbpf adjust RINGBUF map size (rounding it up to closest power-of-2 of page_size) more eagerly: during open phase when initializing the map and on explicit calls to bpf_map__set_max_entries(). Such approach allows user to check actual size of BPF ringbuf even before it's created in the kernel, but also it prevents various edge case scenarios where BPF ringbuf size can get out of sync with what it would be in kernel. One of them (reported in [0]) is during an attempt to pin/reuse BPF ringbuf. Move adjust_ringbuf_sz() helper closer to its first actual use. The implementation of the helper is unchanged. Also make detection of whether bpf_object is already loaded more robust by checking obj->loaded explicitly, given that map->fd can be < 0 even if bpf_object is already loaded due to ability to disable map creation with bpf_map__set_autocreate(map, false). [0] Closes: https://github.com/libbpf/libbpf/pull/530 Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220715230952.2219271-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19	libbpf: fallback to tracefs mount point if debugfs is not mounted	Andrii Nakryiko	1	-21/+40
	Teach libbpf to fallback to tracefs mount point (/sys/kernel/tracing) if debugfs (/sys/kernel/debug/tracing) isn't mounted. Acked-by: Yonghong Song <yhs@fb.com> Suggested-by: Connor O'Brien <connoro@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220715185736.898848-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19	libbpf: add ksyscall/kretsyscall sections support for syscall kprobes	Andrii Nakryiko	4	-9/+157
	Add SEC("ksyscall")/SEC("ksyscall/<syscall_name>") and corresponding kretsyscall variants (for return kprobes) to allow users to kprobe syscall functions in kernel. These special sections allow to ignore complexities and differences between kernel versions and host architectures when it comes to syscall wrapper and corresponding __<arch>_sys_<syscall> vs __se_sys_<syscall> differences, depending on whether host kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER (though libbpf itself doesn't rely on /proc/config.gz for detecting this, see BPF_KSYSCALL patch for how it's done internally). Combined with the use of BPF_KSYSCALL() macro, this allows to just specify intended syscall name and expected input arguments and leave dealing with all the variations to libbpf. In addition to SEC("ksyscall+") and SEC("kretsyscall+") add bpf_program__attach_ksyscall() API which allows to specify syscall name at runtime and provide associated BPF cookie value. At the moment SEC("ksyscall") and bpf_program__attach_ksyscall() do not handle all the calling convention quirks for mmap(), clone() and compat syscalls. It also only attaches to "native" syscall interfaces. If host system supports compat syscalls or defines 32-bit syscalls in 64-bit kernel, such syscall interfaces won't be attached to by libbpf. These limitations may or may not change in the future. Therefore it is recommended to use SEC("kprobe") for these syscalls or if working with compat and 32-bit interfaces is required. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19	libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL	Andrii Nakryiko	2	-13/+40
	Improve BPF_KPROBE_SYSCALL (and rename it to shorter BPF_KSYSCALL to match libbpf's SEC("ksyscall") section name, added in next patch) to use __kconfig variable to determine how to properly fetch syscall arguments. Instead of relying on hard-coded knowledge of whether kernel's architecture uses syscall wrapper or not (which only reflects the latest kernel versions, but is not necessarily true for older kernels and won't necessarily hold for later kernel versions on some particular host architecture), determine this at runtime by attempting to create perf_event (with fallback to kprobe event creation through tracefs on legacy kernels, just like kprobe attachment code is doing) for kernel function that would correspond to bpf() syscall on a system that has CONFIG_ARCH_HAS_SYSCALL_WRAPPER set (e.g., for x86-64 it would try '__x64_sys_bpf'). If host kernel uses syscall wrapper, syscall kernel function's first argument is a pointer to struct pt_regs that then contains syscall arguments. In such case we need to use bpf_probe_read_kernel() to fetch actual arguments (which we do through BPF_CORE_READ() macro) from inner pt_regs. But if the kernel doesn't use syscall wrapper approach, input arguments can be read from struct pt_regs directly with no probe reading. All this feature detection is done without requiring /proc/config.gz existence and parsing, and BPF-side helper code uses newly added LINUX_HAS_SYSCALL_WRAPPER virtual __kconfig extern to keep in sync with user-side feature detection of libbpf. BPF_KSYSCALL() macro can be used both with SEC("kprobe") programs that define syscall function explicitly (e.g., SEC("kprobe/__x64_sys_bpf")) and SEC("ksyscall") program added in the next patch (which are the same kprobe program with added benefit of libbpf determining correct kernel function name automatically). Kretprobe and kretsyscall (added in next patch) programs don't need BPF_KSYSCALL as they don't provide access to input arguments. Normal BPF_KRETPROBE is completely sufficient and is recommended. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19	libbpf: generalize virtual __kconfig externs and use it for USDT	Andrii Nakryiko	2	-45/+66
	Libbpf supports single virtual __kconfig extern currently: LINUX_KERNEL_VERSION. LINUX_KERNEL_VERSION isn't coming from /proc/kconfig.gz and is intead customly filled out by libbpf. This patch generalizes this approach to support more such virtual __kconfig externs. One such extern added in this patch is LINUX_HAS_BPF_COOKIE which is used for BPF-side USDT supporting code in usdt.bpf.h instead of using CO-RE-based enum detection approach for detecting bpf_get_attach_cookie() BPF helper. This allows to remove otherwise not needed CO-RE dependency and keeps user-space and BPF-side parts of libbpf's USDT support strictly in sync in terms of their feature detection. We'll use similar approach for syscall wrapper detection for BPF_KSYSCALL() BPF-side macro in follow up patch. Generally, currently libbpf reserves CONFIG_ prefix for Kconfig values and LINUX_ for virtual libbpf-backed externs. In the future we might extend the set of prefixes that are supported. This can be done without any breaking changes, as currently any __kconfig extern with unrecognized name is rejected. For LINUX_xxx externs we support the normal "weak rule": if libbpf doesn't recognize given LINUX_xxx extern but such extern is marked as __weak, it is not rejected and defaults to zero. This follows CONFIG_xxx handling logic and will allow BPF applications to opportunistically use newer libbpf virtual externs without breaking on older libbpf versions unnecessarily. Tested-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-15	libbpf: perfbuf: Add API to get the ring buffer	Jon Doron	3	-0/+33
	Add support for writing a custom event reader, by exposing the ring buffer. With the new API perf_buffer__buffer() you will get access to the raw mmaped()'ed per-cpu underlying memory of the ring buffer. This region contains both the perf buffer data and header (struct perf_event_mmap_page), which manages the ring buffer state (head/tail positions, when accessing the head/tail position it's important to take into consideration SMP). With this type of low level access one can implement different types of consumers here are few simple examples where this API helps with: 1. perf_event_read_simple is allocating using malloc, perhaps you want to handle the wrap-around in some other way. 2. Since perf buf is per-cpu then the order of the events is not guarnteed, for example: Given 3 events where each event has a timestamp t0 < t1 < t2, and the events are spread on more than 1 CPU, then we can end up with the following state in the ring buf: CPU[0] => [t0, t2] CPU[1] => [t1] When you consume the events from CPU[0], you could know there is a t1 missing, (assuming there are no drops, and your event data contains a sequential index). So now one can simply do the following, for CPU[0], you can store the address of t0 and t2 in an array (without moving the tail, so there data is not perished) then move on the CPU[1] and set the address of t1 in the same array. So you end up with something like: void **arr[] = [&t0, &t1, &t2], now you can consume it orderely and move the tails as you process in order. 3. Assuming there are multiple CPUs and we want to start draining the messages from them, then we can "pick" with which one to start with according to the remaining free space in the ring buffer. Signed-off-by: Jon Doron <jond@wiz.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220715181122.149224-1-arilou@gmail.com
2022-07-13	libbpf: Fix the name of a reused map	Anquan Wu	1	-2/+7
	BPF map name is limited to BPF_OBJ_NAME_LEN. A map name is defined as being longer than BPF_OBJ_NAME_LEN, it will be truncated to BPF_OBJ_NAME_LEN when a userspace program calls libbpf to create the map. A pinned map also generates a path in the /sys. If the previous program wanted to reuse the map， it can not get bpf_map by name, because the name of the map is only partially the same as the name which get from pinned path. The syscall information below show that map name "process_pinned_map" is truncated to "process_pinned_". bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/process_pinned_map", bpf_fd=0, file_flags=0}, 144) = -1 ENOENT (No such file or directory) bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4,max_entries=1024, map_flags=0, inner_map_fd=0, map_name="process_pinned_",map_ifindex=0, btf_fd=3, btf_key_type_id=6, btf_value_type_id=10,btf_vmlinux_value_type_id=0}, 72) = 4 This patch check that if the name of pinned map are the same as the actual name for the first (BPF_OBJ_NAME_LEN - 1), bpf map still uses the name which is included in bpf object. Fixes: 26736eb9a483 ("tools: libbpf: allow map reuse") Signed-off-by: Anquan Wu <leiqi96@hotmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/OSZP286MB1725CEA1C95C5CB8E7CCC53FB8869@OSZP286MB1725.JPNP286.PROD.OUTLOOK.COM
2022-07-13	libbpf: Error out when binary_path is NULL for uprobe and USDT	Hengqi Chen	1	-6/+7
	binary_path is a required non-null parameter for bpf_program__attach_usdt and bpf_program__attach_uprobe_opts. Check it against NULL to prevent coredump on strchr. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220712025745.2703995-1-hengqi.chen@gmail.com
2022-07-08	bpf: Correctly propagate errors up from bpf_core_composites_match	Daniel Müller	1	-1/+8
	This change addresses a comment made earlier [0] about a missing return of an error when __bpf_core_types_match is invoked from bpf_core_composites_match, which could have let to us erroneously ignoring errors. Regarding the typedef name check pointed out in the same context, it is not actually an issue, because callers of the function perform a name check for the root type anyway. To make that more obvious, let's add comments to the function (similar to what we have for bpf_core_types_are_compat, which is called in pretty much the same context). [0]: https://lore.kernel.org/bpf/165708121449.4919.13204634393477172905.git-patchwork-notify@kernel.org/T/#m55141e8f8cfd2e8d97e65328fa04852870d01af6 Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220707211931.3415440-1-deso@posteo.net
2022-07-08	libbpf: Disable SEC pragma macro on GCC	James Hilliard	1	-0/+13
	It seems the gcc preprocessor breaks with pragmas when surrounding __attribute__. Disable these pragmas on GCC due to upstream bugs see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55578 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90400 Fixes errors like: error: expected identifier or '(' before '#pragma' 106 \| SEC("cgroup/bind6") \| ^~~ error: expected '=', ',', ';', 'asm' or '__attribute__' before '#pragma' 114 \| char _license[] SEC("license") = "GPL"; \| ^~~ Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220706111839.1247911-1-james.hilliard1@gmail.com
2022-07-07	libbpf, riscv: Use a0 for RC register	Yixun Lan	1	-1/+1
	According to the RISC-V calling convention register usage here [0], a0 is used as return value register, so rename it to make it consistent with the spec. [0] section 18.2, table 18.2 https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf Fixes: 589fed479ba1 ("riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h") Signed-off-by: Yixun Lan <dlan@gentoo.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn@kernel.org> Acked-by: Amjad OULED-AMEUR <ouledameur.amjad@gmail.com> Link: https://lore.kernel.org/bpf/20220706140204.47926-1-dlan@gentoo.org
2022-07-06	libbpf: Remove unnecessary usdt_rel_ip assignments	Andrii Nakryiko	1	-4/+2
	Coverity detected that usdt_rel_ip is unconditionally overwritten anyways, so there is no need to unnecessarily initialize it with unused value. Clean this up. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20220705224818.4026623-4-andrii@kernel.org
2022-07-05	libbpf: Cleanup the legacy uprobe_event on failed add/attach_event()	Chuang Wang	1	-5/+14
	A potential scenario, when an error is returned after add_uprobe_event_legacy() in perf_event_uprobe_open_legacy(), or bpf_program__attach_perf_event_opts() in bpf_program__attach_uprobe_opts() returns an error, the uprobe_event that was previously created is not cleaned. So, with this patch, when an error is returned, fix this by adding remove_uprobe_event_legacy() Signed-off-by: Chuang Wang <nashuiliang@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220629151848.65587-4-nashuiliang@gmail.com
2022-07-05	libbpf: Fix wrong variable used in perf_event_uprobe_open_legacy()	Chuang Wang	1	-1/+1
	Use "type" as opposed to "err" in pr_warn() after determine_uprobe_perf_type_legacy() returns an error. Signed-off-by: Chuang Wang <nashuiliang@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220629151848.65587-3-nashuiliang@gmail.com
2022-07-05	libbpf: Cleanup the legacy kprobe_event on failed add/attach_event()	Chuang Wang	1	-4/+14
	Before the 0bc11ed5ab60 commit ("kprobes: Allow kprobes coexist with livepatch"), in a scenario where livepatch and kprobe coexist on the same function entry, the creation of kprobe_event using add_kprobe_event_legacy() will be successful, at the same time as a trace event (e.g. /debugfs/tracing/events/kprobe/XXX) will exist, but perf_event_open() will return an error because both livepatch and kprobe use FTRACE_OPS_FL_IPMODIFY. As follows: 1) add a livepatch $ insmod livepatch-XXX.ko 2) add a kprobe using tracefs API (i.e. add_kprobe_event_legacy) $ echo 'p:mykprobe XXX' > /sys/kernel/debug/tracing/kprobe_events 3) enable this kprobe (i.e. sys_perf_event_open) This will return an error, -EBUSY. On Andrii Nakryiko's comment, few error paths in bpf_program__attach_kprobe_opts() that should need to call remove_kprobe_event_legacy(). With this patch, whenever an error is returned after add_kprobe_event_legacy() or bpf_program__attach_perf_event_opts(), this ensures that the created kprobe_event is cleaned. Signed-off-by: Chuang Wang <nashuiliang@gmail.com> Signed-off-by: Jingren Zhou <zhoujingren@didiglobal.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220629151848.65587-2-nashuiliang@gmail.com
2022-07-05	libbpf: add bpf_core_type_matches() helper macro	Andrii Nakryiko	1	-0/+10
	This patch finalizes support for the proposed type match relation in libbpf by adding bpf_core_type_matches() macro which emits TYPE_MATCH relocation. Clang support for this relocation was added in [0]. [0] https://reviews.llvm.org/D126838 Signed-off-by: Daniel Müller <deso@posteo.net>¬ Signed-off-by: Andrii Nakryiko <andrii@kernel.org>¬ Link: https://lore.kernel.org/bpf/20220628160127.607834-7-deso@posteo.net¬
2022-07-05	bpf, libbpf: Add type match support	Daniel Müller	3	-4/+285
	This patch adds support for the proposed type match relation to relo_core where it is shared between userspace and kernel. It plumbs through both kernel-side and libbpf-side support. The matching relation is defined as follows (copy from source): - modifiers and typedefs are stripped (and, hence, effectively ignored) - generally speaking types need to be of same kind (struct vs. struct, union vs. union, etc.) - exceptions are struct/union behind a pointer which could also match a forward declaration of a struct or union, respectively, and enum vs. enum64 (see below) Then, depending on type: - integers: - match if size and signedness match - arrays & pointers: - target types are recursively matched - structs & unions: - local members need to exist in target with the same name - for each member we recursively check match unless it is already behind a pointer, in which case we only check matching names and compatible kind - enums: - local variants have to have a match in target by symbolic name (but not numeric value) - size has to match (but enum may match enum64 and vice versa) - function pointers: - number and position of arguments in local type has to match target - for each argument and the return value we recursively check match Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220628160127.607834-5-deso@posteo.net
2022-07-05	bpf: Introduce TYPE_MATCH related constants/macros	Daniel Müller	1	-0/+1
	In order to provide type match support we require a new type of relocation which, in turn, requires toolchain support. Recent LLVM/Clang versions support a new value for the last argument to the __builtin_preserve_type_info builtin, for example. With this change we introduce the necessary constants into relevant header files, mirroring what the compiler may support. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220628160127.607834-2-deso@posteo.net
2022-06-29	libbpf: implement bpf_prog_query_opts	Stanislav Fomichev	3	-7/+47
	Implement bpf_prog_query_opts as a more expendable version of bpf_prog_query. Expose new prog_attach_flags and attach_btf_func_id as well: * prog_attach_flags is a per-program attach_type; relevant only for lsm cgroup program which might have different attach_flags per attach_btf_id * attach_btf_func_id is a new field expose for prog_query which specifies real btf function id for lsm cgroup attachments Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220628174314.1216643-10-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29	libbpf: add lsm_cgoup_sock type	Stanislav Fomichev	1	-0/+3
	lsm_cgroup/ is the prefix for BPF_LSM_CGROUP. Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220628174314.1216643-9-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: fix up few libbpf.map problems	Andrii Nakryiko	2	-3/+4
	Seems like we missed to add 2 APIs to libbpf.map and another API was misspelled. Fix it in libbpf.map. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-16-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: enforce strict libbpf 1.0 behaviors	Andrii Nakryiko	5	-261/+37
	Remove support for legacy features and behaviors that previously had to be disabled by calling libbpf_set_strict_mode(): - legacy BPF map definitions are not supported now; - RLIMIT_MEMLOCK auto-setting, if necessary, is always on (but see libbpf_set_memlock_rlim()); - program name is used for program pinning (instead of section name); - cleaned up error returning logic; - entry BPF programs should have SEC() always. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-15-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: clean up SEC() handling	Andrii Nakryiko	1	-72/+47
	Get rid of sloppy prefix logic and remove deprecated xdp_{devmap,cpumap} sections. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-13-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: remove internal multi-instance prog support	Andrii Nakryiko	1	-283/+34
	Clean up internals that had to deal with the possibility of multi-instance bpf_programs. Libbpf 1.0 doesn't support this, so all this is not necessary now and can be simplified. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-12-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: cleanup LIBBPF_DEPRECATED_SINCE supporting macros for v0.x	Andrii Nakryiko	1	-13/+3
	Keep the LIBBPF_DEPRECATED_SINCE macro "framework" for future deprecations, but clean up 0.x related helper macros. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-11-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: remove multi-instance and custom private data APIs	Andrii Nakryiko	3	-212/+10
	Remove all the public APIs that are related to creating multi-instance bpf_programs through custom preprocessing callback and generally working with them. Also remove all the bpf_{object,map,program}__[set_]priv() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: remove most other deprecated high-level APIs	Andrii Nakryiko	3	-421/+26
	Remove a bunch of high-level bpf_object/bpf_map/bpf_program related APIs. All the APIs related to private per-object/map/prog state, program preprocessing callback, and generally everything multi-instance related is removed in a separate patch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-9-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: remove prog_info_linear APIs	Andrii Nakryiko	3	-317/+0
	Remove prog_info_linear-related APIs previously used by perf. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: clean up perfbuf APIs	Andrii Nakryiko	3	-112/+18
	Remove deprecated perfbuf APIs and clean up opts structs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: remove deprecated BTF APIs	Andrii Nakryiko	5	-325/+24
	Get rid of deprecated BTF-related APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: remove deprecated probing APIs	Andrii Nakryiko	3	-132/+5
	Get rid of deprecated feature-probing APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: remove deprecated XDP APIs	Andrii Nakryiko	3	-78/+8
	Get rid of deprecated bpf_set_link() and bpf_get_link() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: remove deprecated low-level APIs	Andrii Nakryiko	5	-374/+4
	Drop low-level APIs as well as high-level (and very confusingly named) BPF object loading bpf_prog_load_xattr() and bpf_prog_load_deprecated() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28	libbpf: move xsk.{c,h} into selftests/bpf	Andrii Nakryiko	5	-1610/+2
	Remove deprecated xsk APIs from libbpf. But given we have selftests relying on this, move those files (with minimal adjustments to make them compilable) under selftests/bpf. We also remove all the removed APIs from libbpf.map, while overall keeping version inheritance chain, as most APIs are backwards compatible so there is no need to reassign them as LIBBPF_1.0.0 versions. Cc: Magnus Karlsson <magnus.karlsson@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-24	bpf: Merge "types_are_compat" logic into relo_core.c	Daniel Müller	3	-71/+83
	BPF type compatibility checks (bpf_core_types_are_compat()) are currently duplicated between kernel and user space. That's a historical artifact more than intentional doing and can lead to subtle bugs where one implementation is adjusted but another is forgotten. That happened with the enum64 work, for example, where the libbpf side was changed (commit 23b2a3a8f63a ("libbpf: Add enum64 relocation support")) to use the btf_kind_core_compat() helper function but the kernel side was not (commit 6089fb325cf7 ("bpf: Add btf enum64 support")). This patch addresses both the duplication issue, by merging both implementations and moving them into relo_core.c, and fixes the alluded to kind check (by giving preference to libbpf's already adjusted logic). For discussion of the topic, please refer to: https://lore.kernel.org/bpf/CAADnVQKbWR7oarBdewgOBZUPzryhRYvEbkhyPJQHHuxq=0K1gw@mail.gmail.com/T/#mcc99f4a33ad9a322afaf1b9276fb1f0b7add9665 Changelog: v1 -> v2: - limited libbpf recursion limit to 32 - changed name to __bpf_core_types_are_compat - included warning previously present in libbpf version - merged kernel and user space changes into a single patch Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220623182934.2582827-1-deso@posteo.net
2022-06-16	libbpf: add support for sleepable uprobe programs	Delyan Kratunov	1	-1/+4
	Add section mappings for u(ret)probe.s programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Delyan Kratunov <delyank@fb.com> Link: https://lore.kernel.org/r/aedbc3b74f3523f00010a7b0df8f3388cca59f16.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-17	libbpf: Fix internal USDT address translation logic for shared libraries	Andrii Nakryiko	1	-60/+63
	Perform the same virtual address to file offset translation that libbpf is doing for executable ELF binaries also for shared libraries. Currently libbpf is making a simplifying and sometimes wrong assumption that for shared libraries relative virtual addresses inside ELF are always equal to file offsets. Unfortunately, this is not always the case with LLVM's lld linker, which now by default generates quite more complicated ELF segments layout. E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from readelf output listing ELF segments (a.k.a. program headers): Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R 0x8 LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x0005e4 0x0005e4 R 0x1000 LOAD 0x0005f0 0x00000000000015f0 0x00000000000015f0 0x000160 0x000160 R E 0x1000 LOAD 0x000750 0x0000000000002750 0x0000000000002750 0x000210 0x000210 RW 0x1000 LOAD 0x000960 0x0000000000003960 0x0000000000003960 0x000028 0x000029 RW 0x1000 Compare that to what is generated by GNU ld (or LLVM lld's with extra -znoseparate-code argument which disables this cleverness in the name of file size reduction): Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000550 0x000550 R 0x1000 LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x000131 0x000131 R E 0x1000 LOAD 0x002000 0x0000000000002000 0x0000000000002000 0x0000ac 0x0000ac R 0x1000 LOAD 0x002dc0 0x0000000000003dc0 0x0000000000003dc0 0x000262 0x000268 RW 0x1000 You can see from the first example above that for executable (Flg == "R E") PT_LOAD segment (LOAD #2), Offset doesn't match VirtAddr columns. And it does in the second case (GNU ld output). This is important because all the addresses, including USDT specs, operate in a virtual address space, while kernel is expecting file offsets when performing uprobe attach. So such mismatches have to be properly taken care of and compensated by libbpf, which is what this patch is fixing. Also patch clarifies few function and variable names, as well as updates comments to reflect this important distinction (virtaddr vs file offset) and to ephasize that shared libraries are not all that different from executables in this regard. This patch also changes selftests/bpf Makefile to force urand_read and liburand_read.so to be built with Clang and LLVM's lld (and explicitly request this ELF file size optimization through -znoseparate-code linker parameter) to validate libbpf logic and ensure regressions don't happen in the future. I've bundled these selftests changes together with libbpf changes to keep the above description tied with both libbpf and selftests changes. Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220616055543.3285835-1-andrii@kernel.org
2022-06-14	libbpf: Fix an unsigned < 0 bug	Yonghong Song	1	-1/+1
	Andrii reported a bug with the following information: 2859 if (enum64_placeholder_id == 0) { 2860 enum64_placeholder_id = btf__add_int(btf, "enum64_placeholder", 1, 0); >>> CID 394804: Control flow issues (NO_EFFECT) >>> This less-than-zero comparison of an unsigned value is never true. "enum64_placeholder_id < 0U". 2861 if (enum64_placeholder_id < 0) 2862 return enum64_placeholder_id; 2863 ... Here enum64_placeholder_id declared as '__u32' so enum64_placeholder_id < 0 is always false. Declare enum64_placeholder_id as 'int' in order to capture the potential error properly. Fixes: f2a625889bb8 ("libbpf: Add enum64 sanitization") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220613054314.1251905-1-yhs@fb.com
2022-06-09	libbpf: Fix uprobe symbol file offset calculation logic	Andrii Nakryiko	1	-41/+22
	Fix libbpf's bpf_program__attach_uprobe() logic of determining function's file offset (which is what kernel is actually expecting) when attaching uprobe/uretprobe by function name. Previously calculation was determining virtual address offset relative to base load address, which (offset) is not always the same as file offset (though very frequently it is which is why this went unnoticed for a while). Fixes: 433966e3ae04 ("libbpf: Support function name-based attach uprobes") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Riham Selim <rihams@fb.com> Cc: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20220606220143.3796908-1-andrii@kernel.org
2022-06-07	libbpf: Add enum64 relocation support	Yonghong Song	3	-20/+48
	The enum64 relocation support is added. The bpf local type could be either enum or enum64 and the remote type could be either enum or enum64 too. The all combinations of local enum/enum64 and remote enum/enum64 are supported. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062647.3721719-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07	libbpf: Add enum64 support for bpf linking	Yonghong Song	1	-0/+2
	Add BTF_KIND_ENUM64 support for bpf linking, which is very similar to BTF_KIND_ENUM. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062642.3721494-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07	libbpf: Add enum64 sanitization	Yonghong Song	3	-5/+56
	When old kernel does not support enum64 but user space btf contains non-zero enum kflag or enum64, libbpf needs to do proper sanitization so modified btf can be accepted by the kernel. Sanitization for enum kflag can be achieved by clearing the kflag bit. For enum64, the type is replaced with an union of integer member types and the integer member size must be smaller than enum64 size. If such an integer type cannot be found, a new type is created and used for union members. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062636.3721375-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07	libbpf: Add enum64 support for btf_dump	Yonghong Song	2	-34/+108
	Add enum64 btf dumping support. For long long and unsigned long long dump, suffixes 'LL' and 'ULL' are added to avoid compilation errors in some cases. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062631.3720526-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>