linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2021-03-30	selftests: xsk: Remove unused defines	Björn Töpel	2	-6/+3
	Remove two unused defines. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-18-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove mutex and condition variable	Björn Töpel	2	-30/+6
	The usage of the condition variable is broken, and overkill. Replace it with a pthread barrier. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-17-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove thread attribute	Björn Töpel	2	-9/+2
	There is really no reason to have a non-default thread stack size. Remove that. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-16-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Implement bpf_link test	Maciej Fijalkowski	3	-50/+139
	Introduce a test that is supposed to verify the persistence of BPF resources based on underlying bpf_link usage. Test will: 1) create and bind two sockets on queue ids 0 and 1 2) run a traffic on queue ids 0 3) remove xsk sockets from queue 0 on both veth interfaces 4) run a traffic on queues ids 1 Running traffic successfully on qids 1 means that BPF resources were not removed on step 3). In order to make it work, change the command that creates veth pair to have the 4 queue pairs by default. Introduce the arrays of xsks and umems to ifobject struct but keep a pointers to single entities, so rest of the logic around Rx/Tx can be kept as-is. For umem handling, double the size of mmapped space and split that between the two sockets. Rename also bidi_pass to a variable 'second_step' of a boolean type as it's now used also for the test that is introduced here and it doesn't have anything in common with bi-directional testing. Drop opt_queue command line argument as it wasn't working before anyway. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-15-maciej.fijalkowski@intel.com
2021-03-30	veth: Implement ethtool's get_channels() callback	Maciej Fijalkowski	1	-0/+12
	Libbpf's xsk part calls get_channels() API to retrieve the queue count of the underlying driver so that XSKMAP is sized accordingly. Implement that in veth so multi queue scenarios can work properly. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-14-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove sync_mutex_tx and atomic var	Maciej Fijalkowski	2	-26/+7
	Although thread_common_ops() are called in both Tx and Rx threads, testapp_validate() will not spawn Tx thread until Rx thread signals that it has finished its initialization via condition variable. Therefore, locking in thread_common_ops is not needed and furthermore Tx thread does not have to spin on atomic variable. Note that this simplification wouldn't be possible if there would still be a common worker thread. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-13-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Refactor teardown/bidi test cases and testapp_validate	Maciej Fijalkowski	2	-53/+78
	Currently, there is a testapp_sockets() that acts like a wrapper around testapp_validate() and it is called for bidi and teardown test types. Other test types call testapp_validate() directly. Split testapp_sockets() onto two separate functions so a bunch of bidi specific logic can be moved there and out of testapp_validate() itself. Introduce function pointer to ifobject struct which will be used for assigning the Rx/Tx function that is assigned to worker thread. Let's also have a global ifobject Rx/Tx pointers so it's easier to swap the vectors on a second run of a bi-directional test. Thread creation now is easey to follow. switching_notify variable is useless, info about vector switch can be printed based on bidi_pass state. Last but not least, init/destroy synchronization variables only once, not per each test. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-12-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove Tx synchronization resources	Maciej Fijalkowski	2	-10/+7
	Tx thread needs to be started after the Rx side is fully initialized so that packets are not xmitted until xsk Rx socket is ready to be used. It can be observed that atomic variable spinning_tx is not checked from Rx side in any way, so thread_common_ops can be modified to only address the spinning_rx. This means that spinning_tx can be removed altogheter. signal_tx_condition is never utilized, so simply remove it. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-11-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Split worker thread	Maciej Fijalkowski	1	-79/+77
	Let's a have a separate Tx/Rx worker threads instead of a one common thread packed with Tx/Rx specific checks. Move mmap for umem buffer space and a switch_namespace() call to thread_common_ops. This also allows for a bunch of simplifactions that are the subject of the next commits. The final result will be a code base that is much easier to follow. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-10-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove thread for netns switch	Maciej Fijalkowski	2	-119/+14
	Currently, there is a dedicated thread for following remote ns operations: - grabbing the ifindex of the interface moved to remote netns - removing xdp prog from that interface With bpf_link usage in place, this can be simply omitted, so remove mentioned thread, as BPF resources will be managed by bpf_link itself, so there's no further need for creating the thread that will switch to remote netns and do the cleanup. Keep most of the logic for switching the ns, though, but make switch_namespace() return the fd so that it will be possible to close it at the process termination time. Get rid of logic around making sure that it's possible to switch ns in validate_interfaces(). Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-9-maciej.fijalkowski@intel.com
2021-03-30	samples: bpf: Do not unload prog within xdpsock	Maciej Fijalkowski	1	-41/+14
	With the introduction of bpf_link in xsk's libbpf part, there's no further need for explicit unload of prog on xdpsock's termination. When process dies, the bpf_link's refcount will be decremented and resources will be unloaded/freed under the hood in case when there are no more active users. While at it, don't dump stats on error path. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-8-maciej.fijalkowski@intel.com
2021-03-30	libbpf: xsk: Use bpf_link	Maciej Fijalkowski	1	-45/+213
	Currently, if there are multiple xdpsock instances running on a single interface and in case one of the instances is terminated, the rest of them are left in an inoperable state due to the fact of unloaded XDP prog from interface. Consider the scenario below: // load xdp prog and xskmap and add entry to xskmap at idx 10 $ sudo ./xdpsock -i ens801f0 -t -q 10 // add entry to xskmap at idx 11 $ sudo ./xdpsock -i ens801f0 -t -q 11 terminate one of the processes and another one is unable to work due to the fact that the XDP prog was unloaded from interface. To address that, step away from setting bpf prog in favour of bpf_link. This means that refcounting of BPF resources will be done automatically by bpf_link itself. Provide backward compatibility by checking if underlying system is bpf_link capable. Do this by looking up/creating bpf_link on loopback device. If it failed in any way, stick with netlink-based XDP prog. therwise, use bpf_link-based logic. When setting up BPF resources during xsk socket creation, check whether bpf_link for a given ifindex already exists via set of calls to bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd and comparing the ifindexes from bpf_link and xsk socket. For case where resources exist but they are not AF_XDP related, bail out and ask user to remove existing prog and then retry. Lastly, do a bit of refactoring within __xsk_setup_xdp_prog and pull out existing code branches based on prog_id value onto separate functions that are responsible for resource initialization if prog_id was 0 and for lookup existing resources for non-zero prog_id as that implies that XDP program is present on the underlying net device. This in turn makes it easier to follow, especially the teardown part of both branches. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-7-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Simplify frame traversal in dumping thread	Maciej Fijalkowski	1	-26/+21
	Store offsets to each layer in a separate variables rather than compute them every single time. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-6-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove inline keyword from source file	Maciej Fijalkowski	1	-10/+7
	Follow the kernel coding style guidelines and let compiler do the decision about inlining. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-5-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove unused function	Maciej Fijalkowski	1	-13/+0
	Probably it was ported from xdpsock but is not used anywhere. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-4-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Remove struct ifaceconfigobj	Maciej Fijalkowski	2	-44/+30
	ifaceconfigobj is not really useful, it is possible to keep the functionality and simplify the code. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-3-maciej.fijalkowski@intel.com
2021-03-30	selftests: xsk: Don't call worker_pkt_dump() for stats test	Maciej Fijalkowski	1	-1/+1
	For TEST_TYPE_STATS, worker_pkt_validate() that places frames onto pkt_buf is not called. Therefore, when dump mode is set, don't call worker_pkt_dump() for mentioned test type, so that it won't crash on pkt_buf() access. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-2-maciej.fijalkowski@intel.com
2021-03-30	libbpf: Fix memory leak when emitting final btf_ext	Andrii Nakryiko	1	-8/+16
	Free temporary allocated memory used to construct finalized .BTF.ext data. Found by Coverity static analysis on libbpf's Github repo. Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210327042502.969745-1-andrii@kernel.org
2021-03-30	Merge branch 'bpf: Update doc about calling kernel function'	Alexei Starovoitov	2	-0/+29
	Martin KaFai says: ==================== This set updates the document about the bpf program calling kernel function. In particular, updates are regarding to the clang requirement in selftests and kfunc-call not an ABI. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-03-30	bpf: selftests: Update clang requirement in README.rst for testing kfunc call	Martin KaFai Lau	1	-0/+14
	This patch updates the README.rst to specify the clang requirement to compile the bpf selftests that call kernel function. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210330054156.2933804-1-kafai@fb.com
2021-03-30	bpf: Update bpf_design_QA.rst to clarify the kfunc call is not ABI	Martin KaFai Lau	1	-0/+15
	This patch updates bpf_design_QA.rst to clarify that the kernel function callable by bpf program is not an ABI. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210330054150.2933542-1-kafai@fb.com
2021-03-29	bpf: tcp: Limit calling some tcp cc functions to CONFIG_DYNAMIC_FTRACE	Martin KaFai Lau	1	-0/+2
	pahole currently only generates the btf_id for external function and ftrace-able function. Some functions in the bpf_tcp_ca_kfunc_ids are static (e.g. cubictcp_init). Thus, unless CONFIG_DYNAMIC_FTRACE is set, btf_ids for those functions will not be generated and the compilation fails during resolve_btfids. This patch limits those functions to CONFIG_DYNAMIC_FTRACE. I will address the pahole generation in a followup and then remove the CONFIG_DYNAMIC_FTRACE limitation. Fixes: e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329221357.834438-1-kafai@fb.com
2021-03-28	bpf: tcp: Fix an error in the bpf_tcp_ca_kfunc_ids list	Martin KaFai Lau	1	-1/+1
	There is a typo in the bbr function, s/even/event/. This patch fixes it. Fixes: e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329003213.2274210-1-kafai@fb.com
2021-03-28	bpf: tcp: Remove comma which is causing build error	Atul Gopinathan	1	-1/+1
	Currently, building the bpf-next source with the CONFIG_BPF_SYSCALL enabled is causing a compilation error: "net/ipv4/bpf_tcp_ca.c:209:28: error: expected identifier or '(' before ',' token" Fix this by removing an unnecessary comma. Fixes: e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc") Reported-by: syzbot+0b74d8ec3bf0cc4e4209@syzkaller.appspotmail.com Signed-off-by: Atul Gopinathan <atulgopinathan@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210328120515.113895-1-atulgopinathan@gmail.com
2021-03-26	Merge branch 'bpf: Support calling kernel function'	Alexei Starovoitov	25	-319/+1334
	Martin KaFai says: ==================== This series adds support to allow bpf program calling kernel function. The use case included in this set is to allow bpf-tcp-cc to directly call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()"). Those functions have already been used by some kernel tcp-cc implementations. This set will also allow the bpf-tcp-cc program to directly call the kernel tcp-cc implementation, For example, a bpf_dctcp may only want to implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly from the kernel tcp_dctcp.c instead of reimplementing (or copy-and-pasting) them. The tcp-cc kernel functions mentioned above will be white listed for the struct_ops bpf-tcp-cc programs to use in a later patch. The white listed functions are not bounded to a fixed ABI contract. Those functions have already been used by the existing kernel tcp-cc. If any of them has changed, both in-tree and out-of-tree kernel tcp-cc implementations have to be changed. The same goes for the struct_ops bpf-tcp-cc programs which have to be adjusted accordingly. Please see individual patch for details. v2: - Patch 2 in v1 is removed. No need to support extern func in kernel. Changed libbpf to adjust the .ksyms datasec for extern func in patch 11. (Andrii) - Name change: btf_check_func_arg_match() and btf_check_subprog_arg_match() in patch 2. (Andrii) - Always set unreliable on any error in patch 2 since it does not matter. (Andrii) - s/kern_func/kfunc/ and s/descriptor/desc/ in this set. (Andrii) - Remove some unnecessary changes in disasm.h and disasm.c in patch 3. In particular, no need to change the function signature in bpf_insn_revmap_call_t. Also, removed the changes in print_bpf_insn(). - Fixed an issue in check_kfunc_call() when the calling kernel function returns a pointer in patch 3. Added a selftest. - Adjusted the verifier selftests due to the changes in the verifier log in patch 3. - Fixed a comparison issue in kfunc_desc_cmp_by_imm() in patch 3. (Andrii) - Name change: is_ldimm64_insn(), new helper: is_call_insn() in patch 10 (Andrii) - Move btf_func_linkage() from btf.h to libbpf.c in patch 11. (Andrii) - Fixed the linker error when CONFIG_BPF_SYSCALL is not defined. Moved the check_kfunc_call from filter.c to test_run.c in patch 14. (kernel test robot) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-03-26	bpf: selftests: Add kfunc_call test	Martin KaFai Lau	6	-0/+183
	This patch adds a few kernel function bpf_kfunc_call_test() for the selftest's test_run purpose. They will be allowed for tc_cls prog. The selftest calling the kernel function bpf_kfunc_call_test() is also added in this patch. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015252.1551395-1-kafai@fb.com
2021-03-26	bpf: selftests: Bpf_cubic and bpf_dctcp calling kernel functions	Martin KaFai Lau	3	-46/+11
	This patch removes the bpf implementation of tcp_slow_start() and tcp_cong_avoid_ai(). Instead, it directly uses the kernel implementation. It also replaces the bpf_cubic_undo_cwnd implementation by directly calling tcp_reno_undo_cwnd(). bpf_dctcp also directly calls tcp_reno_cong_avoid() instead. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015246.1551062-1-kafai@fb.com
2021-03-26	bpf: selftests: Rename bictcp to bpf_cubic	Martin KaFai Lau	1	-15/+15
	As a similar chanage in the kernel, this patch gives the proper name to the bpf cubic. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015240.1550074-1-kafai@fb.com
2021-03-26	libbpf: Support extern kernel function	Martin KaFai Lau	1	-12/+162
	This patch is to make libbpf able to handle the following extern kernel function declaration and do the needed relocations before loading the bpf program to the kernel. extern int foo(struct sock *) __attribute__((section(".ksyms"))) In the collect extern phase, needed changes is made to bpf_object__collect_externs() and find_extern_btf_id() to collect extern function in ".ksyms" section. The func in the BTF datasec also needs to be replaced by an int var. The idea is similar to the existing handling in extern var. In case the BTF may not have a var, a dummy ksym var is added at the beginning of bpf_object__collect_externs() if there is func under ksyms datasec. It will also change the func linkage from extern to global which the kernel can support. It also assigns a param name if it does not have one. In the collect relo phase, it will record the kernel function call as RELO_EXTERN_FUNC. bpf_object__resolve_ksym_func_btf_id() is added to find the func btf_id of the running kernel. During actual relocation, it will patch the BPF_CALL instruction with src_reg = BPF_PSEUDO_FUNC_CALL and insn->imm set to the running kernel func's btf_id. The required LLVM patch: https://reviews.llvm.org/D93563 Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015234.1548923-1-kafai@fb.com
2021-03-26	libbpf: Record extern sym relocation first	Martin KaFai Lau	1	-29/+34
	This patch records the extern sym relocs first before recording subprog relocs. The later patch will have relocs for extern kernel function call which is also using BPF_JMP \| BPF_CALL. It will be easier to handle the extern symbols first in the later patch. is_call_insn() helper is added. The existing is_ldimm64() helper is renamed to is_ldimm64_insn() for consistency. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015227.1548623-1-kafai@fb.com
2021-03-26	libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR	Martin KaFai Lau	1	-3/+3
	This patch renames RELO_EXTERN to RELO_EXTERN_VAR. It is to avoid the confusion with a later patch adding RELO_EXTERN_FUNC. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015221.1547722-1-kafai@fb.com
2021-03-26	libbpf: Refactor codes for finding btf id of a kernel symbol	Martin KaFai Lau	1	-11/+33
	This patch refactors code, that finds kernel btf_id by kind and symbol name, to a new function find_ksym_btf_id(). It also adds a new helper __btf_kind_str() to return a string by the numeric kind value. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015214.1547069-1-kafai@fb.com
2021-03-26	libbpf: Refactor bpf_object__resolve_ksyms_btf_id	Martin KaFai Lau	1	-57/+67
	This patch refactors most of the logic from bpf_object__resolve_ksyms_btf_id() into a new function bpf_object__resolve_ksym_var_btf_id(). It is to get ready for a later patch adding bpf_object__resolve_ksym_func_btf_id() which resolves a kernel function to the running kernel btf_id. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015207.1546749-1-kafai@fb.com
2021-03-26	bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc	Martin KaFai Lau	1	-0/+41
	This patch puts some tcp cong helper functions, tcp_slow_start() and tcp_cong_avoid_ai(), into the allowlist for the bpf-tcp-cc program. A few tcp cc implementation functions are also put into the allowlist. A potential use case is the bpf-tcp-cc implementation may only want to override a subset of a tcp_congestion_ops. For others, the bpf-tcp-cc can directly call the kernel counter parts instead of re-implementing (or copy-and-pasting) them to the bpf program. They will only be available to the bpf-tcp-cc typed program. The allowlist functions are not bounded to a fixed ABI contract. When any of them has changed, the bpf-tcp-cc program has to be changed like any in-tree/out-of-tree kernel tcp-cc implementations do also. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015201.1546345-1-kafai@fb.com
2021-03-26	tcp: Rename bictcp function prefix to cubictcp	Martin KaFai Lau	1	-12/+12
	The cubic functions in tcp_cubic.c are using the bictcp prefix as in tcp_bic.c. This patch gives it the proper name cubictcp because the later patch will allow the bpf prog to directly call the cubictcp implementation. Renaming them will avoid the name collision when trying to find the intended one to call during bpf prog load time. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015155.1545532-1-kafai@fb.com
2021-03-26	bpf: Support kernel function call in x86-32	Martin KaFai Lau	1	-0/+198
	This patch adds kernel function call support to the x86-32 bpf jit. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015149.1545267-1-kafai@fb.com
2021-03-26	bpf: Support bpf program calling kernel function	Martin KaFai Lau	13	-46/+480
	This patch adds support to BPF verifier to allow bpf program calling kernel function directly. The use case included in this set is to allow bpf-tcp-cc to directly call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()"). Those functions have already been used by some kernel tcp-cc implementations. This set will also allow the bpf-tcp-cc program to directly call the kernel tcp-cc implementation, For example, a bpf_dctcp may only want to implement its own dctcp_cwnd_event() and reuse other dctcp_() directly from the kernel tcp_dctcp.c instead of reimplementing (or copy-and-pasting) them. The tcp-cc kernel functions mentioned above will be white listed for the struct_ops bpf-tcp-cc programs to use in a later patch. The white listed functions are not bounded to a fixed ABI contract. Those functions have already been used by the existing kernel tcp-cc. If any of them has changed, both in-tree and out-of-tree kernel tcp-cc implementations have to be changed. The same goes for the struct_ops bpf-tcp-cc programs which have to be adjusted accordingly. This patch is to make the required changes in the bpf verifier. First change is in btf.c, it adds a case in "btf_check_func_arg_match()". When the passed in "btf->kernel_btf == true", it means matching the verifier regs' states with a kernel function. This will handle the PTR_TO_BTF_ID reg. It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET, and PTR_TO_TCP_SOCK to its kernel's btf_id. In the later libbpf patch, the insn calling a kernel function will look like: insn->code == (BPF_JMP \| BPF_CALL) insn->src_reg == BPF_PSEUDO_KFUNC_CALL / <- new in this patch / insn->imm == func_btf_id / btf_id of the running kernel */ [ For the future calling function-in-kernel-module support, an array of module btf_fds can be passed at the load time and insn->off can be used to index into this array. ] At the early stage of verifier, the verifier will collect all kernel function calls into "struct bpf_kfunc_desc". Those descriptors are stored in "prog->aux->kfunc_tab" and will be available to the JIT. Since this "add" operation is similar to the current "add_subprog()" and looking for the same insn->code, they are done together in the new "add_subprog_and_kfunc()". In the "do_check()" stage, the new "check_kfunc_call()" is added to verify the kernel function call instruction: 1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE. A new bpf_verifier_ops "check_kfunc_call" is added to do that. The bpf-tcp-cc struct_ops program will implement this function in a later patch. 2. Call "btf_check_kfunc_args_match()" to ensure the regs can be used as the args of a kernel function. 3. Mark the regs' type, subreg_def, and zext_dst. At the later do_misc_fixups() stage, the new fixup_kfunc_call() will replace the insn->imm with the function address (relative to __bpf_call_base). If needed, the jit can find the btf_func_model by calling the new bpf_jit_find_kfunc_model(prog, insn). With the imm set to the function address, "bpftool prog dump xlated" will be able to display the kernel function calls the same way as it displays other bpf helper calls. gpl_compatible program is required to call kernel function. This feature currently requires JIT. The verifier selftests are adjusted because of the changes in the verbose log in add_subprog_and_kfunc(). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com
2021-03-26	bpf: Refactor btf_check_func_arg_match	Martin KaFai Lau	4	-77/+95
	This patch moved the subprog specific logic from btf_check_func_arg_match() to the new btf_check_subprog_arg_match(). The core logic is left in btf_check_func_arg_match() which will be reused later to check the kernel function call. The "if (!btf_type_is_ptr(t))" is checked first to improve the indentation which will be useful for a later patch. Some of the "btf_kind_str[]" usages is replaced with the shortcut "btf_type_str(t)". Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015136.1544504-1-kafai@fb.com
2021-03-26	bpf: Simplify freeing logic in linfo and jited_linfo	Martin KaFai Lau	4	-28/+17
	This patch simplifies the linfo freeing logic by combining "bpf_prog_free_jited_linfo()" and "bpf_prog_free_unused_jited_linfo()" into the new "bpf_prog_jit_attempt_done()". It is a prep work for the kernel function call support. In a later patch, freeing the kernel function call descriptors will also be done in the "bpf_prog_jit_attempt_done()". "bpf_prog_free_linfo()" is removed since it is only called by "__bpf_prog_put_noref()". The kvfree() are directly called instead. It also takes this chance to s/kcalloc/kvcalloc/ for the jited_linfo allocation. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015130.1544323-1-kafai@fb.com
2021-03-26	libbpf: Preserve empty DATASEC BTFs during static linking	Andrii Nakryiko	1	-1/+12
	Ensure that BPF static linker preserves all DATASEC BTF types, even if some of them might not have any variable information at all. This may happen if the compiler promotes local initialized variable contents into .rodata section and there are no global or static functions in the program. For example, $ cat t.c struct t { char a; char b; char c; }; void bar(struct t*); void find() { struct t tmp = {1, 2, 3}; bar(&tmp); } $ clang -target bpf -O2 -g -S t.c .long 104 # BTF_KIND_DATASEC(id = 8) .long 251658240 # 0xf000000 .long 0 .ascii ".rodata" # string offset=104 $ clang -target bpf -O2 -g -c t.c $ readelf -S t.o \| grep data [ 4] .rodata PROGBITS 0000000000000000 00000090 Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210326043036.3081011-1-andrii@kernel.org
2021-03-26	bpf: struct sock is declared twice in bpf_sk_storage header	Wan Jiabing	1	-1/+0
	struct sock has been declared twice, therefore remove the duplicate. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210325070602.858024-1-wanjiabing@vivo.com
2021-03-25	bpf: Remove unused headers	Lu Wei	2	-2/+0
	The header <linux/version.h> is useless in sampleip_kern.c and trace_event_kern.c, remove it. Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210324083147.149278-1-luwei32@huawei.com
2021-03-25	libbpf: Add bpf object kern_version attribute setter	Rafael David Tinoco	3	-0/+12
	Unfortunately some distros don't have their kernel version defined accurately in <linux/version.h> due to different long term support reasons. It is important to have a way to override the bpf kern_version attribute during runtime: some old kernels might still check for kern_version attribute during bpf_prog_load(). Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210323040952.2118241-1-rafaeldtinoco@ubuntu.com
2021-03-25	Merge branch 'add support for batched ops in LPM trie'	Alexei Starovoitov	2	-0/+161
	Pedro Tammela says: ==================== The patch itself is straightforward thanks to the infrastructure that is already in-place. The tests follows the other '*_map_batch_ops' tests with minor tweaks. v1 -> v2: Fixes for checkpatch warnings ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-03-25	bpf: selftests: Add tests for batched ops in LPM trie maps	Pedro Tammela	1	-0/+158
	Uses the already existing infrastructure for testing batched ops. The testing code is essentially the same, with minor tweaks for this use case. Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210323025058.315763-3-pctammela@gmail.com
2021-03-25	bpf: Add support for batched ops in LPM trie maps	Pedro Tammela	1	-0/+3
	Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210323025058.315763-2-pctammela@gmail.com
2021-03-25	selftests/bpf: Better error messages for ima_setup.sh failures	KP Singh	1	-2/+4
	The current implementation uses the CHECK_FAIL macro which does not provide useful error messages when the script fails. Use the CHECK macro instead and provide more descriptive messages to aid debugging. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210322170720.2926715-1-kpsingh@kernel.org
2021-03-25	bpf: Fix NULL pointer dereference in bpf_get_local_storage() helper	Yonghong Song	5	-19/+86
	Jiri Olsa reported a bug ([1]) in kernel where cgroup local storage pointer may be NULL in bpf_get_local_storage() helper. There are two issues uncovered by this bug: (1). kprobe or tracepoint prog incorrectly sets cgroup local storage before prog run, (2). due to change from preempt_disable to migrate_disable, preemption is possible and percpu storage might be overwritten by other tasks. This issue (1) is fixed in [2]. This patch tried to address issue (2). The following shows how things can go wrong: task 1: bpf_cgroup_storage_set() for percpu local storage preemption happens task 2: bpf_cgroup_storage_set() for percpu local storage preemption happens task 1: run bpf program task 1 will effectively use the percpu local storage setting by task 2 which will be either NULL or incorrect ones. Instead of just one common local storage per cpu, this patch fixed the issue by permitting 8 local storages per cpu and each local storage is identified by a task_struct pointer. This way, we allow at most 8 nested preemption between bpf_cgroup_storage_set() and bpf_cgroup_storage_unset(). The percpu local storage slot is released (calling bpf_cgroup_storage_unset()) by the same task after bpf program finished running. bpf_test_run() is also fixed to use the new bpf_cgroup_storage_set() interface. The patch is tested on top of [2] with reproducer in [1]. Without this patch, kernel will emit error in 2-3 minutes. With this patch, after one hour, still no error. [1] https://lore.kernel.org/bpf/CAKH8qBuXCfUz=w8L+Fj74OaUpbosO29niYwTki7e3Ag044_aww@mail.gmail.com/T [2] https://lore.kernel.org/bpf/20210309185028.3763817-1-yhs@fb.com Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Roman Gushchin <guro@fb.com> Link: https://lore.kernel.org/bpf/20210323055146.3334476-1-yhs@fb.com
2021-03-26	libbpf: Constify few bpf_program getters	Andrii Nakryiko	2	-4/+4
	bpf_program__get_type() and bpf_program__get_expected_attach_type() shouldn't modify given bpf_program, so mark input parameter as const struct bpf_program. This eliminates unnecessary compilation warnings or explicit casts in user programs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210324172941.2609884-1-andrii@kernel.org
2021-03-26	bpf: Fix typo 'accesible' into 'accessible'	Ricardo Ribalda	1	-1/+1
	Trivial fix. Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210318202223.164873-8-ribalda@chromium.org