Age | Commit message (Collapse) | Author | Files | Lines
2022-09-09 | libbpf: Remove gcc support for bpf_tail_call_static for now | Daniel Borkmann | 1 | -13/+6
This reverts commit 14e5ce79943a ("libbpf: Add GCC support for bpf_tail_call_static"). The reason is that GCC invented its own BPF asm syntax, which does not conform to the LLVM one, and going forward this would be more painful to maintain here and in other areas of the library. Thus remove it; the ask to the GCC folks is to align with LLVM and use the exact same syntax.
Fixes: 14e5ce79943a ("libbpf: Add GCC support for bpf_tail_call_static")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Hilliard <james.hilliard1@gmail.com>
Cc: Jose E. Marchesi <jose.marchesi@oracle.com>
2022-09-07 | bpf: Add helper macro bpf_for_each_reg_in_vstate | Kumar Kartikeya Dwivedi | 2 | -107/+49
For a lot of use cases in future patches, we will want to modify the state of registers that are part of the same 'group' (e.g. same ref_obj_id). It won't be limited to releasing reference state; it also includes setting a type flag dynamically based on certain actions, etc. Hence, we need a way to easily pass a callback to the function that iterates over all registers in the current bpf_verifier_state in all frames up to (and including) the curframe. While in C++ we would be able to easily use a lambda to pass state and the callback together, sadly we aren't using C++ in the kernel. The next best thing to avoid defining a function for each case seems to be statement expressions in GNU C. The kernel already uses them heavily, hence they can be passed to the macro in the style of a lambda. The statement expression will then be substituted in the for loop bodies. Variables __state and __reg are set to the current bpf_func_state and reg for each invocation of the expression inside the passed-in verifier state. Then, convert mark_ptr_or_null_regs, clear_all_pkt_pointers, release_reference, find_good_pkt_pointers, find_equal_scalars to use bpf_for_each_reg_in_vstate.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220904204145.3089-16-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
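Since the commit relies on GNU C statement expressions to emulate a lambda, a small self-contained illustration may help. The sketch below compiles on its own with GCC/Clang; the struct layouts and the macro body are simplified stand-ins for the verifier's real definitions, not the actual kernel macro.
```c
#include <stdio.h>

#define MAX_REG   4
#define MAX_FRAME 2

/* Simplified stand-ins for bpf_reg_state / bpf_func_state / bpf_verifier_state. */
struct reg_state { int ref_obj_id; int type; };
struct func_state { struct reg_state regs[MAX_REG]; };
struct verifier_state { struct func_state frame[MAX_FRAME]; int curframe; };

/* The caller-provided statement expression __expr is substituted as the loop
 * body, with __state and __reg bound to the current frame and register. */
#define for_each_reg_in_vstate(__vst, __state, __reg, __expr)          \
	({                                                              \
		int ___i, ___j;                                         \
		for (___i = 0; ___i <= (__vst)->curframe; ___i++) {     \
			__state = &(__vst)->frame[___i];                \
			for (___j = 0; ___j < MAX_REG; ___j++) {        \
				__reg = &__state->regs[___j];           \
				(void)(__expr);                         \
			}                                               \
		}                                                       \
	})

int main(void)
{
	struct verifier_state vst = { .curframe = 1 };
	struct func_state *state;
	struct reg_state *reg;

	vst.frame[0].regs[1] = (struct reg_state){ .ref_obj_id = 42, .type = 7 };
	vst.frame[1].regs[3] = (struct reg_state){ .ref_obj_id = 42, .type = 7 };

	/* "Release" every register sharing ref_obj_id 42, lambda-style. */
	for_each_reg_in_vstate(&vst, state, reg, ({
		if (reg->ref_obj_id == 42)
			reg->type = 0;
	}));

	printf("%d %d\n", vst.frame[0].regs[1].type, vst.frame[1].regs[3].type); /* 0 0 */
	return 0;
}
```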
2022-09-07 | bpf: Add zero_map_value to zero map value with special fields | Kumar Kartikeya Dwivedi | 1 | -0/+19
We need this helper to skip over special fields (bpf_spin_lock, bpf_timer, kptrs) while zeroing a map value. Use the same logic as copy_map_value but memset instead of memcpy. Currently, the code zeroing map value memory does not have to deal with special fields, hence this is a prerequisite for introducing such support. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220904204145.3089-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
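A minimal sketch of that zeroing logic, assuming a hypothetical descriptor of special-field offsets sorted by offset (the kernel's real bookkeeping lives in the map/BTF structures):
```c
#include <string.h>

/* Hypothetical descriptor of a special field inside the map value. */
struct special_field { unsigned int off, sz; };

static void zero_map_value_sketch(char *dst, unsigned int value_size,
				  const struct special_field *sf, int nr_sf)
{
	unsigned int cur = 0;
	int i;

	for (i = 0; i < nr_sf; i++) {
		/* zero the plain data up to the special field ... */
		memset(dst + cur, 0, sf[i].off - cur);
		/* ... then jump over the special field itself */
		cur = sf[i].off + sf[i].sz;
	}
	/* zero whatever plain data remains after the last special field */
	memset(dst + cur, 0, value_size - cur);
}
```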
2022-09-07 | bpf: Support kptrs in percpu arraymap | Kumar Kartikeya Dwivedi | 2 | -10/+26
Enable support for kptrs in percpu BPF arraymap by wiring up the freeing of these kptrs from percpu map elements. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220904204145.3089-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-07 | bpf: Add copy_map_value_long to copy to remote percpu memory | Kumar Kartikeya Dwivedi | 1 | -19/+33
bpf_long_memcpy is used while copying to remote percpu regions from BPF syscall and helpers, so that the copy is atomic at word size granularity. This may not be possible when copying a map value that hosts kptrs from or to percpu maps, as the alignment or size of the disjoint regions may not be a multiple of the word size. Hence, to avoid complicating the copy loop, we only use bpf_long_memcpy when special fields are not present, and otherwise use normal memcpy to copy the disjoint regions.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220904204145.3089-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-07 | bpf: Fix resetting logic for unreferenced kptrs | Jules Irenge | 1 | -1/+1
Sparse reported a warning at bpf_map_free_kptrs(): "warning: Using plain integer as NULL pointer". During the process of fixing this warning, it was discovered that the current code erroneously writes to the pointer variable instead of dereferencing and writing to the actual kptr. Hence, the Sparse tool accidentally helped to uncover this problem. Fix this by doing WRITE_ONCE(*p, 0) instead of WRITE_ONCE(p, 0). Note that the effect of this bug is that unreferenced kptrs will not be cleared during check_and_free_fields. It is not a problem if the clearing is not done during the map_free stage, as there is nothing to free for them.
Fixes: 14a324f6a67e ("bpf: Wire up freeing of referenced kptr")
Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Link: https://lore.kernel.org/r/Yxi3pJaK6UDjVJSy@playground
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
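The pointer-vs-pointee distinction behind the fix can be shown in a tiny standalone toy (plain C; the kernel does the store through WRITE_ONCE(), but the effect is the same):
```c
#include <stdio.h>

int main(void)
{
	unsigned long value[2] = { 0x1111, 0x2222 };   /* toy "map value"     */
	unsigned long *p = &value[1];                  /* slot holding a kptr */

	p = 0;              /* like WRITE_ONCE(p, 0): only the local pointer
	                     * is reset, value[1] still holds 0x2222 (the bug) */
	p = &value[1];
	*p = 0;             /* like WRITE_ONCE(*p, 0): the slot inside the
	                     * map value is actually cleared (the fix)        */

	printf("%lx\n", value[1]);   /* prints 0 */
	return 0;
}
```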
2022-09-07 | Merge branch 'bpf-core changes for preparation of HID-bpf' | Alexei Starovoitov | 10 | -60/+678
Benjamin Tissoires says:
====================
Hi, well, given that the HID changes haven't moved a lot in the past revisions and that I am cc-ing a bunch of people, I have dropped them while we focus on the last 2 requirements in bpf-core changes. I'll submit a HID-targeted series when we get these in tree, which would make things a lot more independent. For reference, the whole reasoning for these 2 main changes is at https://lore.kernel.org/bpf/20220902132938.2409206-1-benjamin.tissoires@redhat.com/
Compared to v10 (in addition to dropping the HID changes), I have changed the selftests so we can test both light skeletons and libbpf calls.
Cheers, Benjamin
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-07 | selftests/bpf: Add tests for kfunc returning a memory pointer | Benjamin Tissoires | 4 | -0/+197
We add 2 new kfuncs that follow the RET_PTR_TO_MEM capability from the previous commit. Then we test them in selftests: the first tests exercise valid cases and do not fail, while the later ones are actually supposed to prevent the program from being loaded because they are wrong. To work around that, we mark the failing ones as not autoloaded (with SEC("?tc")), and we manually enable them one by one, ensuring the verifier rejects them.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20220906151303.2780789-8-benjamin.tissoires@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-07 | bpf/verifier: allow kfunc to return an allocated mem | Benjamin Tissoires | 5 | -34/+133
For drivers (outside of networking), the incoming data is not statically defined in a struct. Most of the time the data buffer is kzalloc-ed, and thus we can not rely on eBPF and BTF to explore the data. This commit allows a kfunc to return arbitrary memory, previously allocated by the driver. An interesting extra point is that the kfunc can mark the exported memory region as read-only or read/write. So, when a kfunc is not returning a pointer to a struct but to a plain type, we can consider it valid allocated memory assuming that:
- one of the arguments is called either rdonly_buf_size or rdwr_buf_size
- and this argument is a const from the caller's point of view
We can then use this parameter as the size of the allocated memory. The memory is either read-only or read-write based on the name of the size parameter.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20220906151303.2780789-7-benjamin.tissoires@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
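A hedged sketch of a kfunc following this convention; the driver context and function names are hypothetical, not an existing kernel API:
```c
#include <linux/types.h>

/* Hypothetical driver context, for illustration only. */
struct my_driver_ctx {
	u8 *buf;        /* kzalloc-ed data buffer owned by the driver */
	size_t buf_len;
};

/* Because the return type is a pointer to a plain type (u8) and one argument
 * is named rdwr_buf_size (and must be a constant from the caller's point of
 * view), the verifier treats the returned pointer as a read/write region of
 * rdwr_buf_size bytes. Naming the argument rdonly_buf_size instead would make
 * the region read-only. */
u8 *my_driver_get_buf(struct my_driver_ctx *ctx, u64 rdwr_buf_size)
{
	if (!ctx || rdwr_buf_size > ctx->buf_len)
		return NULL;
	return ctx->buf;
}
```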
2022-09-07 | bpf/btf: bump BTF_KFUNC_SET_MAX_CNT | Benjamin Tissoires | 1 | -1/+1
net/bpf/test_run.c already presents 20 kfuncs. net/netfilter/nf_conntrack_bpf.c also presents an extra 10 kfuncs. Given that all the kfuncs are regrouped into one unique set, having only 2 slots left prevents us from adding more selftests. Bump it to 256.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20220906151303.2780789-6-benjamin.tissoires@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-07 | selftests/bpf: add test for accessing ctx from syscall program type | Benjamin Tissoires | 4 | -7/+214
We need to also export the kfunc set to the syscall program type, and then add a couple of eBPF programs that are testing those calls. The first one checks for valid access, and the second one is OK from a static analysis point of view but fails at run time because we are trying to access outside of the allocated memory. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220906151303.2780789-5-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-07 | bpf/verifier: allow all functions to read user provided context | Benjamin Tissoires | 1 | -0/+19
When a function tried to access data from context in a syscall eBPF program, the verifier rejected the call unless it was accessing the first element. This is because the syscall context is not known at compile time, and so we need to check this when actually accessing it. Check for valid memory access if there is no convert_ctx callback, and allow this situation to happen.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20220906151303.2780789-4-benjamin.tissoires@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-07 | bpf: split btf_check_subprog_arg_match in two | Benjamin Tissoires | 3 | -6/+52
btf_check_subprog_arg_match() was used twice in verifier.c:
- when checking for type mismatches between a (sub)prog declaration and BTF
- when checking the call of a subprog to see if the provided arguments are correct and valid
This is problematic when we check if the first argument of a program (pointer to ctx) is correctly accessed: to be able to ensure we access valid memory in the ctx, the verifier assumes the pointer to context is not null. This has the side effect of marking the program as accessing the entire context, even if the context is never dereferenced. For example, by checking the context access with the current code, the following eBPF program would fail with -EINVAL if the ctx is set to null from userspace:
```
SEC("syscall")
int prog(struct my_ctx *args)
{
	return 0;
}
```
In that particular case, we do not want to actually check that the memory is correct while checking the BTF validity, but we just want to ensure that the (sub)prog definition matches the BTF we have. So split btf_check_subprog_arg_match() in two so we can actually check for the memory used when in a call, and ignore that part when not. Note that a further patch is in preparation to disentangle btf_check_func_arg_match() from these two purposes, and so right now we just add a new hack around that by adding a boolean to this function.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220906151303.2780789-3-benjamin.tissoires@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-07 | selftests/bpf: regroup and declare similar kfuncs selftests in an array | Benjamin Tissoires | 2 | -18/+68
Similar to tools/testing/selftests/bpf/prog_tests/dynptr.c: we declare an array of tests that we run one by one in a for loop. Followup patches will add more similar-ish tests, so avoid a lot of copy paste by grouping the declaration in an array. For light skeletons, we have to rely on the offsetof() macro so we can statically declare which program we are using. In the libbpf case, we can rely on bpf_object__find_program_by_name(). So also change the Makefile to generate both light skeletons and normal ones. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220906151303.2780789-2-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
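A condensed sketch of the pattern (the struct, macro, and program names below are illustrative paraphrases of the description, not copied from the selftest; the lskel type is assumed to come from the generated light skeleton):
```c
#include <stddef.h>

struct kfunc_test_params {
	const char *prog_name;                 /* libbpf: bpf_object__find_program_by_name() */
	unsigned long lskel_prog_desc_offset;  /* light skeleton: offsetof() into progs      */
	int retval;
};

#define TC_TEST(name, __retval) \
	{ #name, offsetof(struct kfunc_call_test_lskel, progs.name), __retval }

static struct kfunc_test_params kfunc_tests[] = {
	TC_TEST(kfunc_call_test1, 12),
	TC_TEST(kfunc_call_test2, 3),
};

/* The test runner then loops over kfunc_tests[] and runs one subtest per
 * entry, once against the light skeleton and once via libbpf. */
```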
2022-09-06 | Merge branch 'bpf: Support struct argument for trampoline base progs' | Alexei Starovoitov | 12 | -34/+424
Yonghong Song says:
====================
Currently struct arguments are not supported for trampoline based progs. One of the major reasons is that a struct argument may be passed by value, which may use more than one register. This breaks trampoline progs where each argument is assumed to take one register. The bcc community reported the issue ([1]) where a struct argument is not supported for fentry programs:
  typedef struct { uid_t val; } kuid_t;
  typedef struct { gid_t val; } kgid_t;
  int security_path_chown(struct path *path, kuid_t uid, kgid_t gid);
Inside Meta, we also have a use case to attach to tcp_setsockopt():
  typedef struct {
          union {
                  void *kernel;
                  void __user *user;
          };
          bool is_kernel : 1;
  } sockptr_t;
  int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval, unsigned int optlen);
This patch set adds struct value support for bpf tracing programs which use a trampoline. Only struct sizes <= 16 bytes are supported for now, which covers the use cases above. For x86/arm64/bpf, a <= 16 byte struct value will be passed in registers instead of by reference. Only x86_64 is supported in this patch set; arm64 support can be added later.
[1] https://github.com/iovisor/bcc/issues/3657
Changelog:
v3 -> v4:
- fix a test failure where there was no casting for the bpf_get_func_arg() value as the value type is 'int'.
- add tracing_struct test in DENYLIST.s390x
- simplify macro BPF_REG_CNT for BPF_PROG2.
v2 -> v3:
- previously struct arguments (<= 16 bytes) were passed by reference for bpf programs. Suggested by Alexei, they are passed by value now.
- in order to support passing <= 16 byte struct values, a new macro BPF_PROG2 is invented.
rfc v1 -> v2:
- changed bpf_func_model struct info fields to arg_flags[] to make it easy to iterate arguments in arch specific {save|restore}_regs() functions.
- added fexit tests to test return values with struct arguments.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-06 | selftests/bpf: Add tracing_struct test in DENYLIST.s390x | Yonghong Song | 1 | -0/+1
Add tracing_struct test in DENYLIST.s390x since s390x does not support trampoline now. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220831152723.2081551-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-06 | selftests/bpf: Use BPF_PROG2 for some fentry programs without struct arguments | Yonghong Song | 1 | -2/+2
Use BPF_PROG2 instead of BPF_PROG for programs in progs/timer.c to test BPF_PROG2 for cases without struct arguments. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220831152718.2081091-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-06 | selftests/bpf: Add struct argument tests with fentry/fexit programs. | Yonghong Song | 3 | -0/+231
Add various struct argument tests with fentry/fexit programs. Also add one test with a kernel func which does not have any argument to test BPF_PROG2 macro in such situation. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220831152713.2080039-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-06 | libbpf: Add new BPF_PROG2 macro | Yonghong Song | 1 | -0/+79
To support struct arguments in trampoline based programs, the existing BPF_PROG macro doesn't work any more, since the type size is needed to find out whether a parameter takes one or two registers. So this patch adds a new BPF_PROG2 macro to support such trampoline programs. The idea is suggested by Andrii. For example, if the to-be-traced function has a signature like
  typedef struct {
          void *x;
          int t;
  } sockptr;
  int blah(sockptr x, char y);
then in the new BPF_PROG2 macro, the arguments can be represented as
  __bpf_prog_call(
          ({ union {
                  struct { __u64 x, y; } ___z;
                  sockptr x;
             } ___tmp = { .___z = { ctx[0], ctx[1] }};
             ___tmp.x;
          }),
          ({ union {
                  struct { __u8 x; } ___z;
                  char y;
             } ___tmp = { .___z = { ctx[2] }};
             ___tmp.y;
          }));
In the above, the values stored on the stack are properly assigned to the actual argument type values by using 'union' magic. Note that the macro also works even if no arguments have struct types. Note that the new BPF_PROG2 works for both llvm16 and pre-llvm16 compilers: llvm16 supports the bpf target passing struct values up to 16 bytes in size, while pre-llvm16 will pass by reference by storing values on the stack. With static functions with struct arguments always inlined, the compiler is able to optimize away the additional stack saving of struct values.
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220831152707.2079473-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
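Based on that description, a trampoline program for the blah() signature above might look like the sketch below (it assumes the usual <bpf/bpf_helpers.h> and <bpf/bpf_tracing.h> includes and is not code from the patch); note that BPF_PROG2 takes type/name pairs, unlike BPF_PROG which takes only names:
```c
typedef struct {
	void *x;
	int t;
} sockptr;

SEC("fentry/blah")
int BPF_PROG2(on_blah, sockptr, x, char, y)
{
	/* x is the struct value reassembled from ctx[0]/ctx[1], y from ctx[2] */
	bpf_printk("t=%d y=%d", x.t, y);
	return 0;
}
```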
2022-09-06 | bpf: arm64: No support of struct argument in trampoline programs | Yonghong Song | 1 | -1/+7
ARM64 does not support struct argument for trampoline based bpf programs yet. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220831152702.2079066-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-06 | bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]() | Yonghong Song | 2 | -8/+10
Now, instead of the number of arguments, the number of registers holding argument values is stored in the trampoline. Update the descriptions of the bpf_get_func_arg[_cnt]() helpers accordingly. Previous programs without struct arguments should continue to work as usual.
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220831152657.2078805-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-06 | bpf: x86: Support in-register struct arguments in trampoline programs | Yonghong Song | 1 | -17/+51
In C, a struct value can be passed as a function argument. For small structs, the struct value may be passed in one or more registers. For trampoline based bpf programs, this causes complications, since the one-to-one mapping between function arguments and arch argument registers is not valid any more. The latest llvm16 added bpf support to pass struct values up to 16 bytes by value ([1]). This is also true for the x86_64 architecture, where two registers will hold the struct value if the struct size is >8 and <= 16. This may not be true if one of the struct members is of type 'double', but the current linux source code does not have such an instance yet, so we assume all >8 && <= 16 byte structs occupy two general purpose argument registers. Also change the on-stack nr_args value to the number of registers holding the arguments. This will permit the bpf_get_func_arg() helper to get all argument values.
[1] https://reviews.llvm.org/D132144
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220831152652.2078600-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
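A size-based illustration of the rule described above, using the tcp_setsockopt() example from the cover letter (the register counts follow the x86_64 convention as summarized in the commit message):
```c
struct one_reg  { void *p; };          /* sizeof == 8  -> one argument register  */
struct two_regs { void *p; int t; };   /* sizeof == 16 -> two argument registers */

/* For:
 *   int tcp_setsockopt(struct sock *sk, int level, int optname,
 *                      sockptr_t optval, unsigned int optlen);
 * sockptr_t is 16 bytes and occupies two registers, so the trampoline stores
 * nr_args = 6 (registers), not 5 (arguments). */
```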
2022-09-06 | bpf: Allow struct argument in trampoline based programs | Yonghong Song | 2 | -6/+43
Allow struct argument in trampoline based programs where the struct size should be <= 16 bytes. In such cases, the argument will be put into up to 2 registers for bpf, x86_64 and arm64 architectures. To support arch-specific trampoline manipulation, add arg_flags for additional struct information about arguments in btf_func_model. Such information will be used in arch specific function arch_prepare_bpf_trampoline() to prepare argument access properly in trampoline. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220831152646.2078089-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-06 | bpf: Replace __ksize with ksize. | Alexei Starovoitov | 1 | -1/+1
__ksize() was made private. Use ksize() instead. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-06 | Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | Paolo Abeni | 159 | -1358/+5225
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-09-05
The following pull-request contains BPF updates for your *net-next* tree. We've added 106 non-merge commits during the last 18 day(s) which contain a total of 159 files changed, 5225 insertions(+), 1358 deletions(-).
There are two small merge conflicts, resolve them as follows:
1) tools/testing/selftests/bpf/DENYLIST.s390x
  Commit 27e23836ce22 ("selftests/bpf: Add lru_bug to s390x deny list") in bpf tree was needed to get BPF CI green on s390x, but it conflicted with newly added tests on bpf-next. Resolve by adding both hunks, result:
  [...]
  lru_bug                   # prog 'printk': failed to auto-attach: -524
  setget_sockopt            # attach unexpected error: -524 (trampoline)
  cb_refs                   # expected error message unexpected error: -524 (trampoline)
  cgroup_hierarchical_stats # JIT does not support calling kernel function (kfunc)
  htab_update               # failed to attach: ERROR: strerror_r(-524)=22 (trampoline)
  [...]
2) net/core/filter.c
  Commit 1227c1771dd2 ("net: Fix data-races around sysctl_[rw]mem_(max|default).") from net tree conflicts with commit 29003875bd5b ("bpf: Change bpf_setsockopt(SOL_SOCKET) to reuse sk_setsockopt()") from bpf-next tree. Take the code as it is from bpf-next tree, result:
  [...]
  if (getopt) {
          if (optname == SO_BINDTODEVICE)
                  return -EINVAL;
          return sk_getsockopt(sk, SOL_SOCKET, optname,
                               KERNEL_SOCKPTR(optval),
                               KERNEL_SOCKPTR(optlen));
  }
  return sk_setsockopt(sk, SOL_SOCKET, optname,
                       KERNEL_SOCKPTR(optval), *optlen);
  [...]
The main changes are:
1) Add any-context BPF specific memory allocator which is useful in particular for BPF tracing with bonus of performance equal to full prealloc, from Alexei Starovoitov.
2) Big batch to remove duplicated code from bpf_{get,set}sockopt() helpers as an effort to reuse the existing core socket code as much as possible, from Martin KaFai Lau.
3) Extend BPF flow dissector for BPF programs to just augment the in-kernel dissector with custom logic. In other words, allow for partial replacement, from Shmulik Ladkani.
4) Add a new cgroup iterator to BPF with different traversal options, from Hao Luo.
5) Support for BPF to collect hierarchical cgroup statistics efficiently through BPF integration with the rstat framework, from Yosry Ahmed.
6) Support bpf_{g,s}et_retval() under more BPF cgroup hooks, from Stanislav Fomichev.
7) BPF hash table and local storages fixes under fully preemptible kernel, from Hou Tao.
8) Add various improvements to BPF selftests and libbpf for compilation with gcc BPF backend, from James Hilliard.
9) Fix verifier helper permissions and reference state management for synchronous callbacks, from Kumar Kartikeya Dwivedi.
10) Add support for BPF selftest's xskxceiver to also be used against real devices that support MAC loopback, from Maciej Fijalkowski.
11) Various fixes to the bpf-helpers(7) man page generation script, from Quentin Monnet.
12) Document BPF verifier's tnum_in(tnum_range(), ...) gotchas, from Shung-Hsi Yu.
13) Various minor misc improvements all over the place.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (106 commits)
  bpf: Optimize rcu_barrier usage between hash map and bpf_mem_alloc.
  bpf: Remove usage of kmem_cache from bpf_mem_cache.
  bpf: Remove prealloc-only restriction for sleepable bpf programs.
  bpf: Prepare bpf_mem_alloc to be used by sleepable bpf programs.
  bpf: Remove tracing program restriction on map types
  bpf: Convert percpu hash map to per-cpu bpf_mem_alloc.
  bpf: Add percpu allocation support to bpf_mem_alloc.
  bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU.
  bpf: Adjust low/high watermarks in bpf_mem_cache
  bpf: Optimize call_rcu in non-preallocated hash map.
  bpf: Optimize element count in non-preallocated hash map.
  bpf: Relax the requirement to use preallocated hash maps in tracing progs.
  samples/bpf: Reduce syscall overhead in map_perf_test.
  selftests/bpf: Improve test coverage of test_maps
  bpf: Convert hash map to bpf_mem_alloc.
  bpf: Introduce any context BPF specific memory allocator.
  selftest/bpf: Add test for bpf_getsockopt()
  bpf: Change bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt()
  bpf: Change bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt()
  bpf: Change bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt()
  ...
====================
Link: https://lore.kernel.org/r/20220905161136.9150-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-06 | net: moxa: fix endianness-related issues from 'sparse' | Sergei Antonov | 1 | -2/+2
Sparse checker found two endianness-related issues:
  .../moxart_ether.c:34:15: warning: incorrect type in assignment (different base types)
  .../moxart_ether.c:34:15:    expected unsigned int [usertype]
  .../moxart_ether.c:34:15:    got restricted __le32 [usertype]
  .../moxart_ether.c:39:16: warning: cast to restricted __le32
Fix them by using __le32 type instead of u32.
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Link: https://lore.kernel.org/r/20220902125037.1480268-1-saproj@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-06 | net: ftmac100: fix endianness-related issues from 'sparse' | Sergei Antonov | 1 | -6/+6
Sparse found a number of endianness-related issues of these kinds:
  .../ftmac100.c:192:32: warning: restricted __le32 degrades to integer
  .../ftmac100.c:208:23: warning: incorrect type in assignment (different base types)
  .../ftmac100.c:208:23:    expected unsigned int rxdes0
  .../ftmac100.c:208:23:    got restricted __le32 [usertype]
  .../ftmac100.c:249:23: warning: invalid assignment: &=
  .../ftmac100.c:249:23:    left side has type unsigned int
  .../ftmac100.c:249:23:    right side has type restricted __le32
  .../ftmac100.c:527:16: warning: cast to restricted __le32
Change type of some fields from 'unsigned int' to '__le32' to fix it.
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220902113749.1408562-1-saproj@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-06 | net: lan966x: Extend lan966x with RGMII support | Horatiu Vultur | 2 | -0/+4
Extend lan966x with RGMII support. The MAC supports all RGMII_* modes. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20220902111548.614525-1-horatiu.vultur@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-05 | r8169: remove not needed net_ratelimit() check | Heiner Kallweit | 1 | -2/+1
We're not in a hot path and don't want to miss this message, therefore remove the net_ratelimit() check. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-05 | netlink: Bounds-check struct nlmsgerr creation | Kees Cook | 2 | -6/+10
In preparation for FORTIFY_SOURCE doing bounds-check on memcpy(), switch from __nlmsg_put to nlmsg_put(), and explain the bounds check for dealing with the memcpy() across a composite flexible array struct. Avoids this future run-time warning:
  memcpy: detected field-spanning write (size 32) of single field "&errmsg->msg" at net/netlink/af_netlink.c:2447 (size 16)
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: syzbot <syzkaller@googlegroups.com>
Cc: netfilter-devel@vger.kernel.org
Cc: coreteam@netfilter.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220901071336.1418572-1-keescook@chromium.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-05 | Merge branch 'bpf-allocator' | Daniel Borkmann | 10 | -134/+820
Alexei Starovoitov says:
====================
Introduce any context BPF specific memory allocator. Tracing BPF programs can attach to kprobe and fentry. Hence they run in unknown context where calling plain kmalloc() might not be safe. Front-end kmalloc() with a per-cpu cache of free elements. Refill this cache asynchronously from irq_work.
Major achievements enabled by bpf_mem_alloc:
- Dynamically allocated hash maps used to be 10 times slower than fully preallocated. With bpf_mem_alloc and subsequent optimizations the speed of dynamic maps is equal to full prealloc.
- Tracing bpf programs can use dynamically allocated hash maps, potentially saving lots of memory. A typical hash map is sparsely populated.
- Sleepable bpf programs can use dynamically allocated hash maps.
Future work:
- Expose bpf_mem_alloc as uapi FD to be used in dynptr_alloc, kptr_alloc
- Convert lru map to bpf_mem_alloc
- Further cleanup htab code. Example: htab_use_raw_lock can be removed.
Changelog:
v5->v6:
- Debugged the reason for selftests/bpf/test_maps ooming in a small VM that BPF CI is using. Added patch 16 that optimizes the usage of rcu_barrier-s between bpf_mem_alloc and hash map. It drastically improved the speed of htab destruction.
v4->v5:
- Fixed missing migrate_disable in hash tab free path (Daniel)
- Replaced impossible "memory leak" with WARN_ON_ONCE (Martin)
- Dropped sysctl kernel.bpf_force_dyn_alloc patch (Daniel)
- Added Andrii's ack
- Added new patch 15 that removes kmem_cache usage from bpf_mem_alloc. It saves memory and speeds up map create/destroy operations while maintaining hash map update/delete performance.
v3->v4:
- fix build issue due to missing local.h on 32-bit arch
- add Kumar's ack
- proposal for next steps from Delyan: https://lore.kernel.org/bpf/d3f76b27f4e55ec9e400ae8dcaecbb702a4932e8.camel@fb.com/
v2->v3:
- Rewrote the free_list algorithm based on discussions with Kumar. Patch 1.
- Allowed sleepable bpf progs to use dynamically allocated maps. Patches 13 and 14.
- Added sysctl to force bpf_mem_alloc in hash map even if pre-alloc is requested, to reduce memory consumption. Patch 15.
- Fix: zero-fill percpu allocation
- Single rcu_barrier at the end instead of each cpu during bpf_mem_alloc destruction
v2 thread: https://lore.kernel.org/bpf/20220817210419.95560-1-alexei.starovoitov@gmail.com/
v1->v2:
- Moved unsafe direct call_rcu() from hash map into a safe place inside bpf_mem_alloc. Patches 7 and 9.
- Optimized atomic_inc/dec in hash map with percpu_counter. Patch 6.
- Tuned watermarks per allocation size. Patch 8
- Adopted this approach to per-cpu allocation. Patch 10.
- Fully converted hash map to bpf_mem_alloc. Patch 11.
- Removed tracing prog restriction on map types. Combination of all patches and final patch 12.
v1 thread: https://lore.kernel.org/bpf/20220623003230.37497-1-alexei.starovoitov@gmail.com/
LWN article: https://lwn.net/Articles/899274/
====================
Link: https://lore.kernel.org/r/
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2022-09-05 | bpf: Optimize rcu_barrier usage between hash map and bpf_mem_alloc. | Alexei Starovoitov | 3 | -19/+69
User space might be creating and destroying a lot of hash maps. Synchronous rcu_barrier-s in the destruction path of a hash map delay the freeing of hash buckets and other map memory and may cause an artificial OOM situation under stress. Optimize rcu_barrier usage between bpf hash map and bpf_mem_alloc:
- remove rcu_barrier from hash map, since htab doesn't use call_rcu directly and there are no callbacks to wait for.
- bpf_mem_alloc has a call_rcu_in_progress flag that indicates pending callbacks. Use it to avoid barriers in the fast path.
- When barriers are needed, copy bpf_mem_alloc into a temp structure and wait for rcu barrier-s in the worker to let the rest of hash map freeing proceed.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220902211058.60789-17-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Remove usage of kmem_cache from bpf_mem_cache. | Alexei Starovoitov | 1 | -36/+14
For bpf_mem_cache based hash maps the following stress test:
  for (i = 1; i <= 512; i <<= 1)
          for (j = 1; j <= 1 << 18; j <<= 1)
                  fd = bpf_map_create(BPF_MAP_TYPE_HASH, NULL, i, j, 2, 0);
creates many kmem_cache-s that are not mergeable in debug kernels and consume an unnecessary amount of memory. It turned out that bpf_mem_cache's free_list logic does batching well, so the usage of kmem_cache for fixed-size allocations doesn't bring any performance benefits vs normal kmalloc. Hence get rid of kmem_cache in bpf_mem_cache. That saves memory and speeds up map create/destroy operations, while maintaining hash map update/delete performance.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220902211058.60789-16-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Remove prealloc-only restriction for sleepable bpf programs. | Alexei Starovoitov | 1 | -23/+0
Since hash map is now converted to bpf_mem_alloc and it's waiting for rcu and rcu_tasks_trace GPs before freeing elements into global memory slabs it's safe to use dynamically allocated hash maps in sleepable bpf programs. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-15-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Prepare bpf_mem_alloc to be used by sleepable bpf programs. | Alexei Starovoitov | 1 | -1/+14
Use call_rcu_tasks_trace() to wait for sleepable progs to finish. Then use call_rcu() to wait for normal progs to finish and finally do free_one() on each element when freeing objects into global memory pool. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-14-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Remove tracing program restriction on map types | Alexei Starovoitov | 1 | -42/+0
The hash map is now fully converted to bpf_mem_alloc. Its implementation is not allocating synchronously and not calling call_rcu() directly. It's now safe to use non-preallocated hash maps in all types of tracing programs including BPF_PROG_TYPE_PERF_EVENT that runs out of NMI context. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-13-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Convert percpu hash map to per-cpu bpf_mem_alloc. | Alexei Starovoitov | 1 | -26/+19
Convert dynamic allocations in the percpu hash map from alloc_percpu() to bpf_mem_cache_alloc() from a per-cpu bpf_mem_alloc. Since bpf_mem_alloc frees objects after an RCU grace period, the call_rcu() is removed. pcpu_init_value() now needs to zero-fill per-cpu allocations, since dynamically allocated map elements are now similar to full prealloc: alloc_percpu() is not called inline and the elements are reused in the freelist.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220902211058.60789-12-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Add percpu allocation support to bpf_mem_alloc. | Alexei Starovoitov | 3 | -7/+41
Extend bpf_mem_alloc to cache free list of fixed size per-cpu allocations. Once such cache is created bpf_mem_cache_alloc() will return per-cpu objects. bpf_mem_cache_free() will free them back into global per-cpu pool after observing RCU grace period. per-cpu flavor of bpf_mem_alloc is going to be used by per-cpu hash maps. The free list cache consists of tuples { llist_node, per-cpu pointer } Unlike alloc_percpu() that returns per-cpu pointer the bpf_mem_cache_alloc() returns a pointer to per-cpu pointer and bpf_mem_cache_free() expects to receive it back. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-11-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU. | Alexei Starovoitov | 2 | -4/+66
SLAB_TYPESAFE_BY_RCU makes kmem_caches non mergeable and slows down kmem_cache_destroy. All bpf_mem_cache are safe to share across different maps and programs. Convert SLAB_TYPESAFE_BY_RCU to batched call_rcu. This change solves the memory consumption issue, avoids kmem_cache_destroy latency and keeps bpf hash map performance the same. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-10-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Adjust low/high watermarks in bpf_mem_cache | Alexei Starovoitov | 1 | -14/+36
The same low/high watermarks for every bucket in bpf_mem_cache consume significant amount of memory. Preallocating 64 elements of 4096 bytes each in the free list is not efficient. Make low/high watermarks and batching value dependent on element size. This change brings significant memory savings. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-9-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Optimize call_rcu in non-preallocated hash map. | Alexei Starovoitov | 3 | -14/+7
Doing call_rcu() a million times a second becomes a bottleneck. Convert the non-preallocated hash map from call_rcu to SLAB_TYPESAFE_BY_RCU. The rcu critical section is no longer observed for one htab element, which makes the non-preallocated hash map behave just like the preallocated hash map. The map elements are released back to kernel memory after observing an rcu critical section. This improves 'map_perf_test 4' performance from 100k events per second to 250k events per second. bpf_mem_alloc + percpu_counter + typesafe_by_rcu provide a 10x performance boost to the non-preallocated hash map and make it within a few % of the preallocated map while consuming a fraction of the memory.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220902211058.60789-8-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Optimize element count in non-preallocated hash map. | Alexei Starovoitov | 1 | -8/+62
The atomic_inc/dec might cause extreme cache line bouncing when multiple cpus access the same bpf map. Based on the specified max_entries for the hash map, calculate when percpu_counter becomes faster than atomic_t and use it for such maps. For example, samples/bpf/map_perf_test is using a hash map with max_entries 1000. On a system with 16 cpus, 'map_perf_test 4' shows 14k events per second using atomic_t. On a system with 15 cpus it shows 100k events per second using percpu. map_perf_test is an extreme case where all cpus collide on atomic_t, which causes extreme cache bouncing. Note that the slow path of percpu_counter is 5k events per second vs 14k for atomic, so the heuristic is necessary. See the comment in the code for why the heuristic is based on num_online_cpus().
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220902211058.60789-7-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Relax the requirement to use preallocated hash maps in tracing progs. | Alexei Starovoitov | 1 | -9/+22
Since the bpf hash map was converted to use bpf_mem_alloc, it is safe to use from tracing programs and in RT kernels. But the per-cpu hash map is still using dynamic allocation for per-cpu map values, hence keep the warning for this map type. In the future alloc_percpu_gfp can be front-end-ed with bpf_mem_cache and this restriction will be completely lifted. perf_event (NMI) bpf programs have to use preallocated hash maps, because free_htab_elem() is using call_rcu, which might crash if re-entered. Sleepable bpf programs have to use preallocated hash maps, because the lifetime of the map elements is not protected by rcu_read_lock/unlock. This restriction can be lifted in the future as well.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220902211058.60789-6-alexei.starovoitov@gmail.com
2022-09-05 | samples/bpf: Reduce syscall overhead in map_perf_test. | Alexei Starovoitov | 2 | -17/+29
Make map_perf_test for preallocated and non-preallocated hash map spend more time inside bpf program to focus performance analysis on the speed of update/lookup/delete operations performed by bpf program. It makes 'perf report' of bpf_mem_alloc look like:
  11.76%  map_perf_test  [k] _raw_spin_lock_irqsave
  11.26%  map_perf_test  [k] htab_map_update_elem
   9.70%  map_perf_test  [k] _raw_spin_lock
   9.47%  map_perf_test  [k] htab_map_delete_elem
   8.57%  map_perf_test  [k] memcpy_erms
   5.58%  map_perf_test  [k] alloc_htab_elem
   4.09%  map_perf_test  [k] __htab_map_lookup_elem
   3.44%  map_perf_test  [k] syscall_exit_to_user_mode
   3.13%  map_perf_test  [k] lookup_nulls_elem_raw
   3.05%  map_perf_test  [k] migrate_enable
   3.04%  map_perf_test  [k] memcmp
   2.67%  map_perf_test  [k] unit_free
   2.39%  map_perf_test  [k] lookup_elem_raw
Reduce default iteration count as well to make 'map_perf_test' quick enough even on debug kernels.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220902211058.60789-5-alexei.starovoitov@gmail.com
2022-09-05 | selftests/bpf: Improve test coverage of test_maps | Alexei Starovoitov | 1 | -14/+24
Make test_maps more stressful with more parallelism in update/delete/lookup/walk including different value sizes. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-4-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Convert hash map to bpf_mem_alloc. | Alexei Starovoitov | 1 | -5/+16
Convert bpf hash map to use bpf memory allocator. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220902211058.60789-3-alexei.starovoitov@gmail.com
2022-09-05 | bpf: Introduce any context BPF specific memory allocator. | Alexei Starovoitov | 3 | -1/+507
Tracing BPF programs can attach to kprobe and fentry. Hence they run in unknown context where calling plain kmalloc() might not be safe. Front-end kmalloc() with minimal per-cpu cache of free elements. Refill this cache asynchronously from irq_work. BPF programs always run with migration disabled. It's safe to allocate from cache of the current cpu with irqs disabled. Free-ing is always done into bucket of the current cpu as well. irq_work trims extra free elements from buckets with kfree and refills them with kmalloc, so global kmalloc logic takes care of freeing objects allocated by one cpu and freed on another.
struct bpf_mem_alloc supports two modes:
- When size != 0 create kmem_cache and bpf_mem_cache for each cpu. This is typical bpf hash map use case when all elements have equal size.
- When size == 0 allocate 11 bpf_mem_cache-s for each cpu, then rely on kmalloc/kfree. Max allocation size is 4096 in this case. This is bpf_dynptr and bpf_kptr use case.
bpf_mem_alloc/bpf_mem_free are bpf specific 'wrappers' of kmalloc/kfree. bpf_mem_cache_alloc/bpf_mem_cache_free are 'wrappers' of kmem_cache_alloc/kmem_cache_free. The allocators are NMI-safe from bpf programs only. They are not NMI-safe in general.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220902211058.60789-2-alexei.starovoitov@gmail.com
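A hedged sketch of how the two modes might be used from kernel code; the init/destroy helper names and their exact signatures are assumptions for illustration, while the alloc/free wrappers are the ones named in the text:
```c
#include <linux/bpf_mem_alloc.h>    /* assumed header location */

static int bpf_mem_alloc_example(void)
{
	struct bpf_mem_alloc ma;
	void *elem, *ptr;
	int err;

	/* Fixed-size mode (size != 0): e.g. a hash map where every element
	 * has the same size; kmem_cache-style wrappers. */
	err = bpf_mem_alloc_init(&ma, 128 /* , ... */);   /* signature assumed */
	if (err)
		return err;
	elem = bpf_mem_cache_alloc(&ma);
	if (elem)
		bpf_mem_cache_free(&ma, elem);
	bpf_mem_alloc_destroy(&ma);

	/* Any-size mode (size == 0): kmalloc/kfree-style wrappers, with the
	 * allocation size capped at 4096 bytes per the description. */
	err = bpf_mem_alloc_init(&ma, 0 /* , ... */);     /* signature assumed */
	if (err)
		return err;
	ptr = bpf_mem_alloc(&ma, 64);
	if (ptr)
		bpf_mem_free(&ma, ptr);
	bpf_mem_alloc_destroy(&ma);
	return 0;
}
```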
2022-09-05 | net: phy: Add 1000BASE-KX interface mode | Sean Anderson | 4 | -0/+12
Add the 1000BASE-KX interface mode. This is 1G backplane ethernet as described in clause 70. Clause 73 autonegotiation is mandatory, and only full duplex operation is supported. Although at the PMA level this interface mode is identical to 1000BASE-X, it uses a different form of in-band autonegotiation. This justifies a separate interface mode, since the interface mode (along with the MLO_AN_* autonegotiation mode) sets the type of autonegotiation which will be used on a link. This results in more than just electrical differences between the link modes. With regard to 1000BASE-X, 1000BASE-KX holds a similar position to SGMII: same signaling, but different autonegotiation. PCS drivers (which typically handle in-band autonegotiation) may only support 1000BASE-X, and not 1000BASE-KX. Similarly, the phy mode is used to configure serdes phys with phy_set_mode_ext(). Due to the different electrical standards (SFI or XFI vs Clause 70), they will likely want to use different configuration. Adding a phy interface mode for 1000BASE-KX helps simplify configuration in these areas.
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-05 | Merge branch 'dpaa-cleanups' | David S. Miller | 13 | -613/+507
Sean Anderson says:
====================
net: dpaa: Cleanups in preparation for phylink conversion (part 2)
This series contains several cleanup patches for dpaa/fman. While they are intended to prepare for a phylink conversion, they stand on their own. This series was originally submitted as part of [1].
[1] https://lore.kernel.org/netdev/20220715215954.1449214-1-sean.anderson@seco.com
Changes in v5:
- Reduce line length of tgec_config
- Reduce line length of qman_update_cgr_safe
- Rebase onto net-next/master
Changes in v4:
- weer -> were
- tricy -> tricky
- Use mac_dev for calling change_addr
- qman_cgr_create -> qman_create_cgr
Changes in v2:
- Fix prototype for dtsec_initialization
- Fix warning if sizeof(void *) != sizeof(resource_size_t)
- Specify type of mac_dev for exception_cb
- Add helper for sanity checking cgr ops
- Add CGR update function
- Adjust queue depth on rate change
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-05 | net: dpaa: Adjust queue depth on rate change | Sean Anderson | 5 | -6/+44
Instead of setting the queue depth once during probe, adjust it on the fly whenever we configure the link. This is a bit unusual, since usually the DPAA driver calls into the FMAN driver, but here we do the opposite. We need to add a netdev to struct mac_device for this, but it will soon live in the phylink config. I haven't tested this extensively, but it doesn't seem to break anything. We could possibly optimize this a bit by keeping track of the last rate, but for now we just update every time. 10GEC probably doesn't need to call into this at all, but I've added it for consistency.
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>