summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2022-02-18mctp: replace mctp_address_ok with more fine-grained helpersJeremy Kerr4-4/+14
Currently, we have mctp_address_ok(), which checks if an EID is in the "valid" range of 8-254 inclusive. However, 0 and 255 may also be valid addresses, depending on context. 0 is the NULL EID, which may be set when physical addressing is used. 255 is valid as a destination address for broadcasts. This change renames mctp_address_ok to mctp_address_unicast, and adds similar helpers for broadcast and null EIDs, which will be used in an upcoming commit. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18net: Add new protocol attribute to IP addressesJacques de Laval6-7/+41
This patch adds a new protocol attribute to IPv4 and IPv6 addresses. Inspiration was taken from the protocol attribute of routes. User space applications like iproute2 can set/get the protocol with the Netlink API. The attribute is stored as an 8-bit unsigned integer. The protocol attribute is set by kernel for these categories: - IPv4 and IPv6 loopback addresses - IPv6 addresses generated from router announcements - IPv6 link local addresses User space may pass custom protocols, not defined by the kernel. Grouping addresses on their origin is useful in scenarios where you want to distinguish between addresses based on who added them, e.g. kernel vs. user space. Tagging addresses with a string label is an existing feature that could be used as a solution. Unfortunately the max length of a label is 15 characters, and for compatibility reasons the label must be prefixed with the name of the device followed by a colon. Since device names also have a max length of 15 characters, only -1 characters is guaranteed to be available for any origin tag, which is not that much. A reference implementation of user space setting and getting protocols is available for iproute2: https://github.com/westermo/iproute2/commit/9a6ea18bd79f47f293e5edc7780f315ea42ff540 Signed-off-by: Jacques de Laval <Jacques.De.Laval@westermo.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220217150202.80802-1-Jacques.De.Laval@westermo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18Merge branch 'ionic-driver-updates'Jakub Kicinski6-19/+15
Shannon Nelson says: ==================== ionic: driver updates These are a couple of checkpatch cleanup patches, a bug fix, and something to alleviate memory pressure in tight places. ==================== Link: https://lore.kernel.org/r/20220217220252.52293-1-snelson@pensando.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18ionic: clean up comments and whitespaceShannon Nelson3-5/+3
Fix up some checkpatch complaints that have crept in: doubled words words, mispellled words, doubled lines. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18ionic: prefer strscpy over strlcpyShannon Nelson2-4/+4
Replace strlcpy with strscpy to clean up a checkpatch complaint. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18ionic: Use vzalloc for large per-queue related buffersBrett Creeley1-8/+6
Use vzalloc for per-queue info structs that don't need any DMA mapping to help relieve memory pressure found when used in our limited SOC environment. Signed-off-by: Brett Creeley <brett@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18ionic: catch transition back to RUNNING with fw_generation 0Shannon Nelson1-2/+2
In some graceful updates that get initially triggered by the RESET event, especially with older firmware, the fw_generation bits don't change but the fw_status is seen to go to 0 then back to 1. However, the driver didn't perform the restart, remained waiting for fw_generation to change, and got left in limbo. This is because the clearing of idev->fw_status_ready to 0 didn't happen correctly as it was buried in the transition trigger: since the transition down was triggered not here but in the RESET event handler, the clear to 0 didn't happen, so the transition back to 1 wasn't detected. Fix this particular case by bringing the setting of idev->fw_status_ready back out to where it was before. Fixes: 398d1e37f960 ("ionic: add FW_STOPPING state") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18net: avoid quadratic behavior in netdev_wait_allrefs_any()Eric Dumazet1-3/+1
If the list of devices has N elements, netdev_wait_allrefs_any() is called N times, and linkwatch_forget_dev() is called N*(N-1)/2 times. Fix this by calling linkwatch_forget_dev() only once per device. Fixes: faab39f63c1f ("net: allow out-of-order netdev unregistration") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220218065430.2613262-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18ipv6: annotate some data-races around sk->sk_protEric Dumazet2-8/+22
IPv6 has this hack changing sk->sk_prot when an IPv6 socket is 'converted' to an IPv4 one with IPV6_ADDRFORM option. This operation is only performed for TCP and UDP, knowing their 'struct proto' for the two network families are populated in the same way, and can not disappear while a reader might use and dereference sk->sk_prot. If we think about it all reads of sk->sk_prot while either socket lock or RTNL is not acquired should be using READ_ONCE(). Also note that other layers like MPTCP, XFRM, CHELSIO_TLS also write over sk->sk_prot. BUG: KCSAN: data-race in inet6_recvmsg / ipv6_setsockopt write to 0xffff8881386f7aa8 of 8 bytes by task 26932 on cpu 0: do_ipv6_setsockopt net/ipv6/ipv6_sockglue.c:492 [inline] ipv6_setsockopt+0x3758/0x3910 net/ipv6/ipv6_sockglue.c:1019 udpv6_setsockopt+0x85/0x90 net/ipv6/udp.c:1649 sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3489 __sys_setsockopt+0x209/0x2a0 net/socket.c:2180 __do_sys_setsockopt net/socket.c:2191 [inline] __se_sys_setsockopt net/socket.c:2188 [inline] __x64_sys_setsockopt+0x62/0x70 net/socket.c:2188 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff8881386f7aa8 of 8 bytes by task 26911 on cpu 1: inet6_recvmsg+0x7a/0x210 net/ipv6/af_inet6.c:659 ____sys_recvmsg+0x16c/0x320 ___sys_recvmsg net/socket.c:2674 [inline] do_recvmmsg+0x3f5/0xae0 net/socket.c:2768 __sys_recvmmsg net/socket.c:2847 [inline] __do_sys_recvmmsg net/socket.c:2870 [inline] __se_sys_recvmmsg net/socket.c:2863 [inline] __x64_sys_recvmmsg+0xde/0x160 net/socket.c:2863 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0xffffffff85e0e980 -> 0xffffffff85e01580 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 26911 Comm: syz-executor.3 Not tainted 5.17.0-rc2-syzkaller-00316-g0457e5153e0e-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net/ibmvnic: Cleanup workaround doing an EOI after partition migrationCédric Le Goater1-9/+26
There were a fair amount of changes to workaround a firmware bug leaving a pending interrupt after migration of the ibmvnic device : commit 2df5c60e198c ("net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode") commit 284f87d2f387 ("Revert "net/ibmvnic: Fix EOI when running in XIVE mode"") commit 11d49ce9f794 ("net/ibmvnic: Fix EOI when running in XIVE mode.") commit f23e0643cd0b ("ibmvnic: Clear pending interrupt after device reset") Here is the final one taking into account the XIVE interrupt mode. Cc: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Cc: Dany Madden <drt@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18teaming: deliver link-local packets with the link they arrive onjeffreyji1-0/+5
skb is ignored if team port is disabled. We want the skb to be delivered if it's an link layer packet. Issue is already fixed for bonding in commit b89f04c61efe ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on") changelog: v2: change LLDP -> link layer in comments/commit descrip, comment format Signed-off-by: jeffreyji <jeffreyji@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18Merge branch 'qca8k-phylink'David S. Miller4-322/+429
Russell King says: ==================== net: dsa: qca8k: convert to phylink_pcs and mark as non-legacy This series adds support into DSA for the mac_select_pcs method, and converts qca8k to make use of this, eventually marking qca8k as non- legacy. Patch 1 adds DSA support for mac_select_pcs. Patch 2 and patch 3 moves code around in qca8k to make patch 4 more readable. Patch 4 does a simple conversion to phylink_pcs. Patch 5 moves the serdes configuration to phylink_pcs. Patch 6 marks qca8k as non-legacy. v2: fix dsa_phylink_mac_select_pcs() formatting and double-blank line in patch 5 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: qca8k: mark as non-legacyRussell King (Oracle)1-0/+2
The qca8k driver does not make use of the speed, duplex, pause or advertisement in its phylink_mac_config() implementation, so it can be marked as a non-legacy driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: qca8k: move pcs configurationRussell King (Oracle)1-66/+84
Move the PCS configuration to qca8k_pcs_config(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: qca8k: convert to use phylink_pcsRussell King (Oracle)2-12/+81
Convert the qca8k driver to use the phylink_pcs support to talk to the SGMII PCS. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: qca8k: move qca8k_phylink_mac_link_state()Russell King (Oracle)1-42/+42
Move qca8k_phylink_mac_link_state() to separate the code movement from code changes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: qca8k: move qca8k_setup()Russell King (Oracle)1-214/+214
Move qca8k_setup() to be later in the file to avoid needing prototypes for called functions. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: add support for phylink mac_select_pcs()Russell King (Oracle)2-0/+18
Add DSA support for the phylink mac_select_pcs() method so DSA drivers can return provide phylink with the appropriate PCS for the PHY interface mode. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: ethernet: xilinx: cleanup commentsTom Rix5-6/+6
Remove the second 'the'. Replacements: endiannes to endianness areconnected to are connected Mamagement to Management undoccumented to undocumented Xilink to Xilinx strucutre to structure Change kernel-doc comment style to c style for /* Management ... Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Michal Simek <michal.simek@xilinx.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: gro: Fix a 'directive in macro's argument list' sparse warningGal Pressman1-2/+3
Following the cited commit, sparse started complaining about: ../include/net/gro.h:58:1: warning: directive in macro's argument list ../include/net/gro.h:59:1: warning: directive in macro's argument list Fix that by moving the defines out of the struct_group() macro. Fixes: de5a1f3ce4c8 ("net: gro: minor optimization for dev_gro_receive()") Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17s390/qeth: Remove redundant 'flush_workqueue()' callsXu Wang1-1/+0
'destroy_workqueue()' already drains the queue before destroying it, so there is no need to flush it explicitly. Remove the redundant 'flush_workqueue()' calls. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Link: https://lore.kernel.org/r/20220216075155.940-1-vulab@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: dsa: delete unused exported symbols for ethtool PHY statsVladimir Oltean2-60/+0
Introduced in commit cf963573039a ("net: dsa: Allow providing PHY statistics from CPU port"), it appears these were never used. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220216193726.2926320-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: add sanity check in proto_register()Eric Dumazet1-0/+4
prot->memory_allocated should only be set if prot->sysctl_mem is also set. This is a followup of commit 25206111512d ("crypto: af_alg - get rid of alg_memory_allocated"). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220216171801.3604366-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: ll_temac: Use GFP_KERNEL instead of GFP_ATOMIC when possibleChristophe JAILLET1-2/+3
XTE_MAX_JUMBO_FRAME_SIZE is over 9000 bytes and the default value for 'rx_bd_num' is RX_BD_NUM_DEFAULT (i.e. 1024) So this loop allocates more than 9 Mo of memory. Previous memory allocations in this function already use GFP_KERNEL, so use __netdev_alloc_skb_ip_align() and an explicit GFP_KERNEL instead of a implicit GFP_ATOMIC. This gives more opportunities of successful allocation. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/694abd65418b2b3974106a82d758e3474c65ae8f.1645042560.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: nixge: Use GFP_KERNEL instead of GFP_ATOMIC when possibleChristophe JAILLET1-2/+3
NIXGE_MAX_JUMBO_FRAME_SIZE is over 9000 bytes and RX_BD_NUM 128. So this loop allocates more than 1 Mo of memory. Previous memory allocations in this function already use GFP_KERNEL, so use __netdev_alloc_skb_ip_align() and an explicit GFP_KERNEL instead of a implicit GFP_ATOMIC. This gives more opportunities of successful allocation. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/28d2c8e05951ad02a57eb48333672947c8bb4f81.1645043881.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17Merge branch 'mptcp-selftest-fine-tuning-and-cleanup'Jakub Kicinski3-66/+85
Mat Martineau says: ==================== mptcp: Selftest fine-tuning and cleanup Patch 1 adjusts the mptcp selftest timeout to account for slow machines running debug builds. Patch 2 simplifies one test function. Patches 3-6 do some cleanup, like deleting unused variables and avoiding extra work when only printing usage information. Patch 7 improves the checksum tests by utilizing existing checksum MIBs. ==================== Link: https://lore.kernel.org/r/20220218030311.367536-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17selftests: mptcp: add csum mib check for mptcp_connectGeliang Tang1-0/+19
This patch added the data checksum error mib counters check for the script mptcp_connect.sh when the data checksum is enabled. In do_transfer(), got the mib counters twice, before and after running the mptcp_connect commands. The latter minus the former is the actual number of the data checksum mib counter. The output looks like this: ns1 MPTCP -> ns2 (dead:beef:1::2:10007) MPTCP (duration 86ms) [ OK ] ns1 MPTCP -> ns2 (10.0.2.1:10008 ) MPTCP (duration 66ms) [ FAIL ] server got 1 data checksum error[s] Fixes: 94d66ba1d8e48 ("selftests: mptcp: enable checksum in mptcp_connect.sh") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/255 Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17selftests: mptcp: join: check for tools only if neededMatthieu Baerts1-18/+20
To allow showing the 'help' menu even if these tools are not available. While at it, also avoid launching the command then checking $?. Instead, the check is directly done in the 'if'. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17selftests: mptcp: join: create tmp files only if neededMatthieu Baerts1-13/+24
These tmp files will only be created when a test will be launched. This avoid 'dd' output when '-h' is used for example. While at it, also avoid creating netns that will be removed when starting the first test. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17selftests: mptcp: join: remove unused varsMatthieu Baerts1-4/+1
Shellcheck found that these variables were set but never used. Note that rndh is no longer prefixed with '0-' but it doesn't change anything. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17selftests: mptcp: join: exit after usage()Matthieu Baerts1-1/+12
With an error if it is an unknown option. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17selftests: mptcp: simplify pm_nl_change_endpointGeliang Tang1-29/+8
This patch simplified pm_nl_change_endpoint(), using id-based address lookups only. And dropped the fragile way of parsing 'addr' and 'id' from the output of pm_nl_show_endpoints(). Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17selftests: mptcp: increase timeout to 20 minutesMatthieu Baerts1-1/+1
With the increase number of tests, one CI instance, using a debug kernel config and not recent hardware, takes around 10 minutes to execute the slowest MPTCP test: mptcp_join.sh. Even if most CIs don't take that long to execute these tests -- typically max 10 minutes to run all selftests -- it will help some of them if the timeout is increased. The timeout could be disabled but it is always good to have an extra safeguard, just in case. Please note that on slow public CIs with kernel debug settings, it has been observed it can easily take up to 45 minutes to execute all tests in this very slow environment with other jobs running in parallel. The slowest test, mptcp_join.sh takes ~30 minutes in this case. In such environments, the selftests timeout set in the 'settings' file is disabled because this environment is known as being exceptionnally slow. It has been decided not to take such exceptional environments into account and set the timeout to 20min. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski34-524/+1502
Daniel Borkmann says: ==================== bpf-next 2022-02-17 We've added 29 non-merge commits during the last 8 day(s) which contain a total of 34 files changed, 1502 insertions(+), 524 deletions(-). The main changes are: 1) Add BTFGen support to bpftool which allows to use CO-RE in kernels without BTF info, from Mauricio Vásquez, Rafael David Tinoco, Lorenzo Fontana and Leonardo Di Donato. (Details: https://lpc.events/event/11/contributions/948/) 2) Prepare light skeleton to be used in both kernel module and user space and convert bpf_preload.ko to use light skeleton, from Alexei Starovoitov. 3) Rework bpftool's versioning scheme and align with libbpf's version number; also add linked libbpf version info to "bpftool version", from Quentin Monnet. 4) Add minimal C++ specific additions to bpftool's skeleton codegen to facilitate use of C skeletons in C++ applications, from Andrii Nakryiko. 5) Add BPF verifier sanity check whether relative offset on kfunc calls overflows desc->imm and reject the BPF program if the case, from Hou Tao. 6) Fix libbpf to use a dynamically allocated buffer for netlink messages to avoid receiving truncated messages on some archs, from Toke Høiland-Jørgensen. 7) Various follow-up fixes to the JIT bpf_prog_pack allocator, from Song Liu. 8) Various BPF selftest and vmtest.sh fixes, from Yucong Sun. 9) Fix bpftool pretty print handling on dumping map keys/values when no BTF is available, from Jiri Olsa and Yinjun Zhang. 10) Extend XDP frags selftest to check for invalid length, from Lorenzo Bianconi. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (29 commits) bpf: bpf_prog_pack: Set proper size before freeing ro_header selftests/bpf: Fix crash in core_reloc when bpftool btfgen fails selftests/bpf: Fix vmtest.sh to launch smp vm. libbpf: Fix memleak in libbpf_netlink_recv() bpftool: Fix C++ additions to skeleton bpftool: Fix pretty print dump for maps without BTF loaded selftests/bpf: Test "bpftool gen min_core_btf" bpftool: Gen min_core_btf explanation and examples bpftool: Implement btfgen_get_btf() bpftool: Implement "gen min_core_btf" logic bpftool: Add gen min_core_btf command libbpf: Expose bpf_core_{add,free}_cands() to bpftool libbpf: Split bpf_core_apply_relo() bpf: Reject kfunc calls that overflow insn->imm selftests/bpf: Add Skeleton templated wrapper as an example bpftool: Add C++-specific open/load/etc skeleton wrappers selftests/bpf: Fix GCC11 compiler warnings in -O2 mode bpftool: Fix the error when lookup in no-btf maps libbpf: Use dynamically allocated buffer when receiving netlink messages libbpf: Fix libbpf.map inheritance chain for LIBBPF_0.7.0 ... ==================== Link: https://lore.kernel.org/r/20220217232027.29831-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17bpf: bpf_prog_pack: Set proper size before freeing ro_headerSong Liu1-0/+1
bpf_prog_pack_free() uses header->size to decide whether the header should be freed with module_memfree() or the bpf_prog_pack logic. However, in kvmalloc() failure path of bpf_jit_binary_pack_alloc(), header->size is not set yet. As a result, bpf_prog_pack_free() may treat a slice of a pack as a standalone kvmalloc'd header and call module_memfree() on the whole pack. This in turn causes use-after-free by other users of the pack. Fix this by setting ro_header->size before freeing ro_header. Fixes: 33c9805860e5 ("bpf: Introduce bpf_jit_binary_pack_[alloc|finalize|free]") Reported-by: syzbot+2f649ec6d2eea1495a8f@syzkaller.appspotmail.com Reported-by: syzbot+ecb1e7e51c52f68f7481@syzkaller.appspotmail.com Reported-by: syzbot+87f65c75f4a72db05445@syzkaller.appspotmail.com Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220217183001.1876034-1-song@kernel.org
2022-02-17Merge branch 'prestera-route-offloading'David S. Miller7-7/+652
Yevhen Orlov says: ==================== net: marvell: prestera: add basic routes offloading Add support for blackhole and local routes for Marvell Prestera driver. Subscribe on fib notifications and handle add/del. Add features: - Support route adding. e.g.: "ip route add blackhole 7.7.1.1/24" e.g.: "ip route add local 9.9.9.9 dev sw1p30" - Support "rt_trap", "rt_offload", "rt_offload_failed" flags - Handle case, when route in "local" table overlaps route in "main" table example: ip ro add blackhole 7.7.7.7 ip ro add local 7.7.7.7 dev sw1p30 # blackhole route will be deoffloaded. rt_offload flag disappeared Limitations: - Only "blackhole" and "local" routes supported. "nexthop" routes is TRAP for now and will be implemented soon. - Only "local" and "main" tables supported ==================== Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17net: marvell: prestera: handle fib notificationsYevhen Orlov3-0/+427
For now we support only TRAP or DROP, so we can offload only "local" or "blackhole" routes. Nexthop routes is TRAP for now. Will be implemented soon. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17net: marvell: prestera: add hardware router objects accounting for lpmYevhen Orlov3-7/+170
Add new router_hw object "fib_node". For now it support only DROP and TRAP mode. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17net: marvell: prestera: Add router LPM ABIYevhen Orlov2-0/+55
Add functions to create/delete lpm entry in hw. prestera_hw_lpm_add() take index of allocated virtual router. Also it takes grp_id, which is index of allocated nexthop group. ABI to create nexthop group will be added soon. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski8-15/+119
Fast path bpf marge for some -next work. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski8-15/+119
Alexei Starovoitov says: ==================== pull-request: bpf 2022-02-17 We've added 8 non-merge commits during the last 7 day(s) which contain a total of 8 files changed, 119 insertions(+), 15 deletions(-). The main changes are: 1) Add schedule points in map batch ops, from Eric. 2) Fix bpf_msg_push_data with len 0, from Felix. 3) Fix crash due to incorrect copy_map_value, from Kumar. 4) Fix crash due to out of bounds access into reg2btf_ids, from Kumar. 5) Fix a bpf_timer initialization issue with clang, from Yonghong. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Add schedule points in batch ops bpf: Fix crash due to out of bounds access into reg2btf_ids. selftests: bpf: Check bpf_msg_push_data return value bpf: Fix a bpf_timer initialization issue bpf: Emit bpf_timer in vmlinux BTF selftests/bpf: Add test for bpf_timer overwriting crash bpf: Fix crash due to incorrect copy_map_value bpf: Do not try bpf_msg_push_data with len 0 ==================== Link: https://lore.kernel.org/r/20220217190000.37925-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski325-1801/+2304
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17Merge tag 'net-5.17-rc5' of ↵Linus Torvalds85-788/+520
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless and netfilter. Current release - regressions: - dsa: lantiq_gswip: fix use after free in gswip_remove() - smc: avoid overwriting the copies of clcsock callback functions Current release - new code bugs: - iwlwifi: - fix use-after-free when no FW is present - mei: fix the pskb_may_pull check in ipv4 - mei: retry mapping the shared area - mvm: don't feed the hardware RFKILL into iwlmei Previous releases - regressions: - ipv6: mcast: use rcu-safe version of ipv6_get_lladdr() - tipc: fix wrong publisher node address in link publications - iwlwifi: mvm: don't send SAR GEO command for 3160 devices, avoid FW assertion - bgmac: make idm and nicpm resource optional again - atl1c: fix tx timeout after link flap Previous releases - always broken: - vsock: remove vsock from connected table when connect is interrupted by a signal - ping: change destination interface checks to match raw sockets - crypto: af_alg - get rid of alg_memory_allocated to avoid confusing semantics (and null-deref) after SO_RESERVE_MEM was added - ipv6: make exclusive flowlabel checks per-netns - bonding: force carrier update when releasing slave - sched: limit TC_ACT_REPEAT loops - bridge: multicast: notify switchdev driver whenever MC processing gets disabled because of max entries reached - wifi: brcmfmac: fix crash in brcm_alt_fw_path when WLAN not found - iwlwifi: fix locking when "HW not ready" - phy: mediatek: remove PHY mode check on MT7531 - dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN - dsa: lan9303: - fix polarity of reset during probe - fix accelerated VLAN handling" * tag 'net-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits) bonding: force carrier update when releasing slave nfp: flower: netdev offload check for ip6gretap ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt ipv4: fix data races in fib_alias_hw_flags_set net: dsa: lan9303: add VLAN IDs to master device net: dsa: lan9303: handle hwaccel VLAN tags vsock: remove vsock from connected table when connect is interrupted by a signal Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname" ping: fix the dif and sdif check in ping_lookup net: usb: cdc_mbim: avoid altsetting toggling for Telit FN990 net: sched: limit TC_ACT_REPEAT loops tipc: fix wrong notification node addresses net: dsa: lantiq_gswip: fix use after free in gswip_remove() ipv6: per-netns exclusive flowlabel checks net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled CDC-NCM: avoid overflow in sanity checking mctp: fix use after free net: mscc: ocelot: fix use-after-free in ocelot_vlan_del() bonding: fix data-races around agg_select_timer dpaa2-eth: Initialize mutex used in one step timestamping path ...
2022-02-17selftests/bpf: Fix crash in core_reloc when bpftool btfgen failsYucong Sun1-2/+2
Avoid unnecessary goto cleanup, as there is nothing to clean up. Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220217180210.2981502-1-fallentree@fb.com
2022-02-17selftests/bpf: Fix vmtest.sh to launch smp vm.Yucong Sun1-1/+1
Fix typo in vmtest.sh to make sure it launch proper vm with 8 cpus. Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220217155212.2309672-1-fallentree@fb.com
2022-02-17bonding: force carrier update when releasing slaveZhang Changzhong1-3/+2
In __bond_release_one(), bond_set_carrier() is only called when bond device has no slave. Therefore, if we remove the up slave from a master with two slaves and keep the down slave, the master will remain up. Fix this by moving bond_set_carrier() out of if (!bond_has_slaves(bond)) statement. Reproducer: $ insmod bonding.ko mode=0 miimon=100 max_bonds=2 $ ifconfig bond0 up $ ifenslave bond0 eth0 eth1 $ ifconfig eth0 down $ ifenslave -d bond0 eth1 $ cat /proc/net/bonding/bond0 Fixes: ff59c4563a8d ("[PATCH] bonding: support carrier state for master") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Link: https://lore.kernel.org/r/1645021088-38370-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17bpf: Add schedule points in batch opsEric Dumazet1-0/+3
syzbot reported various soft lockups caused by bpf batch operations. INFO: task kworker/1:1:27 blocked for more than 140 seconds. INFO: task hung in rcu_barrier Nothing prevents batch ops to process huge amount of data, we need to add schedule points in them. Note that maybe_wait_bpf_programs(map) calls from generic_map_delete_batch() can be factorized by moving the call after the loop. This will be done later in -next tree once we get this fix merged, unless there is strong opinion doing this optimization sooner. Fixes: aa2e93b8e58e ("bpf: Add generic support for update and delete batch ops") Fixes: cb4d03ab499d ("bpf: Add generic support for lookup batch op") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: Brian Vazquez <brianvv@google.com> Link: https://lore.kernel.org/bpf/20220217181902.808742-1-eric.dumazet@gmail.com
2022-02-17fs/file_table: fix adding missing kmemleak_not_leak()Luis Chamberlain1-2/+6
Commit b42bc9a3c511 ("Fix regression due to "fs: move binfmt_misc sysctl to its own file") fixed a regression, however it failed to add a kmemleak_not_leak(). Fixes: b42bc9a3c511 ("Fix regression due to "fs: move binfmt_misc sysctl to its own file") Reported-by: Tong Zhang <ztong0001@gmail.com> Cc: Tong Zhang <ztong0001@gmail.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-17Merge tag 'perf-tools-fixes-for-v5.17-2022-02-17' of ↵Linus Torvalds16-32/+77
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix corrupt inject files when only last branch option is enabled with ARM CoreSight ETM - Fix use-after-free for realloc(..., 0) in libsubcmd, found by gcc 12 - Defer freeing string after possible strlen() on it in the BPF loader, found by gcc 12 - Avoid early exit in 'perf trace' due SIGCHLD from non-workload processes - Fix arm64 perf_event_attr 'perf test's wrt --call-graph initialization - Fix libperf 32-bit build for 'perf test' wrt uint64_t printf - Fix perf_cpu_map__for_each_cpu macro in libperf, providing access to the CPU iterator - Sync linux/perf_event.h UAPI with the kernel sources - Update Jiri Olsa's email address in MAINTAINERS * tag 'perf-tools-fixes-for-v5.17-2022-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf bpf: Defer freeing string after possible strlen() on it perf test: Fix arm64 perf_event_attr tests wrt --call-graph initialization libsubcmd: Fix use-after-free for realloc(..., 0) libperf: Fix perf_cpu_map__for_each_cpu macro perf cs-etm: Fix corrupt inject files when only last branch option is enabled perf cs-etm: No-op refactor of synth opt usage libperf: Fix 32-bit build for tests uint64_t printf tools headers UAPI: Sync linux/perf_event.h with the kernel sources perf trace: Avoid early exit due SIGCHLD from non-workload processes MAINTAINERS: Update Jiri's email address
2022-02-17Merge tag 'modules-5.17-rc5' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull module fix from Luis Chamberlain: "Fixes module decompression when CONFIG_SYSFS=n The only fix trickled down for v5.17-rc cycle so far is the fix for module decompression when CONFIG_SYSFS=n. This was reported through 0-day" * tag 'modules-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: module: fix building with sysfs disabled