summaryrefslogtreecommitdiffstats
path: root/net/netfilter
AgeCommit message (Collapse)AuthorFilesLines
2020-06-14treewide: replace '---help---' in Kconfig files with 'help'Masahiro Yamada2-56/+56
Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over '---help---'"), the number of '---help---' has been gradually decreasing, but there are still more than 2400 instances. This commit finishes the conversion. While I touched the lines, I also fixed the indentation. There are a variety of indentation styles found. a) 4 spaces + '---help---' b) 7 spaces + '---help---' c) 8 spaces + '---help---' d) 1 space + 1 tab + '---help---' e) 1 tab + '---help---' (correct indentation) f) 1 tab + 1 space + '---help---' g) 1 tab + 2 spaces + '---help---' In order to convert all of them to 1 tab + 'help', I ran the following commend: $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/' Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-10Merge branch 'rwonce/rework' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux Pull READ/WRITE_ONCE rework from Will Deacon: "This the READ_ONCE rework I've been working on for a while, which bumps the minimum GCC version and improves code-gen on arm64 when stack protector is enabled" [ Side note: I'm _really_ tempted to raise the minimum gcc version to 4.9, so that we can just say that we require _Generic() support. That would allow us to more cleanly handle a lot of the cases where we depend on very complex macros with 'sizeof' or __builtin_choose_expr() with __builtin_types_compatible_p() etc. This branch has a workaround for sparse not handling _Generic(), either, but that was already fixed in the sparse development branch, so it's really just gcc-4.9 that we'd require. - Linus ] * 'rwonce/rework' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux: compiler_types.h: Use unoptimized __unqual_scalar_typeof for sparse compiler_types.h: Optimize __unqual_scalar_typeof compilation time compiler.h: Enforce that READ_ONCE_NOCHECK() access size is sizeof(long) compiler-types.h: Include naked type in __pick_integer_type() match READ_ONCE: Fix comment describing 2x32-bit atomicity gcov: Remove old GCC 3.4 support arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros READ_ONCE: Drop pointer qualifiers when reading from scalar types READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE() arm64: csum: Disable KASAN for do_csum() fault_inject: Don't rely on "return value" from WRITE_ONCE() net: tls: Avoid assigning 'const' pointer to non-const pointer netfilter: Avoid assigning 'const' pointer to non-const pointer compiler/gcc: Raise minimum GCC version for kernel builds to 4.8
2020-06-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds14-304/+808
Pull networking updates from David Miller: 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz Augusto von Dentz. 2) Add GSO partial support to igc, from Sasha Neftin. 3) Several cleanups and improvements to r8169 from Heiner Kallweit. 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a device self-test. From Andrew Lunn. 5) Start moving away from custom driver versions, use the globally defined kernel version instead, from Leon Romanovsky. 6) Support GRO vis gro_cells in DSA layer, from Alexander Lobakin. 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet. 8) Add sriov and vf support to hinic, from Luo bin. 9) Support Media Redundancy Protocol (MRP) in the bridging code, from Horatiu Vultur. 10) Support netmap in the nft_nat code, from Pablo Neira Ayuso. 11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina Dubroca. Also add ipv6 support for espintcp. 12) Lots of ReST conversions of the networking documentation, from Mauro Carvalho Chehab. 13) Support configuration of ethtool rxnfc flows in bcmgenet driver, from Doug Berger. 14) Allow to dump cgroup id and filter by it in inet_diag code, from Dmitry Yakunin. 15) Add infrastructure to export netlink attribute policies to userspace, from Johannes Berg. 16) Several optimizations to sch_fq scheduler, from Eric Dumazet. 17) Fallback to the default qdisc if qdisc init fails because otherwise a packet scheduler init failure will make a device inoperative. From Jesper Dangaard Brouer. 18) Several RISCV bpf jit optimizations, from Luke Nelson. 19) Correct the return type of the ->ndo_start_xmit() method in several drivers, it's netdev_tx_t but many drivers were using 'int'. From Yunjian Wang. 20) Add an ethtool interface for PHY master/slave config, from Oleksij Rempel. 21) Add BPF iterators, from Yonghang Song. 22) Add cable test infrastructure, including ethool interfaces, from Andrew Lunn. Marvell PHY driver is the first to support this facility. 23) Remove zero-length arrays all over, from Gustavo A. R. Silva. 24) Calculate and maintain an explicit frame size in XDP, from Jesper Dangaard Brouer. 25) Add CAP_BPF, from Alexei Starovoitov. 26) Support terse dumps in the packet scheduler, from Vlad Buslov. 27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei. 28) Add devm_register_netdev(), from Bartosz Golaszewski. 29) Minimize qdisc resets, from Cong Wang. 30) Get rid of kernel_getsockopt and kernel_setsockopt in order to eliminate set_fs/get_fs calls. From Christoph Hellwig. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits) selftests: net: ip_defrag: ignore EPERM net_failover: fixed rollback in net_failover_open() Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv" Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv" vmxnet3: allow rx flow hash ops only when rss is enabled hinic: add set_channels ethtool_ops support selftests/bpf: Add a default $(CXX) value tools/bpf: Don't use $(COMPILE.c) bpf, selftests: Use bpf_probe_read_kernel s390/bpf: Use bcr 0,%0 as tail call nop filler s390/bpf: Maintain 8-byte stack alignment selftests/bpf: Fix verifier test selftests/bpf: Fix sample_cnt shared between two threads bpf, selftests: Adapt cls_redirect to call csum_level helper bpf: Add csum_level helper for fixing up csum levels bpf: Fix up bpf_skb_adjust_room helper's skb csum setting sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf() crypto/chtls: IPv6 support for inline TLS Crypto/chcr: Fixes a coccinile check error Crypto/chcr: Fixes compilations warnings ...
2020-06-02Merge tag 'audit-pr-20200601' of ↵Linus Torvalds1-9/+5
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "Summary of the significant patches: - Record information about binds/unbinds to the audit multicast socket. This helps identify which processes have/had access to the information in the audit stream. - Cleanup and add some additional information to the netfilter configuration events collected by audit. - Fix some of the audit error handling code so we don't leak network namespace references" * tag 'audit-pr-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: add subj creds to NETFILTER_CFG record to audit: Replace zero-length array with flexible-array audit: make symbol 'audit_nfcfgs' static netfilter: add audit table unregister actions audit: tidy and extend netfilter_cfg x_tables audit: log audit netlink multicast bind and unbind audit: fix a net reference leak in audit_list_rules_send() audit: fix a net reference leak in audit_send_reply()
2020-06-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller6-135/+650
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next to extend ctnetlink and the flowtable infrastructure: 1) Extend ctnetlink kernel side netlink dump filtering capabilities, from Romain Bellan. 2) Generalise the flowtable hook parser to take a hook list. 3) Pass a hook list to the flowtable hook registration/unregistration. 4) Add a helper function to release the flowtable hook list. 5) Update the flowtable event notifier to pass a flowtable hook list. 6) Allow users to add new devices to an existing flowtables. 7) Allow users to remove devices to an existing flowtables. 8) Allow for registering a flowtable with no initial devices. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01net: remove indirect block netdev event registrationPablo Neira Ayuso2-118/+1
Drivers do not register to netdev events to set up indirect blocks anymore. Remove __flow_indr_block_cb_register() and __flow_indr_block_cb_unregister(). The frontends set up the callbacks through flow_indr_dev_setup_block() Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01net: use flow_indr_dev_setup_offload()Pablo Neira Ayuso2-9/+38
Update existing frontends to use flow_indr_dev_setup_offload(). This new function must be called if ->ndo_setup_tc is unset to deal with tunnel devices. If there is no driver that is subscribed to new tunnel device flow_block bindings, then this function bails out with EOPNOTSUPP. If the driver module is removed, the ->cleanup() callback removes the entries that belong to this tunnel device. This cleanup procedures is triggered when the device unregisters the tunnel device offload handler. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup()Pablo Neira Ayuso1-3/+3
This function schedules the flow teardown state and it forces a gc run. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller4-36/+111
xdp_umem.c had overlapping changes between the 64-bit math fix for the calculation of npgs and the removal of the zerocopy memory type which got rid of the chunk_size_nohdr member. The mlx5 Kconfig conflict is a case where we just take the net-next copy of the Kconfig entry dependency as it takes on the ESWITCH dependency by one level of indirection which is what the 'net' conflicting change is trying to ensure. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27netfilter: nf_tables: skip flowtable hooknum and priority on device updatesPablo Neira Ayuso1-18/+35
On device updates, the hooknum and priority attributes are not required. This patch makes optional these two netlink attributes. Moreover, bail out with EOPNOTSUPP if userspace tries to update the hooknum and priority for existing flowtables. While at this, turn EINVAL into EOPNOTSUPP in case the hooknum is not ingress. EINVAL is reserved for missing netlink attribute / malformed netlink messages. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: nf_tables: allow to register flowtable with no devicesPablo Neira Ayuso1-9/+11
A flowtable might be composed of dynamic interfaces only. Such dynamic interfaces might show up at a later stage. This patch allows users to register a flowtable with no devices. Once the dynamic interface becomes available, the user adds the dynamic devices to the flowtable. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: nf_tables: delete devices from flowtablePablo Neira Ayuso1-16/+97
This patch allows users to delete devices from existing flowtables. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: nf_tables: add devices to existing flowtablePablo Neira Ayuso1-11/+86
This patch allows users to add devices to an existing flowtable. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: nf_tables: pass hook list to flowtable event notifierPablo Neira Ayuso1-5/+11
Update the flowtable netlink notifier to take the list of hooks as input. This allows to reuse this function in incremental flowtable hook updates. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: nf_tables: add nft_flowtable_hooks_destroy()Pablo Neira Ayuso1-4/+11
This patch adds a helper function destroy the flowtable hooks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: nf_tables: pass hook list to nft_{un,}register_flowtable_net_hooks()Pablo Neira Ayuso1-7/+10
This patch prepares for incremental flowtable hook updates. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: nf_tables: generalise flowtable hook parsingPablo Neira Ayuso1-11/+25
Update nft_flowtable_parse_hook() to take the flowtable hook list as parameter. This allows to reuse this function to update the hooks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: ctnetlink: add kernel side filtering for dumpRomain Bellan5-71/+381
Conntrack dump does not support kernel side filtering (only get exists, but it returns only one entry. And user has to give a full valid tuple) It means that userspace has to implement filtering after receiving many irrelevant entries, consuming resources (conntrack table is sometimes very huge, much more than a routing table for example). This patch adds filtering in kernel side. To achieve this goal, we: * Add a new CTA_FILTER netlink attributes, actually a flag list to parametize filtering * Convert some *nlattr_to_tuple() functions, to allow a partial parsing of CTA_TUPLE_ORIG and CTA_TUPLE_REPLY (so nf_conntrack_tuple it not fully set) Filtering is now possible on: * IP SRC/DST values * Ports for TCP and UDP flows * IMCP(v6) codes types and IDs Filtering is done as an "AND" operator. For example, when flags PROTO_SRC_PORT, PROTO_NUM and IP_SRC are sets, only entries matching all values are dumped. Changes since v1: Set NLM_F_DUMP_FILTERED in nlm flags if entries are filtered Changes since v2: Move several constants to nf_internals.h Move a fix on netlink values check in a separate patch Add a check on not-supported flags Return EOPNOTSUPP if CDA_FILTER is set in ctnetlink_flush_conntrack (not yet implemented) Code style issues Changes since v3: Fix compilation warning reported by kbuild test robot Changes since v4: Fix a regression introduced in v3 (returned EINVAL for valid netlink messages without CTA_MARK) Changes since v5: Change definition of CTA_FILTER_F_ALL Fix a regression when CTA_TUPLE_ZONE is not set Signed-off-by: Romain Bellan <romain.bellan@wifirst.fr> Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: nf_conntrack_pptp: fix compilation warning with W=1 buildPablo Neira Ayuso1-1/+1
>> include/linux/netfilter/nf_conntrack_pptp.h:13:20: warning: 'const' type qualifier on return type has no effect [-Wignored-qualifiers] extern const char *const pptp_msg_name(u_int16_t msg); ^~~~~~ Reported-by: kbuild test robot <lkp@intel.com> Fixes: 4c559f15efcc ("netfilter: nf_conntrack_pptp: prevent buffer overflows in debug code") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: conntrack: comparison of unsigned in cthelper confirmationPablo Neira Ayuso1-1/+1
net/netfilter/nf_conntrack_core.c: In function nf_confirm_cthelper: net/netfilter/nf_conntrack_core.c:2117:15: warning: comparison of unsigned expression in < 0 is always false [-Wtype-limits] 2117 | if (protoff < 0 || (frag_off & htons(~0x7)) != 0) | ^ ipv6_skip_exthdr() returns a signed integer. Reported-by: Colin Ian King <colin.king@canonical.com> Fixes: 703acd70f249 ("netfilter: nfnetlink_cthelper: unbreak userspace helper support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-27netfilter: conntrack: Pass value of ctinfo to __nf_conntrack_updateNathan Chancellor1-3/+3
Clang warns: net/netfilter/nf_conntrack_core.c:2068:21: warning: variable 'ctinfo' is uninitialized when used here [-Wuninitialized] nf_ct_set(skb, ct, ctinfo); ^~~~~~ net/netfilter/nf_conntrack_core.c:2024:2: note: variable 'ctinfo' is declared here enum ip_conntrack_info ctinfo; ^ 1 warning generated. nf_conntrack_update was split up into nf_conntrack_update and __nf_conntrack_update, where the assignment of ctinfo is in nf_conntrack_update but it is used in __nf_conntrack_update. Pass the value of ctinfo from nf_conntrack_update to __nf_conntrack_update so that uninitialized memory is not used and everything works properly. Fixes: ee04805ff54a ("netfilter: conntrack: make conntrack userspace helpers work again") Link: https://github.com/ClangBuiltLinux/linux/issues/1039 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-25netfilter: nfnetlink_cthelper: unbreak userspace helper supportPablo Neira Ayuso1-1/+2
Restore helper data size initialization and fix memcopy of the helper data size. Fixes: 157ffffeb5dc ("netfilter: nfnetlink_cthelper: reject too large userspace allocation requests") Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-25netfilter: conntrack: make conntrack userspace helpers work againPablo Neira Ayuso1-6/+72
Florian Westphal says: "Problem is that after the helper hook was merged back into the confirm one, the queueing itself occurs from the confirm hook, i.e. we queue from the last netfilter callback in the hook-list. Therefore, on return, the packet bypasses the confirm action and the connection is never committed to the main conntrack table. To fix this there are several ways: 1. revert the 'Fixes' commit and have a extra helper hook again. Works, but has the drawback of adding another indirect call for everyone. 2. Special case this: split the hooks only when userspace helper gets added, so queueing occurs at a lower priority again, and normal enqueue reinject would eventually call the last hook. 3. Extend the existing nf_queue ct update hook to allow a forced confirmation (plus run the seqadj code). This goes for 3)." Fixes: 827318feb69cb ("netfilter: conntrack: remove helper hook again") Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-25netfilter: nf_conntrack_pptp: prevent buffer overflows in debug codePablo Neira Ayuso1-27/+35
Dan Carpenter says: "Smatch complains that the value for "cmd" comes from the network and can't be trusted." Add pptp_msg_name() helper function that checks for the array boundary. Fixes: f09943fefe6b ("[NETFILTER]: nf_conntrack/nf_nat: add PPTP helper port") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-25netfilter: ipset: Fix subcounter update skipPhil Sutter1-1/+1
If IPSET_FLAG_SKIP_SUBCOUNTER_UPDATE is set, user requested to not update counters in sub sets. Therefore IPSET_FLAG_SKIP_COUNTER_UPDATE must be set, not unset. Fixes: 6e01781d1c80e ("netfilter: ipset: set match: add support to match the counters") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller4-8/+38
Move the bpf verifier trace check into the new switch statement in HEAD. Resolve the overlapping changes in hinic, where bug fixes overlap the addition of VF support. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12netfilter: nft_set_rbtree: Add missing expired checksPhil Sutter1-0/+11
Expired intervals would still match and be dumped to user space until garbage collection wiped them out. Make sure they stop matching and disappear (from users' perspective) as soon as they expire. Fixes: 8d8540c4f5e03 ("netfilter: nft_set_rbtree: add timeout support") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-12netfilter: flowtable: set NF_FLOW_TEARDOWN flag on entry expirationPablo Neira Ayuso1-3/+5
If the flow timer expires, the gc sets on the NF_FLOW_TEARDOWN flag. Otherwise, the flowtable software path might race to refresh the timeout, leaving the state machine in inconsistent state. Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support") Reported-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-11netfilter: conntrack: fix infinite loop on rmmodFlorian Westphal1-1/+12
'rmmod nf_conntrack' can hang forever, because the netns exit gets stuck in nf_conntrack_cleanup_net_list(): i_see_dead_people: busy = 0; list_for_each_entry(net, net_exit_list, exit_list) { nf_ct_iterate_cleanup(kill_all, net, 0, 0); if (atomic_read(&net->ct.count) != 0) busy = 1; } if (busy) { schedule(); goto i_see_dead_people; } When nf_ct_iterate_cleanup iterates the conntrack table, all nf_conn structures can be found twice: once for the original tuple and once for the conntracks reply tuple. get_next_corpse() only calls the iterator when the entry is in original direction -- the idea was to avoid unneeded invocations of the iterator callback. When support for clashing entries was added, the assumption that all nf_conn objects are added twice, once in original, once for reply tuple no longer holds -- NF_CLASH_BIT entries are only added in the non-clashing reply direction. Thus, if at least one NF_CLASH entry is in the list then nf_conntrack_cleanup_net_list() always skips it completely. During normal netns destruction, this causes a hang of several seconds, until the gc worker removes the entry (NF_CLASH entries always have a 1 second timeout). But in the rmmod case, the gc worker has already been stopped, so ct.count never becomes 0. We can fix this in two ways: 1. Add a second test for CLASH_BIT and call iterator for those entries as well, or: 2. Skip the original tuple direction and use the reply tuple. 2) is simpler, so do that. Fixes: 6a757c07e51f80ac ("netfilter: conntrack: allow insertion of clashing entries") Reported-by: Chen Yi <yiche@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-11netfilter: flowtable: Remove WQ_MEM_RECLAIM from workqueueRoi Dayan1-1/+1
This workqueue is in charge of handling offloaded flow tasks like add/del/stats we should not use WQ_MEM_RECLAIM flag. The flag can result in the following warning. [ 485.557189] ------------[ cut here ]------------ [ 485.562976] workqueue: WQ_MEM_RECLAIM nf_flow_table_offload:flow_offload_worr [ 485.562985] WARNING: CPU: 7 PID: 3731 at kernel/workqueue.c:2610 check_flush0 [ 485.590191] Kernel panic - not syncing: panic_on_warn set ... [ 485.597100] CPU: 7 PID: 3731 Comm: kworker/u112:8 Not tainted 5.7.0-rc1.21802 [ 485.606629] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/177 [ 485.615487] Workqueue: nf_flow_table_offload flow_offload_work_handler [nf_f] [ 485.624834] Call Trace: [ 485.628077] dump_stack+0x50/0x70 [ 485.632280] panic+0xfb/0x2d7 [ 485.636083] ? check_flush_dependency+0x110/0x130 [ 485.641830] __warn.cold.12+0x20/0x2a [ 485.646405] ? check_flush_dependency+0x110/0x130 [ 485.652154] ? check_flush_dependency+0x110/0x130 [ 485.657900] report_bug+0xb8/0x100 [ 485.662187] ? sched_clock_cpu+0xc/0xb0 [ 485.666974] do_error_trap+0x9f/0xc0 [ 485.671464] do_invalid_op+0x36/0x40 [ 485.675950] ? check_flush_dependency+0x110/0x130 [ 485.681699] invalid_op+0x28/0x30 Fixes: 7da182a998d6 ("netfilter: flowtable: Use work entry per offload command") Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-11netfilter: flowtable: Add pending bit for offload workPaul Blakey1-1/+7
Gc step can queue offloaded flow del work or stats work. Those work items can race each other and a flow could be freed before the stats work is executed and querying it. To avoid that, add a pending bit that if a work exists for a flow don't queue another work for it. This will also avoid adding multiple stats works in case stats work didn't complete but gc step started again. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-10netfilter: conntrack: avoid gcc-10 zero-length-bounds warningArnd Bergmann1-2/+2
gcc-10 warns around a suspicious access to an empty struct member: net/netfilter/nf_conntrack_core.c: In function '__nf_conntrack_alloc': net/netfilter/nf_conntrack_core.c:1522:9: warning: array subscript 0 is outside the bounds of an interior zero-length array 'u8[0]' {aka 'unsigned char[0]'} [-Wzero-length-bounds] 1522 | memset(&ct->__nfct_init_offset[0], 0, | ^~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from net/netfilter/nf_conntrack_core.c:37: include/net/netfilter/nf_conntrack.h:90:5: note: while referencing '__nfct_init_offset' 90 | u8 __nfct_init_offset[0]; | ^~~~~~~~~~~~~~~~~~ The code is correct but a bit unusual. Rework it slightly in a way that does not trigger the warning, using an empty struct instead of an empty array. There are probably more elegant ways to do this, but this is the smallest change. Fixes: c41884ce0562 ("netfilter: conntrack: avoid zeroing timer") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-05-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2-8/+8
Conflicts were all overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller3-5/+5
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-05-01 (v2) The following pull-request contains BPF updates for your *net-next* tree. We've added 61 non-merge commits during the last 6 day(s) which contain a total of 153 files changed, 6739 insertions(+), 3367 deletions(-). The main changes are: 1) pulled work.sysctl from vfs tree with sysctl bpf changes. 2) bpf_link observability, from Andrii. 3) BTF-defined map in map, from Andrii. 4) asan fixes for selftests, from Andrii. 5) Allow bpf_map_lookup_elem for SOCKMAP and SOCKHASH, from Jakub. 6) production cloudflare classifier as a selftes, from Lorenz. 7) bpf_ktime_get_*_ns() helper improvements, from Maciej. 8) unprivileged bpftool feature probe, from Quentin. 9) BPF_ENABLE_STATS command, from Song. 10) enable bpf_[gs]etsockopt() helpers for sock_ops progs, from Stanislav. 11) enable a bunch of common helpers for cg-device, sysctl, sockopt progs, from Stanislav. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30docs: networking: convert tproxy.txt to ReSTMauro Carvalho Chehab1-1/+1
- add SPDX header; - adjust title markup; - mark code blocks and literals as such; - adjust identation, whitespaces and blank lines where needed; - add to networking/index.rst. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-29netfilter: nf_osf: avoid passing pointer to local varArnd Bergmann1-5/+7
gcc-10 points out that a code path exists where a pointer to a stack variable may be passed back to the caller: net/netfilter/nfnetlink_osf.c: In function 'nf_osf_hdr_ctx_init': cc1: warning: function may return address of local variable [-Wreturn-local-addr] net/netfilter/nfnetlink_osf.c:171:16: note: declared here 171 | struct tcphdr _tcph; | ^~~~~ I am not sure whether this can happen in practice, but moving the variable declaration into the callers avoids the problem. Fixes: 31a9c29210e2 ("netfilter: nf_osf: add struct nf_osf_hdr_ctx") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-04-28netfilter: add audit table unregister actionsRichard Guy Briggs1-0/+2
Audit the action of unregistering ebtables and x_tables. See: https://github.com/linux-audit/audit-kernel/issues/44 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-04-28audit: tidy and extend netfilter_cfg x_tablesRichard Guy Briggs1-9/+3
NETFILTER_CFG record generation was inconsistent for x_tables and ebtables configuration changes. The call was needlessly messy and there were supporting records missing at times while they were produced when not requested. Simplify the logging call into a new audit_log_nfcfg call. Honour the audit_enabled setting while more consistently recording information including supporting records by tidying up dummy checks. Add an op= field that indicates the operation being performed (register or replace). Here is the enhanced sample record: type=NETFILTER_CFG msg=audit(1580905834.919:82970): table=filter family=2 entries=83 op=replace Generate audit NETFILTER_CFG records on ebtables table registration. Previously this was being done for x_tables registration and replacement operations and ebtables table replacement only. See: https://github.com/linux-audit/audit-kernel/issues/25 See: https://github.com/linux-audit/audit-kernel/issues/35 See: https://github.com/linux-audit/audit-kernel/issues/43 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-04-28Merge branch 'work.sysctl' of ↵Daniel Borkmann3-5/+5
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull in Christoph Hellwig's series that changes the sysctl's ->proc_handler methods to take kernel pointers instead. It gets rid of the set_fs address space overrides used by BPF. As per discussion, pull in the feature branch into bpf-next as it relates to BPF sysctl progs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200427071508.GV23230@ZenIV.linux.org.uk/T/
2020-04-28netfilter: nft_nat: add netmap supportPablo Neira Ayuso1-1/+45
This patch allows you to NAT the network address prefix onto another network address prefix, a.k.a. netmapping. Userspace must specify the NF_NAT_RANGE_NETMAP flag and the prefix address through the NFTA_NAT_REG_ADDR_MIN and NFTA_NAT_REG_ADDR_MAX netlink attributes. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-04-28netfilter: nft_nat: add helper function to set up NAT address and protocolPablo Neira Ayuso1-22/+34
This patch add nft_nat_setup_addr() and nft_nat_setup_proto() to set up the NAT mangling. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-04-28netfilter: nft_nat: set flags from initialization pathPablo Neira Ayuso1-4/+6
This patch sets the NAT flags from the control plane path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-04-28netfilter: nft_nat: return EOPNOTSUPP if type or flags are not supportedPablo Neira Ayuso1-2/+2
Instead of EINVAL which should be used for malformed netlink messages. Fixes: eb31628e37a0 ("netfilter: nf_tables: Add support for IPv6 NAT") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-04-27netfilter: nf_tables: allow up to 64 bytes in the set element data areaPablo Neira Ayuso1-12/+26
So far, the set elements could store up to 128-bits in the data area. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-04-27sysctl: pass kernel pointers to ->proc_handlerChristoph Hellwig3-5/+5
Instead of having all the sysctl handlers deal with user pointers, which is rather hairy in terms of the BPF interaction, copy the input to and from userspace in common code. This also means that the strings are always NUL-terminated by the common code, making the API a little bit safer. As most handler just pass through the data to one of the common handlers a lot of the changes are mechnical. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-04-26netfilter: nat: never update the UDP checksum when it's 0Guillaume Nault1-3/+1
If the UDP header of a local VXLAN endpoint is NAT-ed, and the VXLAN device has disabled UDP checksums and enabled Tx checksum offloading, then the skb passed to udp_manip_pkt() has hdr->check == 0 (outer checksum disabled) and skb->ip_summed == CHECKSUM_PARTIAL (inner packet checksum offloaded). Because of the ->ip_summed value, udp_manip_pkt() tries to update the outer checksum with the new address and port, leading to an invalid checksum sent on the wire, as the original null checksum obviously didn't take the old address and port into account. So, we can't take ->ip_summed into account in udp_manip_pkt(), as it might not refer to the checksum we're acting on. Instead, we can base the decision to update the UDP checksum entirely on the value of hdr->check, because it's null if and only if checksum is disabled: * A fully computed checksum can't be 0, since a 0 checksum is represented by the CSUM_MANGLED_0 value instead. * A partial checksum can't be 0, since the pseudo-header always adds at least one non-zero value (the UDP protocol type 0x11) and adding more values to the sum can't make it wrap to 0 as the carry is then added to the wrapped number. * A disabled checksum uses the special value 0. The problem seems to be there from day one, although it was probably not visible before UDP tunnels were implemented. Fixes: 5b1158e909ec ("[NETFILTER]: Add NAT support for nf_conntrack") Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-04-26netfilter: nf_conntrack: add IPS_HW_OFFLOAD status bitBodong Wang2-1/+6
This bit indicates that the conntrack entry is offloaded to hardware flow table. nf_conntrack entry will be tagged with [HW_OFFLOAD] if it's offload to hardware. cat /proc/net/nf_conntrack ipv4 2 tcp 6 \ src=1.1.1.17 dst=1.1.1.16 sport=56394 dport=5001 \ src=1.1.1.16 dst=1.1.1.17 sport=5001 dport=56394 [HW_OFFLOAD] \ mark=0 zone=0 use=3 Note that HW_OFFLOAD/OFFLOAD/ASSURED are mutually exclusive. Changelog: * V1->V2: - Remove check of lastused from stats. It was meant for cases such as removing driver module while traffic still running. Better to handle such cases from garbage collector. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-04-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2-4/+6
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) flow_block_cb memleak in nf_flow_table_offload_del_cb(), from Roi Dayan. 2) Fix error path handling in nf_nat_inet_register_fn(), from Hillf Danton. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-19netfilter: nat: fix error handling upon registering inet hookHillf Danton1-2/+2
A case of warning was reported by syzbot. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 19934 at net/netfilter/nf_nat_core.c:1106 nf_nat_unregister_fn+0x532/0x5c0 net/netfilter/nf_nat_core.c:1106 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 19934 Comm: syz-executor.5 Not tainted 5.6.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 panic+0x2e3/0x75c kernel/panic.c:221 __warn.cold+0x2f/0x35 kernel/panic.c:582 report_bug+0x27b/0x2f0 lib/bug.c:195 fixup_bug arch/x86/kernel/traps.c:175 [inline] fixup_bug arch/x86/kernel/traps.c:170 [inline] do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267 do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027 RIP: 0010:nf_nat_unregister_fn+0x532/0x5c0 net/netfilter/nf_nat_core.c:1106 Code: ff df 48 c1 ea 03 80 3c 02 00 75 75 48 8b 44 24 10 4c 89 ef 48 c7 00 00 00 00 00 e8 e8 f8 53 fb e9 4d fe ff ff e8 ee 9c 16 fb <0f> 0b e9 41 fe ff ff e8 e2 45 54 fb e9 b5 fd ff ff 48 8b 7c 24 20 RSP: 0018:ffffc90005487208 EFLAGS: 00010246 RAX: 0000000000040000 RBX: 0000000000000004 RCX: ffffc9001444a000 RDX: 0000000000040000 RSI: ffffffff865c94a2 RDI: 0000000000000005 RBP: ffff88808b5cf000 R08: ffff8880a2620140 R09: fffffbfff14bcd79 R10: ffffc90005487208 R11: fffffbfff14bcd78 R12: 0000000000000000 R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000 nf_nat_ipv6_unregister_fn net/netfilter/nf_nat_proto.c:1017 [inline] nf_nat_inet_register_fn net/netfilter/nf_nat_proto.c:1038 [inline] nf_nat_inet_register_fn+0xfc/0x140 net/netfilter/nf_nat_proto.c:1023 nf_tables_register_hook net/netfilter/nf_tables_api.c:224 [inline] nf_tables_addchain.constprop.0+0x82e/0x13c0 net/netfilter/nf_tables_api.c:1981 nf_tables_newchain+0xf68/0x16a0 net/netfilter/nf_tables_api.c:2235 nfnetlink_rcv_batch+0x83a/0x1610 net/netfilter/nfnetlink.c:433 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:543 [inline] nfnetlink_rcv+0x3af/0x420 net/netfilter/nfnetlink.c:561 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362 ___sys_sendmsg+0x100/0x170 net/socket.c:2416 __sys_sendmsg+0xec/0x1b0 net/socket.c:2449 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xb3 and to quiesce it, unregister NFPROTO_IPV6 hook instead of NFPROTO_INET in case of failing to register NFPROTO_IPV4 hook. Reported-by: syzbot <syzbot+33e06702fd6cffc24c40@syzkaller.appspotmail.com> Fixes: d164385ec572 ("netfilter: nat: add inet family nat support") Cc: Florian Westphal <fw@strlen.de> Cc: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-04-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds6-22/+27
Pull networking fixes from David Miller: 1) Disable RISCV BPF JIT builds when !MMU, from Björn Töpel. 2) nf_tables leaves dangling pointer after free, fix from Eric Dumazet. 3) Out of boundary write in __xsk_rcv_memcpy(), fix from Li RongQing. 4) Adjust icmp6 message source address selection when routes have a preferred source address set, from Tim Stallard. 5) Be sure to validate HSR protocol version when creating new links, from Taehee Yoo. 6) CAP_NET_ADMIN should be sufficient to manage l2tp tunnels even in non-initial namespaces, from Michael Weiß. 7) Missing release firmware call in mlx5, from Eran Ben Elisha. 8) Fix variable type in macsec_changelink(), caught by KASAN. Fix from Taehee Yoo. 9) Fix pause frame negotiation in marvell phy driver, from Clemens Gruber. 10) Record RX queue early enough in tun packet paths such that XDP programs will see the correct RX queue index, from Gilberto Bertin. 11) Fix double unlock in mptcp, from Florian Westphal. 12) Fix offset overflow in ARM bpf JIT, from Luke Nelson. 13) marvell10g needs to soft reset PHY when coming out of low power mode, from Russell King. 14) Fix MTU setting regression in stmmac for some chip types, from Florian Fainelli. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits) amd-xgbe: Use __napi_schedule() in BH context mISDN: make dmril and dmrim static net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode tipc: fix incorrect increasing of link window Documentation: Fix tcp_challenge_ack_limit default value net: tulip: make early_486_chipsets static dt-bindings: net: ethernet-phy: add desciption for ethernet-phy-id1234.d400 ipv6: remove redundant assignment to variable err net/rds: Use ERR_PTR for rds_message_alloc_sgs() net: mscc: ocelot: fix untagged packet drops when enslaving to vlan aware bridge selftests/bpf: Check for correct program attach/detach in xdp_attach test libbpf: Fix type of old_fd in bpf_xdp_set_link_opts libbpf: Always specify expected_attach_type on program load if supported xsk: Add missing check on user supplied headroom size mac80211: fix channel switch trigger from unknown mesh peer mac80211: fix race in ieee80211_register_hw() net: marvell10g: soft-reset the PHY when coming out of low power net: marvell10g: report firmware version net/cxgb4: Check the return from t4_query_params properly ...