Age | Commit message (Collapse) | Author | Files | Lines |
|
User space could ask for very large hash tables, we need to make sure
our size computations wont overflow.
nf_tables_newset() needs to double check the u64 size
will fit into size_t field.
Fixes: 0ed6389c483d ("netfilter: nf_tables: rename set implementations")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Number of buckets being stored in 32bit variables, we have to
ensure that no overflows occur in nft_hash_buckets()
syzbot injected a size == 0x40000000 and reported:
UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 1 PID: 29539 Comm: syz-executor.4 Not tainted 5.12.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x141/0x1d7 lib/dump_stack.c:120
ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
__ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327
__roundup_pow_of_two include/linux/log2.h:57 [inline]
nft_hash_buckets net/netfilter/nft_set_hash.c:411 [inline]
nft_hash_estimate.cold+0x19/0x1e net/netfilter/nft_set_hash.c:652
nft_select_set_ops net/netfilter/nf_tables_api.c:3586 [inline]
nf_tables_newset+0xe62/0x3110 net/netfilter/nf_tables_api.c:4322
nfnetlink_rcv_batch+0xa09/0x24b0 net/netfilter/nfnetlink.c:488
nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:612 [inline]
nfnetlink_rcv+0x3af/0x420 net/netfilter/nfnetlink.c:630
netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
___sys_sendmsg+0xf3/0x170 net/socket.c:2404
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
Fixes: 0ed6389c483d ("netfilter: nf_tables: rename set implementations")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Release object name if userdata allocation fails.
Fixes: b131c96496b3 ("netfilter: nf_tables: add userdata support for nft_object")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Several conntrack helpers and the TCP tracker assume that
skb_header_pointer() never fails based on upfront header validation.
Even if this should not ever happen, BUG_ON() is a too drastic measure,
remove them.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Do not assume that the tcph->doff field is correct when parsing for TCP
options, skb_header_pointer() might fail to fetch these bits.
Fixes: 11eeef41d5f6 ("netfilter: passive OS fingerprint xtables match")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Reported by syzbot :
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:201
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 26899, name: syz-executor.5
1 lock held by syz-executor.5/26899:
#0: ffffffff8bf797a0 (rcu_read_lock){....}-{1:2}, at: nfnetlink_get_subsys net/netfilter/nfnetlink.c:148 [inline]
#0: ffffffff8bf797a0 (rcu_read_lock){....}-{1:2}, at: nfnetlink_rcv_msg+0x1da/0x1300 net/netfilter/nfnetlink.c:226
Preemption disabled at:
[<ffffffff8917799e>] preempt_schedule_irq+0x3e/0x90 kernel/sched/core.c:5533
CPU: 1 PID: 26899 Comm: syz-executor.5 Not tainted 5.12.0-next-20210504-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x141/0x1d7 lib/dump_stack.c:120
___might_sleep.cold+0x1f1/0x237 kernel/sched/core.c:8338
might_alloc include/linux/sched/mm.h:201 [inline]
slab_pre_alloc_hook mm/slab.h:500 [inline]
slab_alloc_node mm/slub.c:2845 [inline]
kmem_cache_alloc_node+0x33d/0x3e0 mm/slub.c:2960
__alloc_skb+0x20b/0x340 net/core/skbuff.c:413
alloc_skb include/linux/skbuff.h:1107 [inline]
nlmsg_new include/net/netlink.h:953 [inline]
netlink_ack+0x1ed/0xaa0 net/netlink/af_netlink.c:2437
netlink_rcv_skb+0x33d/0x420 net/netlink/af_netlink.c:2508
nfnetlink_rcv+0x1ac/0x420 net/netfilter/nfnetlink.c:650
netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
___sys_sendmsg+0xf3/0x170 net/socket.c:2404
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665f9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fa8a03ee188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 00000000004665f9
RDX: 0000000000000000 RSI: 0000000020000480 RDI: 0000000000000004
RBP: 00000000004bfce1 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60
R13: 00007fffe864480f R14: 00007fa8a03ee300 R15: 0000000000022000
================================================
WARNING: lock held when returning to user space!
5.12.0-next-20210504-syzkaller #0 Tainted: G W
------------------------------------------------
syz-executor.5/26899 is leaving the kernel with locks still held!
1 lock held by syz-executor.5/26899:
#0: ffffffff8bf797a0 (rcu_read_lock){....}-{1:2}, at: nfnetlink_get_subsys net/netfilter/nfnetlink.c:148 [inline]
#0: ffffffff8bf797a0 (rcu_read_lock){....}-{1:2}, at: nfnetlink_rcv_msg+0x1da/0x1300 net/netfilter/nfnetlink.c:226
------------[ cut here ]------------
WARNING: CPU: 0 PID: 26899 at kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0xfd/0x16e0 kernel/rcu/tree_plugin.h:359
Modules linked in:
CPU: 0 PID: 26899 Comm: syz-executor.5 Tainted: G W 5.12.0-next-20210504-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:rcu_note_context_switch+0xfd/0x16e0 kernel/rcu/tree_plugin.h:359
Code: 48 89 fa 48 c1 ea 03 0f b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 2e 0d 00 00 8b bd cc 03 00 00 85 ff 7e 02 <0f> 0b 65 48 8b 2c 25 00 f0 01 00 48 8d bd cc 03 00 00 48 b8 00 00
RSP: 0000:ffffc90002fffdb0 EFLAGS: 00010002
RAX: 0000000000000007 RBX: ffff8880b9c36080 RCX: ffffffff8dc99bac
RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000001
RBP: ffff88808b9d1c80 R08: 0000000000000000 R09: ffffffff8dc96917
R10: fffffbfff1b92d22 R11: 0000000000000000 R12: 0000000000000000
R13: ffff88808b9d1c80 R14: ffff88808b9d1c80 R15: ffffc90002ff8000
FS: 00007fa8a03ee700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f09896ed000 CR3: 0000000032070000 CR4: 00000000001526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__schedule+0x214/0x23e0 kernel/sched/core.c:5044
schedule+0xcf/0x270 kernel/sched/core.c:5226
exit_to_user_mode_loop kernel/entry/common.c:162 [inline]
exit_to_user_mode_prepare+0x13e/0x280 kernel/entry/common.c:208
irqentry_exit_to_user_mode+0x5/0x40 kernel/entry/common.c:314
asm_sysvec_reschedule_ipi+0x12/0x20 arch/x86/include/asm/idtentry.h:637
RIP: 0033:0x4665f9
Fixes: 50f2db9e368f ("netfilter: nfnetlink: consolidate callback types")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This extension breaks when trying to delete rules, add a new revision to
fix this.
Fixes: 5e6874cdb8de ("[SECMARK]: Add xtables SECMARK target")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
In some configurations, the sock_cgroup_ptr() function is not available:
net/netfilter/nft_socket.c: In function 'nft_sock_get_eval_cgroupv2':
net/netfilter/nft_socket.c:47:16: error: implicit declaration of function 'sock_cgroup_ptr'; did you mean 'obj_cgroup_put'? [-Werror=implicit-function-declaration]
47 | cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
| ^~~~~~~~~~~~~~~
| obj_cgroup_put
net/netfilter/nft_socket.c:47:14: error: assignment to 'struct cgroup *' from 'int' makes pointer from integer without a cast [-Werror=int-conversion]
47 | cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
| ^
Change the caller to match the same #ifdef check, only calling it
when the function is defined.
Fixes: e0bb96db96f8 ("netfilter: nft_socket: add support for cgroupsv2")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The variable is only used in an #ifdef, causing a harmless warning:
net/netfilter/nft_socket.c: In function 'nft_socket_init':
net/netfilter/nft_socket.c:137:27: error: unused variable 'level' [-Werror=unused-variable]
137 | unsigned int len, level;
| ^~~~~
Move it into the same #ifdef block.
Fixes: e0bb96db96f8 ("netfilter: nft_socket: add support for cgroupsv2")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch extends the set infrastructure to add a special catch-all set
element. If the lookup fails to find an element (or range) in the set,
then the catch-all element is selected. Users can specify a mapping,
expression(s) and timeout to be attached to the catch-all element.
This patch adds a catchall list to the set, this list might contain more
than one single catch-all element (e.g. in case that the catch-all
element is removed and a new one is added in the same transaction).
However, most of the time, there will be either one element or no
elements at all in this list.
The catch-all element is identified via NFT_SET_ELEM_CATCHALL flag and
such special element has no NFTA_SET_ELEM_KEY attribute. There is a new
nft_set_elem_catchall object that stores a reference to the dummy
catch-all element (catchall->elem) whose layout is the same of the set
element type to reuse the existing set element codebase.
The set size does not apply to the catch-all element, users can define a
catch-all element even if the set is full.
The check for valid set element flags hava been updates to report
EOPNOTSUPP in case userspace requests flags that are not supported when
using new userspace nftables and old kernel.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When binding sets to rule, validate set element data according to
set definition. This patch adds a helper function to be reused by
the catch-all set element support.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch adds nft_set_flush() which prepares for the catch-all
element support.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch adds nft_check_loops() to reuse it in the new catch-all
element codebase.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Rename:
- nft_set_elem_activate() to nft_set_elem_data_activate().
- nft_set_elem_deactivate() to nft_set_elem_data_deactivate().
To prepare for updates in the set element infrastructure to add support
for the special catch-all element.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The compat layer needs to parse untrusted input (the ruleset)
to translate it to a 64bit compatible format.
We had a number of bugs in this department in the past, so allow users
to turn this feature off.
Add CONFIG_NETFILTER_XTABLES_COMPAT kconfig knob and make it default to y
to keep existing behaviour.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add enum nfnl_callback_type to identify the callback type to provide one
single callback.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Update batch callbacks to use the nfnl_info structure. Rename one
clashing info variable to expr_info.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Update rcu callbacks to use the nfnl_info structure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add a new structure to reduce callback footprint and to facilite
extensions of the nfnetlink callback interface in the future.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Consolidate call to net_generic(net, nf_tables_net_id) in this
wrapper function.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Without this, a stale pointer remains in pernet loggers after module
unload causing a kernel oops during dereference. Easily reproduced by:
| # modprobe nf_log_syslog
| # rmmod nf_log_syslog
| # cat /proc/net/netfilter/nf_log
Fixes: 77ccee96a6742 ("netfilter: nf_log_bridge: merge with nf_log_syslog")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
iptable_x modules rely on 'struct net' to contain a pointer to the
table that should be evaluated.
In order to remove these pointers from struct net, pass them via
the 'priv' pointer in a similar fashion as nf_tables passes the
rule data.
To do that, duplicate the nf_hook_info array passed in from the
iptable_x modules, update the ops->priv pointers of the copy to
refer to the table and then change the hookfn implementations to
just pass the 'priv' argument to the traverser.
After this patch, the xt_table pointers can already be removed
from struct net.
However, changes to struct net result in re-compile of the entire
network stack, so do the removal after arptables and ip6tables
have been converted as well.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This will be used to obtain the xt_table struct given address family and
table name.
Followup patches will reduce the number of direct accesses to the xt_table
structures via net->ipv{4,6}.ip(6)table_{nat,mangle,...} pointers, then
remove them.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When I changed defrag hooks to no longer get registered by default I
intentionally made it so that registration can only be un-done by unloading
the nf_defrag_ipv4/6 module.
In hindsight this was too conservative; there is no reason to keep defrag
on while there is no feature dependency anymore.
Moreover, this won't work if user isn't allowed to remove nf_defrag module.
This adds the disable() functions for both ipv4 and ipv6 and calls them
from conntrack, TPROXY and the xtables socket module.
ipvs isn't converted here, it will behave as before this patch and
will need module removal.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Allow to match on the cgroupsv2 id from ancestor level.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
remove the export and make it static.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Add vlan match and pop actions to the flowtable offload,
patches from wenxu.
2) Reduce size of the netns_ct structure, which itself is
embedded in struct net Make netns_ct a read-mostly structure.
Patches from Florian Westphal.
3) Add FLOW_OFFLOAD_XMIT_UNSPEC to skip dst check from garbage
collector path, as required by the tc CT action. From Roi Dayan.
4) VLAN offload fixes for nftables: Allow for matching on both s-vlan
and c-vlan selectors. Fix match of VLAN id due to incorrect
byteorder. Add a new routine to properly populate flow dissector
ethertypes.
5) Missing keys in ip{6}_route_me_harder() results in incorrect
routes. This includes an update for selftest infra. Patches
from Ido Schimmel.
6) Add counter hardware offload support through FLOW_CLS_STATS.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds the .offload_stats operation to synchronize hardware
stats with the expression data. Update the counter expression to use
this new interface. The hardware stats are retrieved from the netlink
dump path via FLOW_CLS_STATS command to the driver.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The nftables offload parser sets FLOW_DISSECTOR_KEY_BASIC .n_proto to the
ethertype field in the ethertype frame. However:
- FLOW_DISSECTOR_KEY_BASIC .n_proto field always stores either IPv4 or IPv6
ethertypes.
- FLOW_DISSECTOR_KEY_VLAN .vlan_tpid stores either the 802.1q and 802.1ad
ethertypes. Same as for FLOW_DISSECTOR_KEY_CVLAN.
This function adjusts the flow dissector to handle two scenarios:
1) FLOW_DISSECTOR_KEY_VLAN .vlan_tpid is set to 802.1q or 802.1ad.
Then, transfer:
- the .n_proto field to FLOW_DISSECTOR_KEY_VLAN .tpid.
- the original FLOW_DISSECTOR_KEY_VLAN .tpid to the
FLOW_DISSECTOR_KEY_CVLAN .tpid
- the original FLOW_DISSECTOR_KEY_CVLAN .tpid to the .n_proto field.
2) .n_proto is set to 802.1q or 802.1ad. Then, transfer:
- the .n_proto field to FLOW_DISSECTOR_KEY_VLAN .tpid.
- the original FLOW_DISSECTOR_KEY_VLAN .tpid to the .n_proto field.
Fixes: a82055af5959 ("netfilter: nft_payload: add VLAN offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The flow dissector representation expects the VLAN id in host byteorder.
Add the NFT_OFFLOAD_F_NETWORK2HOST flag to swap the bytes from nft_cmp.
Fixes: a82055af5959 ("netfilter: nft_payload: add VLAN offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
- add another struct flow_dissector_key_vlan for C-VLAN
- update layer 3 dependency to allow to match on IPv4/IPv6
Fixes: 89d8fd44abfb ("netfilter: nft_payload: add C-VLAN offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
- keep the ZC code, drop the code related to reinit
net/bridge/netfilter/ebtables.c
- fix build after move to net_generic
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It could be xmit type was not set and would default to FLOW_OFFLOAD_XMIT_NEIGH
and in this type the gc expect to have a route info.
Fix that by adding FLOW_OFFLOAD_XMIT_UNSPEC which defaults to 0.
Fixes: 8b9229d15877 ("netfilter: flowtable: dst_check() from garbage collector path")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
log_invalid sysctl allows values of 0 to 255 inclusive so we no longer
need a range check: the min/max values can be removed.
This also removes all member variables that were moved to net_generic
data in previous patches.
This reduces size of netns_ct struct by one cache line.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Its only needed from slowpath (sysctl, ctnetlink, gc worker) and
when a new conntrack object is allocated.
Furthermore, each write dirties the otherwise read-mostly pernet
data in struct net.ct, which are accessed from packet path.
Move it to the net_generic data. This makes struct netns_ct
read-mostly.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Creation of a new conntrack entry isn't a frequent operation (compared
to 'ct entry already exists'). Creation of a new entry that is also an
expected (related) connection even less so.
Place this counter in net_generic data.
A followup patch will also move the conntrack count -- this will make
netns_ct a read-mostly structure.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
While at it, make it an u8, no need to use an integer for a boolean.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Not accessed in fast path, place this is generic_net data instead.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch adds vlan pop action offload in the flowtable offload.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch adds support for vlan_id, vlan_priority and vlan_proto match
for flowtable offload.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
memcpy() breaks when using connlimit in set elements. Use
nft_expr_clone() to initialize the connlimit expression list, otherwise
connlimit garbage collector crashes when walking on the list head copy.
[ 493.064656] Workqueue: events_power_efficient nft_rhash_gc [nf_tables]
[ 493.064685] RIP: 0010:find_or_evict+0x5a/0x90 [nf_conncount]
[ 493.064694] Code: 2b 43 40 83 f8 01 77 0d 48 c7 c0 f5 ff ff ff 44 39 63 3c 75 df 83 6d 18 01 48 8b 43 08 48 89 de 48 8b 13 48 8b 3d ee 2f 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 03 48 83
[ 493.064699] RSP: 0018:ffffc90000417dc0 EFLAGS: 00010297
[ 493.064704] RAX: 0000000000000000 RBX: ffff888134f38410 RCX: 0000000000000000
[ 493.064708] RDX: 0000000000000000 RSI: ffff888134f38410 RDI: ffff888100060cc0
[ 493.064711] RBP: ffff88812ce594a8 R08: ffff888134f38438 R09: 00000000ebb9025c
[ 493.064714] R10: ffffffff8219f838 R11: 0000000000000017 R12: 0000000000000001
[ 493.064718] R13: ffffffff82146740 R14: ffff888134f38410 R15: 0000000000000000
[ 493.064721] FS: 0000000000000000(0000) GS:ffff88840e440000(0000) knlGS:0000000000000000
[ 493.064725] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 493.064729] CR2: 0000000000000008 CR3: 00000001330aa002 CR4: 00000000001706e0
[ 493.064733] Call Trace:
[ 493.064737] nf_conncount_gc_list+0x8f/0x150 [nf_conncount]
[ 493.064746] nft_rhash_gc+0x106/0x390 [nf_tables]
Reported-by: Laura Garcia Liebana <nevola@gmail.com>
Fixes: 409444522976 ("netfilter: nf_tables: add elements with stateful expressions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
xt_compat_match/target_from_user doesn't check that zeroing the area
to start of next rule won't write past end of allocated ruleset blob.
Remove this code and zero the entire blob beforehand.
Reported-by: syzbot+cfc0247ac173f597aaaa@syzkaller.appspotmail.com
Reported-by: Andy Nguyen <theflow@google.com>
Fixes: 9fa492cdc160c ("[NETFILTER]: x_tables: simplify compat API")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
These sysctls point to global variables:
- NF_SYSCTL_CT_MAX (&nf_conntrack_max)
- NF_SYSCTL_CT_EXPECT_MAX (&nf_ct_expect_max)
- NF_SYSCTL_CT_BUCKETS (&nf_conntrack_htable_size_user)
Because their data pointers are not updated to point to per-netns
structures, they must be marked read-only in a non-init_net ns.
Otherwise, changes in any net namespace are reflected in (leaked into)
all other net namespaces. This problem has existed since the
introduction of net namespaces.
The current logic marks them read-only only if the net namespace is
owned by an unprivileged user (other than init_user_ns).
Commit d0febd81ae77 ("netfilter: conntrack: re-visit sysctls in
unprivileged namespaces") "exposes all sysctls even if the namespace is
unpriviliged." Since we need to mark them readonly in any case, we can
forego the unprivileged user check altogether.
Fixes: d0febd81ae77 ("netfilter: conntrack: re-visit sysctls in unprivileged namespaces")
Signed-off-by: Jonathon Reinhart <Jonathon.Reinhart@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
div_u64() divides u64 by u32.
nft_limit_init() wants to divide u64 by u64, use the appropriate
math function (div64_u64)
divide error: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 8390 Comm: syz-executor188 Not tainted 5.12.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:div_u64_rem include/linux/math64.h:28 [inline]
RIP: 0010:div_u64 include/linux/math64.h:127 [inline]
RIP: 0010:nft_limit_init+0x2a2/0x5e0 net/netfilter/nft_limit.c:85
Code: ef 4c 01 eb 41 0f 92 c7 48 89 de e8 38 a5 22 fa 4d 85 ff 0f 85 97 02 00 00 e8 ea 9e 22 fa 4c 0f af f3 45 89 ed 31 d2 4c 89 f0 <49> f7 f5 49 89 c6 e8 d3 9e 22 fa 48 8d 7d 48 48 b8 00 00 00 00 00
RSP: 0018:ffffc90009447198 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000200000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff875152e6 RDI: 0000000000000003
RBP: ffff888020f80908 R08: 0000200000000000 R09: 0000000000000000
R10: ffffffff875152d8 R11: 0000000000000000 R12: ffffc90009447270
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 000000000097a300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000200001c4 CR3: 0000000026a52000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
nf_tables_newexpr net/netfilter/nf_tables_api.c:2675 [inline]
nft_expr_init+0x145/0x2d0 net/netfilter/nf_tables_api.c:2713
nft_set_elem_expr_alloc+0x27/0x280 net/netfilter/nf_tables_api.c:5160
nf_tables_newset+0x1997/0x3150 net/netfilter/nf_tables_api.c:4321
nfnetlink_rcv_batch+0x85a/0x21b0 net/netfilter/nfnetlink.c:456
nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:580 [inline]
nfnetlink_rcv+0x3af/0x420 net/netfilter/nfnetlink.c:598
netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
___sys_sendmsg+0xf3/0x170 net/socket.c:2404
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes: c26844eda9d4 ("netfilter: nf_tables: Fix nft limit burst handling")
Fixes: 3e0f64b7dd31 ("netfilter: nft_limit: fix packet ratelimiting")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Diagnosed-by: Luigi Rizzo <lrizzo@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
dwork struct is large (>128 byte) and not needed when conntrack module
is not loaded.
Place it in net_generic data instead. The struct net dwork member is now
obsolete and will be removed in a followup patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
No need to keep this in struct net, place it in the net_generic data.
The sysctl pointer is removed from struct net in a followup patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Will reduce struct net size by 208 bytes.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This moves all nf_tables pernet data from struct net to a net_generic
extension, with the exception of the gencursor.
The latter is used in the data path and also outside of the nf_tables
core. All others are only used from the configuration plane.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
reduce size of struct net and make this self-contained.
The member in struct net is kept to minimize changes to struct net
layout, it will be removed in a separate patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
No need to place it in struct net, nfnetlink is a module and usage
doesn't occur in fastpath.
Also remove rcu usage:
Not a single reader of net->nfnl uses rcu accessors.
When exit_batch callbacks are executed the net namespace is already dead
so no calls to these functions are possible anymore (else we'd get NULL
deref crash too).
If the module is removed, then modules that call any of those functions
have been removed too so no calls to nfnl functions are possible either.
The nfnl and nfl_stash pointers in struct net are no longer used, they
will be removed in a followup patch to minimize changes to struct net
(causes rebuild for entire network stack).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|