summaryrefslogtreecommitdiffstats
path: root/net/ipv4/netfilter/ipt_CLUSTERIP.c
AgeCommit message (Collapse)AuthorFilesLines
2022-11-18netfilter: conntrack: Fix data-races around ct markDaniel Xu1-2/+2
nf_conn:mark can be read from and written to in parallel. Use READ_ONCE()/WRITE_ONCE() for reads and writes to prevent unwanted compiler optimizations. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-22proc: remove PDE_DATA() completelyMuchun Song1-3/+3
Remove PDE_DATA() completely and replace it with pde_data(). [akpm@linux-foundation.org: fix naming clash in drivers/nubus/proc.c] [akpm@linux-foundation.org: now fix it properly] Link: https://lkml.kernel.org/r/20211124081956.87711-2-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alexey Gladkov <gladkov.alexey@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-05netfilter: ipt_CLUSTERIP: fix refcount leak in clusterip_tg_check()Xin Xiong1-1/+4
The issue takes place in one error path of clusterip_tg_check(). When memcmp() returns nonzero, the function simply returns the error code, forgetting to decrease the reference count of a clusterip_config object, which is bumped earlier by clusterip_config_find_get(). This may incur reference count leak. Fix this issue by decrementing the refcount of the object in specific error path. Fixes: 06aa151ad1fc74 ("netfilter: ipt_CLUSTERIP: check MAC address when duplicate config is set") Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-01netfilter: ipt_CLUSTERIP: use clusterip_net to store pernet warningFlorian Westphal1-2/+3
No need to use struct net for this. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-01netfilter: ipt_CLUSTERIP: only add arp mangle hook when requiredFlorian Westphal1-17/+34
Do not register the arp mangling hooks from pernet init path. As-is, load of the module is enough for these hooks to become active in each net namespace. Use checkentry instead so hook is only added if a CLUSTERIP rule is used. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-04-26netfilter: allow to turn off xtables compat layerFlorian Westphal1-4/+4
The compat layer needs to parse untrusted input (the ruleset) to translate it to a 64bit compatible format. We had a number of bugs in this department in the past, so allow users to turn this feature off. Add CONFIG_NETFILTER_XTABLES_COMPAT kconfig knob and make it default to y to keep existing behaviour. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-07-06Replace HTTP links with HTTPS ones: IPv*Alexander A. Klimov1-1/+1
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-04proc: convert everything to "struct proc_ops"Alexey Dobriyan1-8/+8
The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in seq_file.h. Conversion rule is: llseek => proc_lseek unlocked_ioctl => proc_ioctl xxx => proc_xxx delete ".owner = THIS_MODULE" line [akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c] [sfr@canb.auug.org.au: fix kernel/sched/psi.c] Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16netfilter: Update obsolete comments referring to ip_conntrackYonatan Goldschmidt1-2/+2
In 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.") the new generic nf_conntrack was introduced, and it came to supersede the old ip_conntrack. This change updates (some) of the obsolete comments referring to old file/function names of the ip_conntrack mechanism, as well as removes a few self-referencing comments that we shouldn't maintain anymore. I did not update any comments referring to historical actions (e.g, comments like "this file was derived from ..." were left untouched, even if the referenced file is no longer here). Signed-off-by: Yonatan Goldschmidt <yon.goldschmidt@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner1-5/+1
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-09net/ipv4/netfilter: Update comment from call_rcu_bh() to call_rcu()Paul E. McKenney1-1/+1
The RCU flavors have been consolidated, so this commit replaces a comment's mention of call_rcu_bh() with call_rcu(). Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Florian Westphal <fw@strlen.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: <netfilter-devel@vger.kernel.org> Cc: <coreteam@netfilter.org> Cc: <netdev@vger.kernel.org> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller1-1/+1
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for you net-next tree: 1) Missing NFTA_RULE_POSITION_ID netlink attribute validation, from Phil Sutter. 2) Restrict matching on tunnel metadata to rx/tx path, from wenxu. 3) Avoid indirect calls for IPV6=y, from Florian Westphal. 4) Add two indirections to prepare merger of IPV4 and IPV6 nat modules, from Florian Westphal. 5) Broken indentation in ctnetlink, from Colin Ian King. 6) Patches to use struct_size() from netfilter and IPVS, from Gustavo A. R. Silva. 7) Display kernel splat only once in case of racing to confirm conntrack from bridge plus nfqueue setups, from Chieh-Min Wang. 8) Skip checksum validation for layer 4 protocols that don't need it, patch from Alin Nastac. 9) Sparse warning due to symbol that should be static in CLUSTERIP, from Wei Yongjun. 10) Add new toggle to disable SDP payload translation when media endpoint is reachable though the same interface as the signalling peer, from Alin Nastac. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-16netfilter: ipt_CLUSTERIP: make symbol 'cip_netdev_notifier' staticWei Yongjun1-1/+1
Fixes the following sparse warnings: net/ipv4/netfilter/ipt_CLUSTERIP.c:867:23: warning: symbol 'cip_netdev_notifier' was not declared. Should it be static? Fixes: 5a86d68bcf02 ("netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routine") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-01-28netfilter: ipt_CLUSTERIP: fix warning unused variable cnAnders Roxell1-1/+1
When CONFIG_PROC_FS isn't set the variable cn isn't used. net/ipv4/netfilter/ipt_CLUSTERIP.c: In function ‘clusterip_net_exit’: net/ipv4/netfilter/ipt_CLUSTERIP.c:849:24: warning: unused variable ‘cn’ [-Wunused-variable] struct clusterip_net *cn = clusterip_pernet(net); ^~ Rework so the variable 'cn' is declared inside "#ifdef CONFIG_PROC_FS". Fixes: b12f7bad5ad3 ("netfilter: ipt_CLUSTERIP: remove wrong WARN_ON_ONCE in netns exit routine") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-18netfilter: ipt_CLUSTERIP: check MAC address when duplicate config is setTaehee Yoo1-1/+2
If same destination IP address config is already existing, that config is just used. MAC address also should be same. However, there is no MAC address checking routine. So that MAC address checking routine is added. test commands: %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \ -j CLUSTERIP --new --hashmode sourceip \ --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1 %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \ -j CLUSTERIP --new --hashmode sourceip \ --clustermac 01:00:5e:00:00:21 --total-nodes 2 --local-node 1 After this patch, above commands are disallowed. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-18netfilter: ipt_CLUSTERIP: fix sleep-in-atomic bug in ↵Taehee Yoo1-5/+14
clusterip_config_entry_put() A proc_remove() can sleep. so that it can't be inside of spin_lock. Hence proc_remove() is moved to outside of spin_lock. and it also adds mutex to sync create and remove of proc entry(config->pde). test commands: SHELL#1 %while :; do iptables -A INPUT -p udp -i enp2s0 -d 192.168.1.100 \ --dport 9000 -j CLUSTERIP --new --hashmode sourceip \ --clustermac 01:00:5e:00:00:21 --total-nodes 3 --local-node 3; \ iptables -F; done SHELL#2 %while :; do echo +1 > /proc/net/ipt_CLUSTERIP/192.168.1.100; \ echo -1 > /proc/net/ipt_CLUSTERIP/192.168.1.100; done [ 2949.569864] BUG: sleeping function called from invalid context at kernel/sched/completion.c:99 [ 2949.579944] in_atomic(): 1, irqs_disabled(): 0, pid: 5472, name: iptables [ 2949.587920] 1 lock held by iptables/5472: [ 2949.592711] #0: 000000008f0ebcf2 (&(&cn->lock)->rlock){+...}, at: refcount_dec_and_lock+0x24/0x50 [ 2949.603307] CPU: 1 PID: 5472 Comm: iptables Tainted: G W 4.19.0-rc5+ #16 [ 2949.604212] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015 [ 2949.604212] Call Trace: [ 2949.604212] dump_stack+0xc9/0x16b [ 2949.604212] ? show_regs_print_info+0x5/0x5 [ 2949.604212] ___might_sleep+0x2eb/0x420 [ 2949.604212] ? set_rq_offline.part.87+0x140/0x140 [ 2949.604212] ? _rcu_barrier_trace+0x400/0x400 [ 2949.604212] wait_for_completion+0x94/0x710 [ 2949.604212] ? wait_for_completion_interruptible+0x780/0x780 [ 2949.604212] ? __kernel_text_address+0xe/0x30 [ 2949.604212] ? __lockdep_init_map+0x10e/0x5c0 [ 2949.604212] ? __lockdep_init_map+0x10e/0x5c0 [ 2949.604212] ? __init_waitqueue_head+0x86/0x130 [ 2949.604212] ? init_wait_entry+0x1a0/0x1a0 [ 2949.604212] proc_entry_rundown+0x208/0x270 [ 2949.604212] ? proc_reg_get_unmapped_area+0x370/0x370 [ 2949.604212] ? __lock_acquire+0x4500/0x4500 [ 2949.604212] ? complete+0x18/0x70 [ 2949.604212] remove_proc_subtree+0x143/0x2a0 [ 2949.708655] ? remove_proc_entry+0x390/0x390 [ 2949.708655] clusterip_tg_destroy+0x27a/0x630 [ipt_CLUSTERIP] [ ... ] Fixes: b3e456fce9f5 ("netfilter: ipt_CLUSTERIP: fix a race condition of proc file creation") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-18netfilter: ipt_CLUSTERIP: remove wrong WARN_ON_ONCE in netns exit routineTaehee Yoo1-1/+0
When network namespace is destroyed, both clusterip_tg_destroy() and clusterip_net_exit() are called. and clusterip_net_exit() is called before clusterip_tg_destroy(). Hence cleanup check code in clusterip_net_exit() doesn't make sense. test commands: %ip netns add vm1 %ip netns exec vm1 bash %ip link set lo up %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \ -j CLUSTERIP --new --hashmode sourceip \ --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1 %exit %ip netns del vm1 splat looks like: [ 341.184508] WARNING: CPU: 1 PID: 87 at net/ipv4/netfilter/ipt_CLUSTERIP.c:840 clusterip_net_exit+0x319/0x380 [ipt_CLUSTERIP] [ 341.184850] Modules linked in: ipt_CLUSTERIP nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp iptable_filter bpfilter ip_tables x_tables [ 341.184850] CPU: 1 PID: 87 Comm: kworker/u4:2 Not tainted 4.19.0-rc5+ #16 [ 341.227509] Workqueue: netns cleanup_net [ 341.227509] RIP: 0010:clusterip_net_exit+0x319/0x380 [ipt_CLUSTERIP] [ 341.227509] Code: 0f 85 7f fe ff ff 48 c7 c2 80 64 2c c0 be a8 02 00 00 48 c7 c7 a0 63 2c c0 c6 05 18 6e 00 00 01 e8 bc 38 ff f5 e9 5b fe ff ff <0f> 0b e9 33 ff ff ff e8 4b 90 50 f6 e9 2d fe ff ff 48 89 df e8 de [ 341.227509] RSP: 0018:ffff88011086f408 EFLAGS: 00010202 [ 341.227509] RAX: dffffc0000000000 RBX: 1ffff1002210de85 RCX: 0000000000000000 [ 341.227509] RDX: 1ffff1002210de85 RSI: ffff880110813be8 RDI: ffffed002210de58 [ 341.227509] RBP: ffff88011086f4d0 R08: 0000000000000000 R09: 0000000000000000 [ 341.227509] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1002210de81 [ 341.227509] R13: ffff880110625a48 R14: ffff880114cec8c8 R15: 0000000000000014 [ 341.227509] FS: 0000000000000000(0000) GS:ffff880116600000(0000) knlGS:0000000000000000 [ 341.227509] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 341.227509] CR2: 00007f11fd38e000 CR3: 000000013ca16000 CR4: 00000000001006e0 [ 341.227509] Call Trace: [ 341.227509] ? __clusterip_config_find+0x460/0x460 [ipt_CLUSTERIP] [ 341.227509] ? default_device_exit+0x1ca/0x270 [ 341.227509] ? remove_proc_entry+0x1cd/0x390 [ 341.227509] ? dev_change_net_namespace+0xd00/0xd00 [ 341.227509] ? __init_waitqueue_head+0x130/0x130 [ 341.227509] ops_exit_list.isra.10+0x94/0x140 [ 341.227509] cleanup_net+0x45b/0x900 [ ... ] Fixes: 613d0776d3fe ("netfilter: exit_net cleanup check added") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-18netfilter: ipt_CLUSTERIP: fix deadlock in netns exit routineTaehee Yoo1-68/+87
When network namespace is destroyed, cleanup_net() is called. cleanup_net() holds pernet_ops_rwsem then calls each ->exit callback. So that clusterip_tg_destroy() is called by cleanup_net(). And clusterip_tg_destroy() calls unregister_netdevice_notifier(). But both cleanup_net() and clusterip_tg_destroy() hold same lock(pernet_ops_rwsem). hence deadlock occurrs. After this patch, only 1 notifier is registered when module is inserted. And all of configs are added to per-net list. test commands: %ip netns add vm1 %ip netns exec vm1 bash %ip link set lo up %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \ -j CLUSTERIP --new --hashmode sourceip \ --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1 %exit %ip netns del vm1 splat looks like: [ 341.809674] ============================================ [ 341.809674] WARNING: possible recursive locking detected [ 341.809674] 4.19.0-rc5+ #16 Tainted: G W [ 341.809674] -------------------------------------------- [ 341.809674] kworker/u4:2/87 is trying to acquire lock: [ 341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: unregister_netdevice_notifier+0x8c/0x460 [ 341.809674] [ 341.809674] but task is already holding lock: [ 341.809674] 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900 [ 341.809674] [ 341.809674] other info that might help us debug this: [ 341.809674] Possible unsafe locking scenario: [ 341.809674] [ 341.809674] CPU0 [ 341.809674] ---- [ 341.809674] lock(pernet_ops_rwsem); [ 341.809674] lock(pernet_ops_rwsem); [ 341.809674] [ 341.809674] *** DEADLOCK *** [ 341.809674] [ 341.809674] May be due to missing lock nesting notation [ 341.809674] [ 341.809674] 3 locks held by kworker/u4:2/87: [ 341.809674] #0: 00000000d9df6c92 ((wq_completion)"%s""netns"){+.+.}, at: process_one_work+0xafe/0x1de0 [ 341.809674] #1: 00000000c2cbcee2 (net_cleanup_work){+.+.}, at: process_one_work+0xb60/0x1de0 [ 341.809674] #2: 000000005da2d519 (pernet_ops_rwsem){++++}, at: cleanup_net+0x119/0x900 [ 341.809674] [ 341.809674] stack backtrace: [ 341.809674] CPU: 1 PID: 87 Comm: kworker/u4:2 Tainted: G W 4.19.0-rc5+ #16 [ 341.809674] Workqueue: netns cleanup_net [ 341.809674] Call Trace: [ ... ] [ 342.070196] down_write+0x93/0x160 [ 342.070196] ? unregister_netdevice_notifier+0x8c/0x460 [ 342.070196] ? down_read+0x1e0/0x1e0 [ 342.070196] ? sched_clock_cpu+0x126/0x170 [ 342.070196] ? find_held_lock+0x39/0x1c0 [ 342.070196] unregister_netdevice_notifier+0x8c/0x460 [ 342.070196] ? register_netdevice_notifier+0x790/0x790 [ 342.070196] ? __local_bh_enable_ip+0xe9/0x1b0 [ 342.070196] ? __local_bh_enable_ip+0xe9/0x1b0 [ 342.070196] ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP] [ 342.070196] ? trace_hardirqs_on+0x93/0x210 [ 342.070196] ? __bpf_trace_preemptirq_template+0x10/0x10 [ 342.070196] ? clusterip_tg_destroy+0x372/0x650 [ipt_CLUSTERIP] [ 342.123094] clusterip_tg_destroy+0x3ad/0x650 [ipt_CLUSTERIP] [ 342.123094] ? clusterip_net_init+0x3d0/0x3d0 [ipt_CLUSTERIP] [ 342.123094] ? cleanup_match+0x17d/0x200 [ip_tables] [ 342.123094] ? xt_unregister_table+0x215/0x300 [x_tables] [ 342.123094] ? kfree+0xe2/0x2a0 [ 342.123094] cleanup_entry+0x1d5/0x2f0 [ip_tables] [ 342.123094] ? cleanup_match+0x200/0x200 [ip_tables] [ 342.123094] __ipt_unregister_table+0x9b/0x1a0 [ip_tables] [ 342.123094] iptable_filter_net_exit+0x43/0x80 [iptable_filter] [ 342.123094] ops_exit_list.isra.10+0x94/0x140 [ 342.123094] cleanup_net+0x45b/0x900 [ ... ] Fixes: 202f59afd441 ("netfilter: ipt_CLUSTERIP: do not hold dev") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-12-01netfilter: Replace call_rcu_bh(), rcu_barrier_bh(), and synchronize_rcu_bh()Paul E. McKenney1-3/+3
Now that call_rcu()'s callback is not invoked until after bh-disable regions of code have completed (in addition to explicitly marked RCU read-side critical sections), call_rcu() can be used in place of call_rcu_bh(). Similarly, rcu_barrier() can be used in place of rcu_barrier_bh() and synchronize_rcu() in place of synchronize_rcu_bh(). This commit therefore makes these changes. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-03-27net: Drop pernet_operations::asyncKirill Tkhai1-1/+0
Synchronous pernet_operations are not allowed anymore. All are asynchronous. So, drop the structure member. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26net: Use octal not symbolic permissionsJoe Perches1-1/+1
Prefer the direct use of octal for permissions. Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace and some typing. Miscellanea: o Whitespace neatening around these conversions. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-5/+10
All of the conflicts were cases of overlapping changes. In net/core/devlink.c, we have to make care that the resouce size_params have become a struct member rather than a pointer to such an object. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27net: Convert clusterip_net_opsKirill Tkhai1-0/+1
These pernet_operations register and unregister nf hooks, and populate and destroy /proc entry. So, they are able to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-25netfilter: ipt_CLUSTERIP: put config instead of freeing itFlorian Westphal1-3/+4
Once struct is added to per-netns list it becomes visible to other cpus, so we cannot use kfree(). Also delay setting entries refcount to 1 until after everything is initialised so that when we call clusterip_config_put() in this spot entries is still zero. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-25netfilter: ipt_CLUSTERIP: put config struct if we can't increment ct refcountFlorian Westphal1-2/+6
This needs to put() the entry to avoid a resource leak in error path. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-14netfilter: ipt_CLUSTERIP: fix a refcount bug in clusterip_config_find_get()Cong Wang1-2/+6
In clusterip_config_find_get() we hold RCU read lock so it could run concurrently with clusterip_config_entry_put(), as a result, the refcnt could go back to 1 from 0, which leads to a double list_del()... Just replace refcount_inc() with refcount_inc_not_zero(), as for c->refcount. Fixes: d73f33b16883 ("netfilter: CLUSTERIP: RCU conversion") Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-08netfilter: ipt_CLUSTERIP: fix a race condition of proc file creationCong Wang1-6/+6
There is a race condition between clusterip_config_entry_put() and clusterip_config_init(), after we release the spinlock in clusterip_config_entry_put(), a new proc file with a same IP could be created immediately since it is already removed from the configs list, therefore it triggers this warning: ------------[ cut here ]------------ proc_dir_entry 'ipt_CLUSTERIP/172.20.0.170' already registered WARNING: CPU: 1 PID: 4152 at fs/proc/generic.c:330 proc_register+0x2a4/0x370 fs/proc/generic.c:329 Kernel panic - not syncing: panic_on_warn set ... As a quick fix, just move the proc_remove() inside the spinlock. Reported-by: <syzbot+03218bcdba6aa76441a3@syzkaller.appspotmail.com> Fixes: 6c5d5cfbe3c5 ("netfilter: ipt_CLUSTERIP: check duplicate config when initializing") Tested-by: Paolo Abeni <pabeni@redhat.com> Cc: Xin Long <lucien.xin@gmail.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller1-3/+13
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Fix OOM that syskaller triggers with ipt_replace.size = -1 and IPT_SO_SET_REPLACE socket option, from Dmitry Vyukov. 2) Check for too long extension name in xt_request_find_{match|target} that result in out-of-bound reads, from Eric Dumazet. 3) Fix memory exhaustion bug in ipset hash:*net* types when adding ranges that look like x.x.x.x-255.255.255.255, from Jozsef Kadlecsik. 4) Fix pointer leaks to userspace in x_tables, from Dmitry Vyukov. 5) Insufficient sanity checks in clusterip_tg_check(), also from Dmitry. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-31netfilter: ipt_CLUSTERIP: fix out-of-bounds accesses in clusterip_tg_check()Dmitry Vyukov1-3/+13
Commit 136e92bbec0a switched local_nodes from an array to a bitmask but did not add proper bounds checks. As the result clusterip_config_init_nodelist() can both over-read ipt_clusterip_tgt_info.local_nodes and over-write clusterip_config.local_nodes. Add bounds checks for both. Fixes: 136e92bbec0a ("[NETFILTER] CLUSTERIP: use a bitmap to store node responsibility data") Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-01-16net: delete /proc THIS_MODULE referencesAlexey Dobriyan1-1/+0
/proc has been ignoring struct file_operations::owner field for 10 years. Specifically, it started with commit 786d7e1612f0b0adb6046f19b906609e4fe8b1ba ("Fix rmmod/read/write races in /proc entries"). Notice the chunk where inode->i_fop is initialized with proxy struct file_operations for regular files: - if (de->proc_fops) - inode->i_fop = de->proc_fops; + if (de->proc_fops) { + if (S_ISREG(inode->i_mode)) + inode->i_fop = &proc_reg_file_ops; + else + inode->i_fop = de->proc_fops; + } VFS stopped pinning module at this point. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-07netfilter: ipt_CLUSTERIP: fix clusterip_net_exit build regressionArnd Bergmann1-1/+1
The added check produces a build error when CONFIG_PROC_FS is disabled: net/ipv4/netfilter/ipt_CLUSTERIP.c: In function 'clusterip_net_exit': net/ipv4/netfilter/ipt_CLUSTERIP.c:822:28: error: 'cn' undeclared (first use in this function) This moves the variable declaration out of the #ifdef to make it available to the WARN_ON_ONCE(). Fixes: 613d0776d3fe ("netfilter: exit_net cleanup check added") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-11-20netfilter: exit_net cleanup check addedVasily Averin1-0/+1
Be sure that lists initialized in net_init hook was return to initial state. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller1-1/+1
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. Basically, updates to the conntrack core, enhancements for nf_tables, conversion of netfilter hooks from linked list to array to improve memory locality and asorted improvements for the Netfilter codebase. More specifically, they are: 1) Add expection to hashes after timer initialization to prevent access from another CPU that walks on the hashes and calls del_timer(), from Florian Westphal. 2) Don't update nf_tables chain counters from hot path, this is only used by the x_tables compatibility layer. 3) Get rid of nested rcu_read_lock() calls from netfilter hook path. Hooks are always guaranteed to run from rcu read side, so remove nested rcu_read_lock() where possible. Patch from Taehee Yoo. 4) nf_tables new ruleset generation notifications include PID and name of the process that has updated the ruleset, from Phil Sutter. 5) Use skb_header_pointer() from nft_fib, so we can reuse this code from the nf_family netdev family. Patch from Pablo M. Bermudo. 6) Add support for nft_fib in nf_tables netdev family, also from Pablo. 7) Use deferrable workqueue for conntrack garbage collection, to reduce power consumption, from Patch from Subash Abhinov Kasiviswanathan. 8) Add nf_ct_expect_iterate_net() helper and use it. From Florian Westphal. 9) Call nf_ct_unconfirmed_destroy only from cttimeout, from Florian. 10) Drop references on conntrack removal path when skbuffs has escaped via nfqueue, from Florian. 11) Don't queue packets to nfqueue with dying conntrack, from Florian. 12) Constify nf_hook_ops structure, from Florian. 13) Remove neededlessly branch in nf_tables trace code, from Phil Sutter. 14) Add nla_strdup(), from Phil Sutter. 15) Rise nf_tables objects name size up to 255 chars, people want to use DNS names, so increase this according to what RFC 1035 specifies. Patch series from Phil Sutter. 16) Kill nf_conntrack_default_on, it's broken. Default on conntrack hook registration on demand, suggested by Eric Dumazet, patch from Florian. 17) Remove unused variables in compat_copy_entry_from_user both in ip_tables and arp_tables code. Patch from Taehee Yoo. 18) Constify struct nf_conntrack_l4proto, from Julia Lawall. 19) Constify nf_loginfo structure, also from Julia. 20) Use a single rb root in connlimit, from Taehee Yoo. 21) Remove unused netfilter_queue_init() prototype, from Taehee Yoo. 22) Use audit_log() instead of open-coding it, from Geliang Tang. 23) Allow to mangle tcp options via nft_exthdr, from Florian. 24) Allow to fetch TCP MSS from nft_rt, from Florian. This includes a fix for a miscalculation of the minimal length. 25) Simplify branch logic in h323 helper, from Nick Desaulniers. 26) Calculate netlink attribute size for conntrack tuple at compile time, from Florian. 27) Remove protocol name field from nf_conntrack_{l3,l4}proto structure. From Florian. 28) Remove holes in nf_conntrack_l4proto structure, so it becomes smaller. From Florian. 29) Get rid of print_tuple() indirection for /proc conntrack listing. Place all the code in net/netfilter/nf_conntrack_standalone.c. Patch from Florian. 30) Do not built in print_conntrack() if CONFIG_NF_CONNTRACK_PROCFS is off. From Florian. 31) Constify most nf_conntrack_{l3,l4}proto helper functions, from Florian. 32) Fix broken indentation in ebtables extensions, from Colin Ian King. 33) Fix several harmless sparse warning, from Florian. 34) Convert netfilter hook infrastructure to use array for better memory locality, joint work done by Florian and Aaron Conole. Moreover, add some instrumentation to debug this. 35) Batch nf_unregister_net_hooks() calls, to call synchronize_net once per batch, from Florian. 36) Get rid of noisy logging in ICMPv6 conntrack helper, from Florian. 37) Get rid of obsolete NFDEBUG() instrumentation, from Varsha Rao. 38) Remove unused code in the generic protocol tracker, from Davide Caratti. I think I will have material for a second Netfilter batch in my queue if time allow to make it fit in this merge window. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-31netfilter: nf_hook_ops structs can be constFlorian Westphal1-1/+1
We no longer place these on a list so they can be const. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-19netfilter: ipt_CLUSTERIP: fix use-after-free of proc entrySabrina Dubroca1-1/+3
When we delete a netns with a CLUSTERIP rule, clusterip_net_exit() is called first, removing /proc/net/ipt_CLUSTERIP. Then clusterip_config_entry_put() is called from clusterip_tg_destroy(), and tries to remove its entry under /proc/net/ipt_CLUSTERIP/. Fix this by checking that the parent directory of the entry to remove hasn't already been deleted. The following triggers a KASAN splat (stealing the reproducer from 202f59afd441, thanks to Jianlin Shi and Xin Long): ip netns add test ip link add veth0_in type veth peer name veth0_out ip link set veth0_in netns test ip netns exec test ip link set lo up ip netns exec test ip link set veth0_in up ip netns exec test iptables -I INPUT -d 1.2.3.4 -i veth0_in -j \ CLUSTERIP --new --clustermac 89:d4:47:eb:9a:fa --total-nodes 3 \ --local-node 1 --hashmode sourceip-sourceport ip netns del test Fixes: ce4ff76c15a8 ("netfilter: ipt_CLUSTERIP: make proc directory per net namespace") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-06-19netfilter: ipt_CLUSTERIP: do not hold devXin Long1-28/+73
It's a terrible thing to hold dev in iptables target. When the dev is being removed, unregister_netdevice has to wait for the dev to become free. dmesg will keep logging the err: kernel:unregister_netdevice: waiting for veth0_in to become free. \ Usage count = 1 until iptables rules with this target are removed manually. The worse thing is when deleting a netns, a virtual nic will be deleted instead of reset to init_net in default_device_ops exit/exit_batch. As it is earlier than to flush the iptables rules in iptable_filter_net_ops exit, unregister_netdevice will block to wait for the nic to become free. As unregister_netdevice is actually waiting for iptables rules flushing while iptables rules have to be flushed after unregister_netdevice. This 'dead lock' will cause unregister_netdevice to block there forever. As the netns is not available to operate at that moment, iptables rules can not even be flushed manually either. The reproducer can be: # ip netns add test # ip link add veth0_in type veth peer name veth0_out # ip link set veth0_in netns test # ip netns exec test ip link set lo up # ip netns exec test ip link set veth0_in up # ip netns exec test iptables -I INPUT -d 1.2.3.4 -i veth0_in -j \ CLUSTERIP --new --clustermac 89:d4:47:eb:9a:fa --total-nodes 3 \ --local-node 1 --hashmode sourceip-sourceport # ip netns del test This issue can be triggered by all virtual nics with ipt_CLUSTERIP. This patch is to fix it by not holding dev in ipt_CLUSTERIP, but saving the dev->ifindex instead of the dev. As Pablo Neira Ayuso's suggestion, it will refresh c->ifindex and dev's mc by registering a netdevice notifier, just as what xt_TEE does. So it removes the old codes updating dev's mc, and also no need to initialize c->ifindex with dev->ifindex. But as one config can be shared by more than one targets, and the netdev notifier is per config, not per target. It couldn't get e->ip.iniface in the notifier handler. So e->ip.iniface has to be saved into config. Note that for backwards compatibility, this patch doesn't remove the codes checking if the dev exists before creating a config. v1->v2: - As Pablo Neira Ayuso's suggestion, register a netdevice notifier to manage c->ifindex and dev's mc. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-05-29netfilter: ipt_CLUSTERIP: switch to nf_register_net_hookFlorian Westphal1-7/+7
one of the last remaining users of the old api, hopefully followup commit can remove it soon. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
Conflicts were simply overlapping changes. In the net/ipv4/route.c case the code had simply moved around a little bit and the same fix was made in both 'net' and 'net-next'. In the net/sched/sch_generic.c case a fix in 'net' happened at the same time that a new argument was added to qdisc_hash_add(). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13netfilter: ipt_CLUSTERIP: Fix wrong conntrack netns refcnt usageGao Feng1-1/+1
Current codes invoke wrongly nf_ct_netns_get in the destroy routine, it should use nf_ct_netns_put, not nf_ct_netns_get. It could cause some modules could not be unloaded. Fixes: ecb2421b5ddf ("netfilter: add and use nf_ct_netns_get/put") Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-17netfilter: refcounter conversionsReshetova, Elena1-9/+10
refcount_t type and corresponding API (see include/linux/refcount.h) should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller1-0/+1
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree, they are: 1) Stash ctinfo 3-bit field into pointer to nf_conntrack object from sk_buff so we only access one single cacheline in the conntrack hotpath. Patchset from Florian Westphal. 2) Don't leak pointer to internal structures when exporting x_tables ruleset back to userspace, from Willem DeBruijn. This includes new helper functions to copy data to userspace such as xt_data_to_user() as well as conversions of our ip_tables, ip6_tables and arp_tables clients to use it. Not surprinsingly, ebtables requires an ad-hoc update. There is also a new field in x_tables extensions to indicate the amount of bytes that we copy to userspace. 3) Add nf_log_all_netns sysctl: This new knob allows you to enable logging via nf_log infrastructure for all existing netnamespaces. Given the effort to provide pernet syslog has been discontinued, let's provide a way to restore logging using netfilter kernel logging facilities in trusted environments. Patch from Michal Kubecek. 4) Validate SCTP checksum from conntrack helper, from Davide Caratti. 5) Merge UDPlite conntrack and NAT helpers into UDP, this was mostly a copy&paste from the original helper, from Florian Westphal. 6) Reset netfilter state when duplicating packets, also from Florian. 7) Remove unnecessary check for broadcast in IPv6 in pkttype match and nft_meta, from Liping Zhang. 8) Add missing code to deal with loopback packets from nft_meta when used by the netdev family, also from Liping. 9) Several cleanups on nf_tables, one to remove unnecessary check from the netlink control plane path to add table, set and stateful objects and code consolidation when unregister chain hooks, from Gao Feng. 10) Fix harmless reference counter underflow in IPVS that, however, results in problems with the introduction of the new refcount_t type, from David Windsor. 11) Enable LIBCRC32C from nf_ct_sctp instead of nf_nat_sctp, from Davide Caratti. 12) Missing documentation on nf_tables uapi header, from Liping Zhang. 13) Use rb_entry() helper in xt_connlimit, from Geliang Tang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18netfilter: ipt_CLUSTERIP: fix build error without procfsArnd Bergmann1-1/+6
We can't access c->pde if CONFIG_PROC_FS is disabled: net/ipv4/netfilter/ipt_CLUSTERIP.c: In function 'clusterip_config_find_get': net/ipv4/netfilter/ipt_CLUSTERIP.c:147:9: error: 'struct clusterip_config' has no member named 'pde' This moves the check inside of another #ifdef. Fixes: 6c5d5cfbe3c5 ("netfilter: ipt_CLUSTERIP: check duplicate config when initializing") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-09xtables: extend matches and targets with .usersizeWillem de Bruijn1-0/+1
In matches and targets that define a kernel-only tail to their xt_match and xt_target data structs, add a field .usersize that specifies up to where data is to be shared with userspace. Performed a search for comment "Used internally by the kernel" to find relevant matches and targets. Manually inspected the structs to derive a valid offsetof. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-23netfilter: ipt_CLUSTERIP: check duplicate config when initializingXin Long1-11/+23
Now when adding an ipt_CLUSTERIP rule, it only checks duplicate config in clusterip_config_find_get(). But after that, there may be still another thread to insert a config with the same ip, then it leaves proc_create_data to do duplicate check. It's more reasonable to check duplicate config by ipt_CLUSTERIP itself, instead of checking it by proc fs duplicate file check. Before, when proc fs allowed duplicate name files in a directory, It could even crash kernel because of use-after-free. This patch is to check duplicate config under the protection of clusterip net lock when initializing a new config and correct the return err. Note that it also moves proc file node creation after adding new config, as proc_create_data may sleep, it couldn't be called under the clusterip_net lock. clusterip_config_find_get returns NULL if c->pde is null to make sure it can't be used until the proc file node creation is done. Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-04netfilter: add and use nf_ct_netns_get/putFlorian Westphal1-2/+2
currently aliased to try_module_get/_put. Will be changed in next patch when we add functions to make use of ->net argument to store usercount per l3proto tracker. This is needed to avoid registering the conntrack hooks in all netns and later only enable connection tracking in those that need conntrack. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-18netns: make struct pernet_operations::id unsigned intAlexey Dobriyan1-1/+1
Make struct pernet_operations::id unsigned. There are 2 reasons to do so: 1) This field is really an index into an zero based array and thus is unsigned entity. Using negative value is out-of-bound access by definition. 2) On x86_64 unsigned 32-bit data which are mixed with pointers via array indexing or offsets added or subtracted to pointers are preffered to signed 32-bit data. "int" being used as an array index needs to be sign-extended to 64-bit before being used. void f(long *p, int i) { g(p[i]); } roughly translates to movsx rsi, esi mov rdi, [rsi+...] call g MOVSX is 3 byte instruction which isn't necessary if the variable is unsigned because x86_64 is zero extending by default. Now, there is net_generic() function which, you guessed it right, uses "int" as an array index: static inline void *net_generic(const struct net *net, int id) { ... ptr = ng->ptr[id - 1]; ... } And this function is used a lot, so those sign extensions add up. Patch snipes ~1730 bytes on allyesconfig kernel (without all junk messing with code generation): add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) Unfortunately some functions actually grow bigger. This is a semmingly random artefact of code generation with register allocator being used differently. gcc decides that some variable needs to live in new r8+ registers and every access now requires REX prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be used which is longer than [r8] However, overall balance is in negative direction: add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) function old new delta nfsd4_lock 3886 3959 +73 tipc_link_build_proto_msg 1096 1140 +44 mac80211_hwsim_new_radio 2776 2808 +32 tipc_mon_rcv 1032 1058 +26 svcauth_gss_legacy_init 1413 1429 +16 tipc_bcbase_select_primary 379 392 +13 nfsd4_exchange_id 1247 1260 +13 nfsd4_setclientid_confirm 782 793 +11 ... put_client_renew_locked 494 480 -14 ip_set_sockfn_get 730 716 -14 geneve_sock_add 829 813 -16 nfsd4_sequence_done 721 703 -18 nlmclnt_lookup_host 708 686 -22 nfsd4_lockt 1085 1063 -22 nfs_get_client 1077 1050 -27 tcf_bpf_init 1106 1076 -30 nfsd4_encode_fattr 5997 5930 -67 Total: Before=154856051, After=154854321, chg -0.00% Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16netfilter: ipv4: whitespace around operatorsIan Morris1-4/+4
This patch cleanses whitespace around arithmetical operators. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18netfilter: Pass priv instead of nf_hook_ops to netfilter hooksEric W. Biederman1-1/+1
Only pass the void *priv parameter out of the nf_hook_ops. That is all any of the functions are interested now, and by limiting what is passed it becomes simpler to change implementation details. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-17netfilter: Use nf_hook_state.netEric W. Biederman1-1/+1
Instead of saying "net = dev_net(state->in?state->in:state->out)" just say "state->net". As that information is now availabe, much less confusing and much less error prone. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-15netfilter: x_tables: add context to know if extension runs from nft_compatPablo Neira Ayuso1-0/+5
Currently, we have four xtables extensions that cannot be used from the xt over nft compat layer. The problem is that they need real access to the full blown xt_entry to validate that the rule comes with the right dependencies. This check was introduced to overcome the lack of sufficient userspace dependency validation in iptables. To resolve this problem, this patch introduces a new field to the xt_tgchk_param structure that tell us if the extension is run from nft_compat context. The three affected extensions are: 1) CLUSTERIP, this target has been superseded by xt_cluster. So just bail out by returning -EINVAL. 2) TCPMSS. Relax the checking when used from nft_compat. If used with the wrong configuration, it will corrupt !syn packets by adding TCP MSS option. 3) ebt_stp. Relax the check to make sure it uses the reserved destination MAC address for STP. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>