summaryrefslogtreecommitdiffstats
path: root/net/xfrm/xfrm_policy.c
AgeCommit message (Collapse)AuthorFilesLines
2020-03-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-0/+2
Minor comment conflict in mac80211. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-27Merge branch 'master' of ↵David S. Miller1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2020-03-27 1) Handle NETDEV_UNREGISTER for xfrm device to handle asynchronous unregister events cleanly. From Raed Salem. 2) Fix vti6 tunnel inter address family TX through bpf_redirect(). From Nicolas Dichtel. 3) Fix lenght check in verify_sec_ctx_len() to avoid a slab-out-of-bounds. From Xin Long. 4) Add a missing verify_sec_ctx_len check in xfrm_add_acquire to avoid a possible out-of-bounds to access. From Xin Long. 5) Use built-in RCU list checking of hlist_for_each_entry_rcu to silence false lockdep warning in __xfrm6_tunnel_spi_lookup when CONFIG_PROVE_RCU_LIST is enabled. From Madhuparna Bhowmik. 6) Fix a panic on esp offload when crypto is done asynchronously. From Xin Long. 7) Fix a skb memory leak in an error path of vti6_rcv. From Torsten Hilbrich. 8) Fix a race that can lead to a doulbe free in xfrm_policy_timer. From Xin Long. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24xfrm: policy: Fix doulbe free in xfrm_policy_timerYueHaibing1-0/+2
After xfrm_add_policy add a policy, its ref is 2, then xfrm_policy_timer read_lock xp->walk.dead is 0 .... mod_timer() xfrm_policy_kill policy->walk.dead = 1 .... del_timer(&policy->timer) xfrm_pol_put //ref is 1 xfrm_pol_put //ref is 0 xfrm_policy_destroy call_rcu xfrm_pol_hold //ref is 1 read_unlock xfrm_pol_put //ref is 0 xfrm_policy_destroy call_rcu xfrm_policy_destroy is called twice, which may leads to double free. Call Trace: RIP: 0010:refcount_warn_saturate+0x161/0x210 ... xfrm_policy_timer+0x522/0x600 call_timer_fn+0x1b3/0x5e0 ? __xfrm_decode_session+0x2990/0x2990 ? msleep+0xb0/0xb0 ? _raw_spin_unlock_irq+0x24/0x40 ? __xfrm_decode_session+0x2990/0x2990 ? __xfrm_decode_session+0x2990/0x2990 run_timer_softirq+0x5c5/0x10e0 Fix this by use write_lock_bh in xfrm_policy_kill. Fixes: ea2dea9dacc2 ("xfrm: remove policy lock when accessing policy->walk.dead") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Timo Teräs <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-03-23Remove DST_HOSTDavid Laight1-2/+1
Previous changes to the IP routing code have removed all the tests for the DS_HOST route flag. Remove the flags and all the code that sets it. Signed-off-by: David Laight <david.laight@aculab.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-04treewide: remove redundant IS_ERR() before error code checkMasahiro Yamada1-1/+1
'PTR_ERR(p) == -E*' is a stronger condition than IS_ERR(p). Hence, IS_ERR(p) is unneeded. The semantic patch that generates this commit is as follows: // <smpl> @@ expression ptr; constant error_code; @@ -IS_ERR(ptr) && (PTR_ERR(ptr) == - error_code) +PTR_ERR(ptr) == - error_code // </smpl> Link: http://lkml.kernel.org/r/20200106045833.1725-1-masahiroy@kernel.org Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Julia Lawall <julia.lawall@lip6.fr> Acked-by: Stephen Boyd <sboyd@kernel.org> [drivers/clk/clk.c] Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> [GPIO] Acked-by: Wolfram Sang <wsa@the-dreams.de> [drivers/i2c] Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [acpi/scan.c] Acked-by: Rob Herring <robh@kernel.org> Cc: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-09xfrm: add espintcp (RFC 8229)Sabrina Dubroca1-0/+7
TCP encapsulation of IKE and IPsec messages (RFC 8229) is implemented as a TCP ULP, overriding in particular the sendmsg and recvmsg operations. A Stream Parser is used to extract messages out of the TCP stream using the first 2 bytes as length marker. Received IKE messages are put on "ike_queue", waiting to be dequeued by the custom recvmsg implementation. Received ESP messages are sent to XFRM, like with UDP encapsulation. Some of this code is taken from the original submission by Herbert Xu. Currently, only IPv4 is supported, like for UDP encapsulation. Co-developed-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-10-01netfilter: drop bridge nf reset from nf_resetFlorian Westphal1-1/+1
commit 174e23810cd31 ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi recycle always drop skb extensions. The additional skb_ext_del() that is performed via nf_reset on napi skb recycle is not needed anymore. Most nf_reset() calls in the stack are there so queued skb won't block 'rmmod nf_conntrack' indefinitely. This removes the skb_ext_del from nf_reset, and renames it to a more fitting nf_reset_ct(). In a few selected places, add a call to skb_ext_reset to make sure that no active extensions remain. I am submitting this for "net", because we're still early in the release cycle. The patch applies to net-next too, but I think the rename causes needless divergence between those trees. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-06Merge branch 'master' of ↵David S. Miller1-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2019-09-05 1) Several xfrm interface fixes from Nicolas Dichtel: - Avoid an interface ID corruption on changelink. - Fix wrong intterface names in the logs. - Fix a list corruption when changing network namespaces. - Fix unregistation of the underying phydev. 2) Fix a potential warning when merging xfrm_plocy nodes. From Florian Westphal. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md modeHangbin Liu1-2/+2
In decode_session{4,6} there is a possibility that the skb dst dev is NULL, e,g, with tunnel collect_md mode, which will cause kernel crash. Here is what the code path looks like, for GRE: - ip6gre_tunnel_xmit - ip6gre_xmit_ipv6 - __gre6_xmit - ip6_tnl_xmit - if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE - icmpv6_send - icmpv6_route_lookup - xfrm_decode_session_reverse - decode_session4 - oif = skb_dst(skb)->dev->ifindex; <-- here - decode_session6 - oif = skb_dst(skb)->dev->ifindex; <-- here The reason is __metadata_dst_init() init dst->dev to NULL by default. We could not fix it in __metadata_dst_init() as there is no dev supplied. On the other hand, the skb_dst(skb)->dev is actually not needed as we called decode_session{4,6} via xfrm_decode_session_reverse(), so oif is not used by: fl4->flowi4_oif = reverse ? skb->skb_iif : oif; So make a dst dev check here should be clean and safe. v4: No changes. v3: No changes. v2: fix the issue in decode_session{4,6} instead of updating shared dst dev in {ip_md, ip6}_tunnel_xmit. Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-20xfrm: policy: avoid warning splat when merging nodesFlorian Westphal1-2/+4
syzbot reported a splat: xfrm_policy_inexact_list_reinsert+0x625/0x6e0 net/xfrm/xfrm_policy.c:877 CPU: 1 PID: 6756 Comm: syz-executor.1 Not tainted 5.3.0-rc2+ #57 Call Trace: xfrm_policy_inexact_node_reinsert net/xfrm/xfrm_policy.c:922 [inline] xfrm_policy_inexact_node_merge net/xfrm/xfrm_policy.c:958 [inline] xfrm_policy_inexact_insert_node+0x537/0xb50 net/xfrm/xfrm_policy.c:1023 xfrm_policy_inexact_alloc_chain+0x62b/0xbd0 net/xfrm/xfrm_policy.c:1139 xfrm_policy_inexact_insert+0xe8/0x1540 net/xfrm/xfrm_policy.c:1182 xfrm_policy_insert+0xdf/0xce0 net/xfrm/xfrm_policy.c:1574 xfrm_add_policy+0x4cf/0x9b0 net/xfrm/xfrm_user.c:1670 xfrm_user_rcv_msg+0x46b/0x720 net/xfrm/xfrm_user.c:2676 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2477 xfrm_netlink_rcv+0x74/0x90 net/xfrm/xfrm_user.c:2684 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x809/0x9a0 net/netlink/af_netlink.c:1328 netlink_sendmsg+0xa70/0xd30 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg net/socket.c:657 [inline] There is no reproducer, however, the warning can be reproduced by adding rules with ever smaller prefixes. The sanity check ("does the policy match the node") uses the prefix value of the node before its updated to the smaller value. To fix this, update the prefix earlier. The bug has no impact on tree correctness, this is only to prevent a false warning. Reported-by: syzbot+8cc27ace5f6972910b31@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-07-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-8/+7
Two cases of overlapping changes, nothing fancy. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05Merge branch 'master' of ↵David S. Miller1-8/+7
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2019-07-05 1) Fix xfrm selector prefix length validation for inter address family tunneling. From Anirudh Gupta. 2) Fix a memleak in pfkey. From Jeremy Sowden. 3) Fix SA selector validation to allow empty selectors again. From Nicolas Dichtel. 4) Select crypto ciphers for xfrm_algo, this fixes some randconfig builds. From Arnd Bergmann. 5) Remove a duplicated assignment in xfrm_bydst_resize. From Cong Wang. 6) Fix a hlist corruption on hash rebuild. From Florian Westphal. 7) Fix a memory leak when creating xfrm interfaces. From Nicolas Dichtel. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03xfrm: policy: fix bydst hlist corruption on hash rebuildFlorian Westphal1-5/+7
syzbot reported following spat: BUG: KASAN: use-after-free in __write_once_size include/linux/compiler.h:221 BUG: KASAN: use-after-free in hlist_del_rcu include/linux/rculist.h:455 BUG: KASAN: use-after-free in xfrm_hash_rebuild+0xa0d/0x1000 net/xfrm/xfrm_policy.c:1318 Write of size 8 at addr ffff888095e79c00 by task kworker/1:3/8066 Workqueue: events xfrm_hash_rebuild Call Trace: __write_once_size include/linux/compiler.h:221 [inline] hlist_del_rcu include/linux/rculist.h:455 [inline] xfrm_hash_rebuild+0xa0d/0x1000 net/xfrm/xfrm_policy.c:1318 process_one_work+0x814/0x1130 kernel/workqueue.c:2269 Allocated by task 8064: __kmalloc+0x23c/0x310 mm/slab.c:3669 kzalloc include/linux/slab.h:742 [inline] xfrm_hash_alloc+0x38/0xe0 net/xfrm/xfrm_hash.c:21 xfrm_policy_init net/xfrm/xfrm_policy.c:4036 [inline] xfrm_net_init+0x269/0xd60 net/xfrm/xfrm_policy.c:4120 ops_init+0x336/0x420 net/core/net_namespace.c:130 setup_net+0x212/0x690 net/core/net_namespace.c:316 The faulting address is the address of the old chain head, free'd by xfrm_hash_resize(). In xfrm_hash_rehash(), chain heads get re-initialized without any hlist_del_rcu: for (i = hmask; i >= 0; i--) INIT_HLIST_HEAD(odst + i); Then, hlist_del_rcu() gets called on the about to-be-reinserted policy when iterating the per-net list of policies. hlist_del_rcu() will then make chain->first be nonzero again: static inline void __hlist_del(struct hlist_node *n) { struct hlist_node *next = n->next; // address of next element in list struct hlist_node **pprev = n->pprev;// location of previous elem, this // can point at chain->first WRITE_ONCE(*pprev, next); // chain->first points to next elem if (next) next->pprev = pprev; Then, when we walk chainlist to find insertion point, we may find a non-empty list even though we're supposedly reinserting the first policy to an empty chain. To fix this first unlink all exact and inexact policies instead of zeroing the list heads. Add the commands equivalent to the syzbot reproducer to xfrm_policy.sh, without fix KASAN catches the corruption as it happens, SLUB poisoning detects it a bit later. Reported-by: syzbot+0165480d4ef07360eeda@syzkaller.appspotmail.com Fixes: 1548bc4e0512 ("xfrm: policy: delete inexact policies from inexact list on hash rebuild") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-07-02xfrm: remove a duplicated assignmentCong Wang1-3/+0
Fixes: 30846090a746 ("xfrm: policy: add sequence count to sync with hash resize") Cc: Florian Westphal <fw@strlen.de> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-06-06xfrm: remove state and template sort indirections from xfrm_state_afinfoFlorian Westphal1-1/+1
No module dependency, placing this in xfrm_state.c avoids need for an indirection. This also removes the state spinlock -- I don't see why we would need to hold it during sorting. This in turn allows to remove the 'net' argument passed to xfrm_tmpl_sort. Last, remove the EXPORT_SYMBOL, there are no modular callers. For the CONFIG_IPV6=m case, vmlinux size increase is about 300 byte. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-05-21Merge tag 'spdx-5.2-rc2' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull SPDX update from Greg KH: "Here is a series of patches that add SPDX tags to different kernel files, based on two different things: - SPDX entries are added to a bunch of files that we missed a year ago that do not have any license information at all. These were either missed because the tool saw the MODULE_LICENSE() tag, or some EXPORT_SYMBOL tags, and got confused and thought the file had a real license, or the files have been added since the last big sweep, or they were Makefile/Kconfig files, which we didn't touch last time. - Add GPL-2.0-only or GPL-2.0-or-later tags to files where our scan tools can determine the license text in the file itself. Where this happens, the license text is removed, in order to cut down on the 700+ different ways we have in the kernel today, in a quest to get rid of all of these. These patches have been out for review on the linux-spdx@vger mailing list, and while they were created by automatic tools, they were hand-verified by a bunch of different people, all whom names are on the patches are reviewers. The reason for these "large" patches is if we were to continue to progress at the current rate of change in the kernel, adding license tags to individual files in different subsystems, we would be finished in about 10 years at the earliest. There will be more series of these types of patches coming over the next few weeks as the tools and reviewers crunch through the more "odd" variants of how to say "GPLv2" that developers have come up with over the years, combined with other fun oddities (GPL + a BSD disclaimer?) that are being unearthed, with the goal for the whole kernel to be cleaned up. These diffstats are not small, 3840 files are touched, over 10k lines removed in just 24 patches" * tag 'spdx-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (24 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 24 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 23 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 22 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 21 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 20 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 19 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 18 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 17 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 15 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 14 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 12 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 11 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 10 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 9 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 7 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 5 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 4 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 3 ...
2019-05-21treewide: Add SPDX license identifier for missed filesThomas Gleixner1-0/+1
Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-16xfrm: ressurrect "Fix uninitialized memory read in _decode_session4"Florian Westphal1-11/+13
This resurrects commit 8742dc86d0c7a9628 ("xfrm4: Fix uninitialized memory read in _decode_session4"), which got lost during a merge conflict resolution between ipsec-next and net-next tree. c53ac41e3720 ("xfrm: remove decode_session indirection from afinfo_policy") in ipsec-next moved the (buggy) _decode_session4 from net/ipv4/xfrm4_policy.c to net/xfrm/xfrm_policy.c. In mean time, 8742dc86d0c7a was applied to ipsec.git and fixed the problem in the "old" location. When the trees got merged, the moved, old function was kept. This applies the "lost" commit again, to the new location. Fixes: a658a3f2ecbab ("Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
Three trivial overlapping conflicts. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-23xfrm: remove decode_session indirection from afinfo_policyFlorian Westphal1-9/+222
No external dependencies, might as well handle this directly. xfrm_afinfo_policy is now 40 bytes on x86_64. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-04-23xfrm: remove init_path indirection from afinfo_policyFlorian Westphal1-14/+7
handle this directly, its only used by ipv6. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-04-23xfrm: remove tos indirection from afinfo_policyFlorian Westphal1-11/+3
Only used by ipv4, we can read the fl4 tos value directly instead. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-04-08xfrm: store xfrm_mode directly, not its addressFlorian Westphal1-1/+1
This structure is now only 4 bytes, so its more efficient to cache a copy rather than its address. No significant size difference in allmodconfig vmlinux. With non-modular kernel that has all XFRM options enabled, this series reduces vmlinux image size by ~11kb. All xfrm_mode indirections are gone and all modes are built-in. before (ipsec-next master): text data bss dec filename 21071494 7233140 11104324 39408958 vmlinux.master after this series: 21066448 7226772 11104324 39397544 vmlinux.patched With allmodconfig kernel, the size increase is only 362 bytes, even all the xfrm config options removed in this series are modular. before: text data bss dec filename 15731286 6936912 4046908 26715106 vmlinux.master after this series: 15731492 6937068 4046908 26715468 vmlinux Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-04-08xfrm: make xfrm modes builtinFlorian Westphal1-1/+1
after previous changes, xfrm_mode contains no function pointers anymore and all modules defining such struct contain no code except an init/exit functions to register the xfrm_mode struct with the xfrm core. Just place the xfrm modes core and remove the modules, the run-time xfrm_mode register/unregister functionality is removed. Before: text data bss dec filename 7523 200 2364 10087 net/xfrm/xfrm_input.o 40003 628 440 41071 net/xfrm/xfrm_state.o 15730338 6937080 4046908 26714326 vmlinux 7389 200 2364 9953 net/xfrm/xfrm_input.o 40574 656 440 41670 net/xfrm/xfrm_state.o 15730084 6937068 4046908 26714060 vmlinux The xfrm*_mode_{transport,tunnel,beet} modules are gone. v2: replace CONFIG_INET6_XFRM_MODE_* IS_ENABLED guards with CONFIG_IPV6 ones rather than removing them. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-04-08xfrm: remove afinfo pointer from xfrm_modeFlorian Westphal1-1/+9
Adds an EXPORT_SYMBOL for afinfo_get_rcu, as it will now be called from ipv6 in case of CONFIG_IPV6=m. This change has virtually no effect on vmlinux size, but it reduces afinfo size and allows followup patch to make xfrm modes const. v2: mark if (afinfo) tests as likely (Sabrina) re-fetch afinfo according to inner_mode in xfrm_prepare_input(). Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-27xfrm: Honor original L3 slave device in xfrmi policy lookupMartin Willi1-1/+1
If an xfrmi is associated to a vrf layer 3 master device, xfrm_policy_check() fails after traffic decapsulation. The input interface is replaced by the layer 3 master device, and hence xfrmi_decode_session() can't match the xfrmi anymore to satisfy policy checking. Extend ingress xfrmi lookup to honor the original layer 3 slave device, allowing xfrm interfaces to operate within a vrf domain. Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-02-18xfrm: Fix inbound traffic via XFRM interfaces across network namespacesTobias Brunner1-1/+3
After moving an XFRM interface to another namespace it stays associated with the original namespace (net in `struct xfrm_if` and the list keyed with `xfrmi_net_id`), allowing processes in the new namespace to use SAs/policies that were created in the original namespace. For instance, this allows a keying daemon in one namespace to establish IPsec SAs for other namespaces without processes there having access to the keys or IKE credentials. This worked fine for outbound traffic, however, for inbound traffic the lookup for the interfaces and the policies used the incorrect namespace (the one the XFRM interface was moved to). Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Signed-off-by: Tobias Brunner <tobias@strongswan.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-01-16xfrm: Make set-mark default behavior backward compatibleBenedict Wong1-1/+4
Fixes 9b42c1f179a6, which changed the default route lookup behavior for tunnel mode SAs in the outbound direction to use the skb mark, whereas previously mark=0 was used if the output mark was unspecified. In mark-based routing schemes such as Android’s, this change in default behavior causes routing loops or lookup failures. This patch restores the default behavior of using a 0 mark while still incorporating the skb mark if the SET_MARK (and SET_MARK_MASK) is specified. Tested with additions to Android's kernel unit test suite: https://android-review.googlesource.com/c/kernel/tests/+/860150 Fixes: 9b42c1f179a6 ("xfrm: Extend the output_mark to support input direction and masking") Signed-off-by: Benedict Wong <benedictwong@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-01-09xfrm: policy: fix infinite loop when merging src-nodesFlorian Westphal1-8/+7
With very small change to test script we can trigger softlockup due to bogus assignment of 'p' (policy to be examined) on restart. Previously the two to-be-merged nodes had same address/prefixlength pair, so no erase/reinsert was necessary, we only had to append the list from node a to b. If prefix lengths are different, the node has to be deleted and re-inserted into the tree, with the updated prefix length. This was broken; due to bogus update to 'p' this loops forever. Add a 'restart' label and use that instead. While at it, don't perform the unneeded reinserts of the policies that are already sorted into the 'new' node. A previous patch in this series made xfrm_policy_inexact_list_reinsert() use the relative position indicator to sort policies according to age in case priorities are identical. Fixes: 6ac098b2a9d30 ("xfrm: policy: add 2nd-level saddr trees for inexact policies") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-01-09xfrm: policy: fix reinsertion on node mergeFlorian Westphal1-6/+8
"newpos" has wrong scope. It must be NULL on each iteration of the loop. Otherwise, when policy is to be inserted at the start, we would instead insert at point found by the previous loop-iteration instead. Also, we need to unlink the policy before we reinsert it to the new node, else we can get next-points-to-self loops. Because policies are only ordered by priority it is irrelevant which policy is "more recent" except when two policies have same priority. (the more recent one is placed after the older one). In these cases, we can use the ->pos id number to know which one is the 'older': the higher the id, the more recent the policy. So we only need to unlink all policies from the node that is about to be removed, and insert them to the replacement node. Fixes: 9cf545ebd591da ("xfrm: policy: store inexact policies in a tree ordered by destination address") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-01-09xfrm: policy: delete inexact policies from inexact list on hash rebuildFlorian Westphal1-13/+10
An xfrm hash rebuild has to reset the inexact policy list before the policies get re-inserted: A change of hash thresholds will result in policies to get moved from inexact tree to the policy hash table. If the thresholds are increased again later, they get moved from hash table to inexact tree. We must unlink all policies from the inexact tree before re-insertion. Otherwise 'migrate' may find policies that are in main hash table a second time, when it searches the inexact lists. Furthermore, re-insertion without deletion can cause elements ->next to point back to itself, causing soft lockups or double-frees. Reported-by: syzbot+9d971dd21eb26567036b@syzkaller.appspotmail.com Fixes: 9cf545ebd591da ("xfrm: policy: store inexact policies in a tree ordered by destination address") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-01-09xfrm: policy: increment xfrm_hash_generation on hash rebuildFlorian Westphal1-0/+2
Hash rebuild will re-set all the inexact entries, then re-insert them. Lookups that can occur in parallel will therefore not find any policies. This was safe when lookups were still guarded by rwlock. After rcu-ification, lookups check the hash_generation seqcount to detect when a hash resize takes place. Hash rebuild missed the needed increment. Hash resizes and hash rebuilds cannot occur in parallel (both acquire hash_resize_mutex), so just increment xfrm_hash_generation, like resize. Fixes: a7c44247f704e3 ("xfrm: policy: make xfrm_policy_lookup_bytype lockless") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-01-09xfrm: policy: use hlist rcu variants on inexact insert, part 2Florian Westphal1-2/+2
This function was modeled on the 'exact' insert one, which did not use the rcu variant either. When I fixed the 'exact' insert I forgot to propagate this to my development tree, so the inexact variant retained the bug. Fixes: 9cf545ebd591d ("xfrm: policy: store inexact policies in a tree ordered by destination address") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-12-20Merge branch 'master' of ↵David S. Miller1-3/+0
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2018-12-20 Two last patches for this release cycle: 1) Remove an unused variable in xfrm_policy_lookup_bytype(). From YueHaibing. 2) Fix possible infinite loop in __xfrm6_tunnel_alloc_spi(). Also from YueHaibing. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19net: use skb_sec_path helper in more placesFlorian Westphal1-8/+11
skb_sec_path gains 'const' qualifier to avoid xt_policy.c: 'skb_sec_path' discards 'const' qualifier from pointer target type same reasoning as previous conversions: Won't need to touch these spots anymore when skb->sp is removed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19xfrm: policy: remove set but not used variable 'priority'YueHaibing1-3/+0
Fixes gcc '-Wunused-but-set-variable' warning: net/xfrm/xfrm_policy.c: In function 'xfrm_policy_lookup_bytype': net/xfrm/xfrm_policy.c:2079:6: warning: variable 'priority' set but not used [-Wunused-but-set-variable] It not used since commit 6be3b0db6db8 ("xfrm: policy: add inexact policy search tree infrastructure") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-28xfrm: policy: fix policy hash rebuildFlorian Westphal1-4/+5
Dan Carpenter reports following static checker warning: net/xfrm/xfrm_policy.c:1316 xfrm_hash_rebuild() warn: 'dir' is out of bounds '3' vs '2' | 1280 /* reset the bydst and inexact table in all directions */ | 1281 xfrm_hash_reset_inexact_table(net); | 1282 | 1283 for (dir = 0; dir < XFRM_POLICY_MAX; dir++) { | ^^^^^^^^^^^^^^^^^^^^^ |dir == XFRM_POLICY_MAX at the end of this loop. | 1304 /* re-insert all policies by order of creation */ | 1305 list_for_each_entry_reverse(policy, &net->xfrm.policy_all, walk.all) { [..] | 1314 xfrm_policy_id2dir(policy->index)); | 1315 if (!chain) { | 1316 void *p = xfrm_policy_inexact_insert(policy, dir, 0); Fix this by updating 'dir' based on current policy. Otherwise, the inexact policies won't be found anymore during lookup, as they get hashed to a bogus bin. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: cc1bb845adc9 ("xfrm: policy: return NULL when inexact search needed") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-15xfrm: policy: fix netlink/pf_key policy lookupsFlorian Westphal1-1/+4
Colin Ian King says: Static analysis with CoverityScan found a potential issue [..] It seems that pointer pol is set to NULL and then a check to see if it is non-null is used to set pol to tmp; howeverm this check is always going to be false because pol is always NULL. Fix this and update test script to catch this. Updated script only: ./xfrm_policy.sh ; echo $? RTNETLINK answers: No such file or directory FAIL: ip -net ns3 xfrm policy get src 10.0.1.0/24 dst 10.0.2.0/24 dir out RTNETLINK answers: No such file or directory [..] PASS: policy before exception matches PASS: ping to .254 bypassed ipsec tunnel PASS: direct policy matches PASS: policy matches 1 Fixes: 6be3b0db6db ("xfrm: policy: add inexact policy search tree infrastructure") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-15xfrm: policy: add missing indentationColin Ian King1-1/+1
There is a missing indentation before the goto statement. Add it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-09xfrm: policy: add 2nd-level saddr trees for inexact policiesFlorian Westphal1-10/+108
This adds the fourth and final search class, containing policies where both saddr and daddr have prefix lengths (i.e., not wildcards). Inexact policies now end up in one of the following four search classes: 1. "Any:Any" list, containing policies where both saddr and daddr are wildcards or have very coarse prefixes, e.g. 10.0.0.0/8 and the like. 2. "saddr:any" list, containing policies with a fixed saddr/prefixlen, but without destination restrictions. These lists are stored in rbtree nodes; each node contains those policies matching saddr/prefixlen. 3. "Any:daddr" list. Similar to 2), except for policies where only the destinations are specified. 4. "saddr:daddr" lists, containing only those policies that match the given source/destination network. The root of the saddr/daddr nodes gets stored in the nodes of the 'daddr' tree. This diagram illustrates the list classes, and their placement in the lookup hierarchy: xfrm_pol_inexact_bin = hash(dir,type,family,if_id); | +---- root_d: sorted by daddr:prefix | | | xfrm_pol_inexact_node | | | +- root: sorted by saddr/prefix | | | | | xfrm_pol_inexact_node | | | | | + root: unused | | | | | + hhead: saddr:daddr policies | | | +- coarse policies and all any:daddr policies | +---- root_s: sorted by saddr:prefix | | | xfrm_pol_inexact_node | | | + root: unused | | | + hhead: saddr:any policies | +---- coarse policies and all any:any policies lookup for an inexact policy returns pointers to the four relevant list classes, after which each of the lists needs to be searched for the policy with the higher priority. This will only speed up lookups in case we have many policies and a sizeable portion of these have disjunct saddr/daddr addresses. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-09xfrm: policy: store inexact policies in a tree ordered by source addressFlorian Westphal1-4/+42
This adds the 'saddr:any' search class. It contains all policies that have a fixed saddr/prefixlen, but 'any' destination. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-09xfrm: policy: check reinserted policies match their nodeFlorian Westphal1-0/+32
validate the re-inserted policies match the lookup node. Policies that fail this test won't be returned in the candidate set. This is enabled by default for now, it should not cause noticeable reinsert slow down. Such reinserts are needed when we have to merge an existing node (e.g. for 10.0.0.0/28 because a overlapping subnet was added (e.g. 10.0.0.0/24), so whenever this happens existing policies have to be placed on the list of the new node. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-09xfrm: policy: store inexact policies in a tree ordered by destination addressFlorian Westphal1-6/+327
This adds inexact lists per destination network, stored in a search tree. Inexact lookups now return two 'candidate lists', the 'any' policies ('any' destionations), and a list of policies that share same daddr/prefix. Next patch will add a second search tree for 'saddr:any' policies so we can avoid placing those on the 'any:any' list too. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-09xfrm: policy: add inexact policy search tree infrastructureFlorian Westphal1-54/+247
At this time inexact policies are all searched in-order until the first match is found. After removal of the flow cache, this resolution has to be performed for every packetm resulting in major slowdown when number of inexact policies is high. This adds infrastructure to later sort inexact policies into a tree. This only introduces a single class: any:any. Next patch will add a search tree to pre-sort policies that have a fixed daddr/prefixlen, so in this patch the any:any class will still be used for all policies. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-09xfrm: policy: consider if_id when hashing inexact policyFlorian Westphal1-9/+16
This avoids searches of polices that cannot match in the first place due to different interface id by placing them in different bins. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-09xfrm: policy: store inexact policies in an rhashtableFlorian Westphal1-18/+332
Switch packet-path lookups for inexact policies to rhashtable. In this initial version, we now no longer need to search policies with non-matching address family and type. Next patch will add the if_id as well so lookups from the xfrm interface driver only need to search inexact policies for that device. Future patches will augment the hlist in each rhash bucket with a tree and pre-sort policies according to daddr/prefix. A single rhashtable is used. In order to avoid a full rhashtable walk on netns exit, the bins get placed on a pernet list, i.e. we add almost no cost for network namespaces that had no xfrm policies. The inexact lists are kept in place, and policies are added to both the per-rhash-inexact list and a pernet one. The latter is needed for the control plane to handle migrate -- these requests do not consider the if_id, so if we'd remove the inexact_list now we would have to search all hash buckets and then figure out which matching policy candidate is the most recent one -- this appears a bit harder than just keeping the 'old' inexact list for this purpose. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-09xfrm: policy: return NULL when inexact search neededFlorian Westphal1-2/+11
currently policy_hash_bysel() returns the hash bucket list (for exact policies), or the inexact list (when policy uses a prefix). Searching this inexact list is slow, so it might be better to pre-sort inexact lists into a tree or another data structure for faster searching. However, due to 'any' policies, that need to be searched in any case, doing so will require that 'inexact' policies need to be handled specially to decide the best search strategy. So change hash_bysel() and return NULL if the policy can't be handled via the policy hash table. Right now, we simply use the inexact list when this happens, but future patch can then implement a different strategy. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-09xfrm: policy: split list insertion into a helperFlorian Westphal1-16/+27
... so we can reuse this later without code duplication when we add policy to a second inexact list. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-11-09xfrm: security: iterate all, not inexact listsFlorian Westphal1-67/+26
currently all non-socket policies are either hashed in the dst table, or placed on the 'inexact list'. When flushing, we first walk the table, then the (per-direction) inexact lists. When we try and get rid of the inexact lists to having "n" inexact lists (e.g. per-af inexact lists, or sorted into a tree), this walk would become more complicated. Simplify this: walk the 'all' list and skip socket policies during traversal so we don't need to handle exact and inexact policies separately anymore. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-10-11xfrm: policy: use hlist rcu variants on insertFlorian Westphal1-4/+4
bydst table/list lookups use rcu, so insertions must use rcu versions. Fixes: a7c44247f704e ("xfrm: policy: make xfrm_policy_lookup_bytype lockless") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>