Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull networking updates from David Miller:
1) BPF debugger and asm tool by Daniel Borkmann.
2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.
3) Correct reciprocal_divide and update users, from Hannes Frederic
Sowa and Daniel Borkmann.
4) Currently we only have a "set" operation for the hw timestamp socket
ioctl, add a "get" operation to match. From Ben Hutchings.
5) Add better trace events for debugging driver datapath problems, also
from Ben Hutchings.
6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we
have a small send and a previous packet is already in the qdisc or
device queue, defer until TX completion or we get more data.
7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.
8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
Borkmann.
9) Share IP header compression code between Bluetooth and IEEE802154
layers, from Jukka Rissanen.
10) Fix ipv6 router reachability probing, from Jiri Benc.
11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.
12) Support tunneling in GRO layer, from Jerry Chu.
13) Allow bonding to be configured fully using netlink, from Scott
Feldman.
14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
already get the TCI. From Atzm Watanabe.
15) New "Heavy Hitter" qdisc, from Terry Lam.
16) Significantly improve the IPSEC support in pktgen, from Fan Du.
17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom
Herbert.
18) Add Proportional Integral Enhanced packet scheduler, from Vijay
Subramanian.
19) Allow openvswitch to mmap'd netlink, from Thomas Graf.
20) Key TCP metrics blobs also by source address, not just destination
address. From Christoph Paasch.
21) Support 10G in generic phylib. From Andy Fleming.
22) Try to short-circuit GRO flow compares using device provided RX
hash, if provided. From Tom Herbert.
The wireless and netfilter folks have been busy little bees too.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
net/cxgb4: Fix referencing freed adapter
ipv6: reallocate addrconf router for ipv6 address when lo device up
fib_frontend: fix possible NULL pointer dereference
rtnetlink: remove IFLA_BOND_SLAVE definition
rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
qlcnic: update version to 5.3.55
qlcnic: Enhance logic to calculate msix vectors.
qlcnic: Refactor interrupt coalescing code for all adapters.
qlcnic: Update poll controller code path
qlcnic: Interrupt code cleanup
qlcnic: Enhance Tx timeout debugging.
qlcnic: Use bool for rx_mac_learn.
bonding: fix u64 division
rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
sfc: Use the correct maximum TX DMA ring size for SFC9100
Add Shradha Shah as the sfc driver maintainer.
net/vxlan: Share RX skb de-marking and checksum checks with ovs
tulip: cleanup by using ARRAY_SIZE()
ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
net/cxgb4: Don't retrieve stats during recovery
...
|
|
Pull audit update from Eric Paris:
"Again we stayed pretty well contained inside the audit system.
Venturing out was fixing a couple of function prototypes which were
inconsistent (didn't hurt anything, but we used the same value as an
int, uint, u32, and I think even a long in a couple of places).
We also made a couple of minor changes to when a couple of LSMs called
the audit system. We hoped to add aarch64 audit support this go
round, but it wasn't ready.
I'm disappearing on vacation on Thursday. I should have internet
access, but it'll be spotty. If anything goes wrong please be sure to
cc rgb@redhat.com. He'll make fixing things his top priority"
* git://git.infradead.org/users/eparis/audit: (50 commits)
audit: whitespace fix in kernel-parameters.txt
audit: fix location of __net_initdata for audit_net_ops
audit: remove pr_info for every network namespace
audit: Modify a set of system calls in audit class definitions
audit: Convert int limit uses to u32
audit: Use more current logging style
audit: Use hex_byte_pack_upper
audit: correct a type mismatch in audit_syscall_exit()
audit: reorder AUDIT_TTY_SET arguments
audit: rework AUDIT_TTY_SET to only grab spin_lock once
audit: remove needless switch in AUDIT_SET
audit: use define's for audit version
audit: documentation of audit= kernel parameter
audit: wait_for_auditd rework for readability
audit: update MAINTAINERS
audit: log task info on feature change
audit: fix incorrect set of audit_sock
audit: print error message when fail to create audit socket
audit: fix dangling keywords in audit_log_set_loginuid() output
audit: log on errors from filter user rules
...
|
|
This patch removes the net_random and net_srandom macros and replaces
them with direct calls to the prandom ones. As new commits only seem to
use prandom_u32 there is no use to keep them around.
This change makes it easier to grep for users of prandom_u32.
Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Conflicts:
net/xfrm/xfrm_policy.c
Steffen Klassert says:
====================
This pull request has a merge conflict between commits be7928d20bab
("net: xfrm: xfrm_policy: fix inline not at beginning of declaration") and
da7c224b1baa ("net: xfrm: xfrm_policy: silence compiler warning") from
the net-next tree and commit 2f3ea9a95c58 ("xfrm: checkpatch erros with
inline keyword position") from the ipsec-next tree.
The version from net-next can be used, like it is done in linux-next.
1) Checkpatch cleanups, from Weilong Chen.
2) Fix lockdep complaints when pktgen is used with IPsec,
from Fan Du.
3) Update pktgen to allow any combination of IPsec transport/tunnel mode
and AH/ESP/IPcomp type, from Fan Du.
4) Make pktgen_dst_metrics static, Fengguang Wu.
5) Compile fix for pktgen when CONFIG_XFRM is not set,
from Fan Du.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Right now the sessionid value in the kernel is a combination of u32,
int, and unsigned int. Just use unsigned int throughout.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
|
|
Fix below compiler warning:
net/xfrm/xfrm_policy.c:1644:12: warning: ‘xfrm_dst_alloc_copy’ defined but not used [-Wunused-function]
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix three warnings related to:
net/xfrm/xfrm_policy.c:1644:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
net/xfrm/xfrm_policy.c:1656:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
net/xfrm/xfrm_policy.c:1668:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
Just removing the inline keyword is sufficient as the compiler will
decide on its own about inlining or not.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduce xfrm_state_lookup_byspi to find user specified by custom
from "pgset spi xxx". Using this scheme, any flow regardless its
saddr/daddr could be transform by SA specified with configurable
spi.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Acquiring xfrm_state_lock in process context is expected to turn BH off,
as this lock is also used in BH context, namely xfrm state timer handler.
Otherwise it surprises LOCKDEP with below messages.
[ 81.422781] pktgen: Packet Generator for packet performance testing. Version: 2.74
[ 81.725194]
[ 81.725211] =========================================================
[ 81.725212] [ INFO: possible irq lock inversion dependency detected ]
[ 81.725215] 3.13.0-rc2+ #92 Not tainted
[ 81.725216] ---------------------------------------------------------
[ 81.725218] kpktgend_0/2780 just changed the state of lock:
[ 81.725220] (xfrm_state_lock){+.+...}, at: [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0
[ 81.725231] but this lock was taken by another, SOFTIRQ-safe lock in the past:
[ 81.725232] (&(&x->lock)->rlock){+.-...}
[ 81.725232]
[ 81.725232] and interrupts could create inverse lock ordering between them.
[ 81.725232]
[ 81.725235]
[ 81.725235] other info that might help us debug this:
[ 81.725237] Possible interrupt unsafe locking scenario:
[ 81.725237]
[ 81.725238] CPU0 CPU1
[ 81.725240] ---- ----
[ 81.725241] lock(xfrm_state_lock);
[ 81.725243] local_irq_disable();
[ 81.725244] lock(&(&x->lock)->rlock);
[ 81.725246] lock(xfrm_state_lock);
[ 81.725248] <Interrupt>
[ 81.725249] lock(&(&x->lock)->rlock);
[ 81.725251]
[ 81.725251] *** DEADLOCK ***
[ 81.725251]
[ 81.725254] no locks held by kpktgend_0/2780.
[ 81.725255]
[ 81.725255] the shortest dependencies between 2nd lock and 1st lock:
[ 81.725269] -> (&(&x->lock)->rlock){+.-...} ops: 8 {
[ 81.725274] HARDIRQ-ON-W at:
[ 81.725276] [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70
[ 81.725282] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
[ 81.725284] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
[ 81.725289] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290
[ 81.725292] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40
[ 81.725300] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0
[ 81.725303] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0
[ 81.725305] [<ffffffff8105a026>] irq_exit+0x96/0xc0
[ 81.725308] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60
[ 81.725313] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80
[ 81.725316] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30
[ 81.725329] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0
[ 81.725333] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0
[ 81.725338] IN-SOFTIRQ-W at:
[ 81.725340] [<ffffffff8109a61d>] __lock_acquire+0x62d/0x1d70
[ 81.725342] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
[ 81.725344] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
[ 81.725347] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290
[ 81.725349] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40
[ 81.725352] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0
[ 81.725355] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0
[ 81.725358] [<ffffffff8105a026>] irq_exit+0x96/0xc0
[ 81.725360] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60
[ 81.725363] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80
[ 81.725365] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30
[ 81.725368] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0
[ 81.725370] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0
[ 81.725373] INITIAL USE at:
[ 81.725375] [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70
[ 81.725385] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
[ 81.725388] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
[ 81.725390] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290
[ 81.725394] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40
[ 81.725398] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0
[ 81.725401] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0
[ 81.725404] [<ffffffff8105a026>] irq_exit+0x96/0xc0
[ 81.725407] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60
[ 81.725409] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80
[ 81.725412] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30
[ 81.725415] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0
[ 81.725417] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0
[ 81.725420] }
[ 81.725421] ... key at: [<ffffffff8295b9c8>] __key.46349+0x0/0x8
[ 81.725445] ... acquired at:
[ 81.725446] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
[ 81.725449] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
[ 81.725452] [<ffffffff816dc057>] __xfrm_state_delete+0x37/0x140
[ 81.725454] [<ffffffff816dc18c>] xfrm_state_delete+0x2c/0x50
[ 81.725456] [<ffffffff816dc277>] xfrm_state_flush+0xc7/0x1b0
[ 81.725458] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key]
[ 81.725465] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key]
[ 81.725468] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key]
[ 81.725471] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0
[ 81.725476] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130
[ 81.725479] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10
[ 81.725482] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b
[ 81.725484]
[ 81.725486] -> (xfrm_state_lock){+.+...} ops: 11 {
[ 81.725490] HARDIRQ-ON-W at:
[ 81.725493] [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70
[ 81.725504] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
[ 81.725507] [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70
[ 81.725510] [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0
[ 81.725513] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key]
[ 81.725516] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key]
[ 81.725519] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key]
[ 81.725522] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0
[ 81.725525] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130
[ 81.725527] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10
[ 81.725530] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b
[ 81.725533] SOFTIRQ-ON-W at:
[ 81.725534] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70
[ 81.725537] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
[ 81.725539] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
[ 81.725541] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0
[ 81.725544] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen]
[ 81.725547] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen]
[ 81.725550] [<ffffffff81078f84>] kthread+0xe4/0x100
[ 81.725555] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0
[ 81.725565] INITIAL USE at:
[ 81.725567] [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70
[ 81.725569] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
[ 81.725572] [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70
[ 81.725574] [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0
[ 81.725576] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key]
[ 81.725580] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key]
[ 81.725583] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key]
[ 81.725586] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0
[ 81.725589] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130
[ 81.725594] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10
[ 81.725597] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b
[ 81.725599] }
[ 81.725600] ... key at: [<ffffffff81cadef8>] xfrm_state_lock+0x18/0x50
[ 81.725606] ... acquired at:
[ 81.725607] [<ffffffff810995c0>] check_usage_backwards+0x110/0x150
[ 81.725609] [<ffffffff81099e96>] mark_lock+0x196/0x2f0
[ 81.725611] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70
[ 81.725614] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
[ 81.725616] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
[ 81.725627] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0
[ 81.725629] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen]
[ 81.725632] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen]
[ 81.725635] [<ffffffff81078f84>] kthread+0xe4/0x100
[ 81.725637] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0
[ 81.725640]
[ 81.725641]
[ 81.725641] stack backtrace:
[ 81.725645] CPU: 0 PID: 2780 Comm: kpktgend_0 Not tainted 3.13.0-rc2+ #92
[ 81.725647] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
[ 81.725649] ffffffff82537b80 ffff880018199988 ffffffff8176af37 0000000000000007
[ 81.725652] ffff8800181999f0 ffff8800181999d8 ffffffff81099358 ffffffff82537b80
[ 81.725655] ffffffff81a32def ffff8800181999f4 0000000000000000 ffff880002cbeaa8
[ 81.725659] Call Trace:
[ 81.725664] [<ffffffff8176af37>] dump_stack+0x46/0x58
[ 81.725667] [<ffffffff81099358>] print_irq_inversion_bug.part.42+0x1e8/0x1f0
[ 81.725670] [<ffffffff810995c0>] check_usage_backwards+0x110/0x150
[ 81.725672] [<ffffffff81099e96>] mark_lock+0x196/0x2f0
[ 81.725675] [<ffffffff810994b0>] ? check_usage_forwards+0x150/0x150
[ 81.725685] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70
[ 81.725691] [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90
[ 81.725694] [<ffffffff81089b38>] ? sched_clock_cpu+0xa8/0x120
[ 81.725697] [<ffffffff8109a31a>] ? __lock_acquire+0x32a/0x1d70
[ 81.725699] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0
[ 81.725702] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130
[ 81.725704] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0
[ 81.725707] [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90
[ 81.725710] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70
[ 81.725712] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0
[ 81.725715] [<ffffffff810971ec>] ? lock_release_holdtime.part.26+0x1c/0x1a0
[ 81.725717] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0
[ 81.725721] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen]
[ 81.725724] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen]
[ 81.725727] [<ffffffffa008ba71>] ? pktgen_thread_worker+0xb11/0x1880 [pktgen]
[ 81.725729] [<ffffffff8109cf9d>] ? trace_hardirqs_on+0xd/0x10
[ 81.725733] [<ffffffff81775410>] ? _raw_spin_unlock_irq+0x30/0x40
[ 81.725745] [<ffffffff8151faa0>] ? e1000_clean+0x9d0/0x9d0
[ 81.725751] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60
[ 81.725753] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60
[ 81.725757] [<ffffffffa008af60>] ? mod_cur_headers+0x7f0/0x7f0 [pktgen]
[ 81.725759] [<ffffffff81078f84>] kthread+0xe4/0x100
[ 81.725762] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170
[ 81.725765] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0
[ 81.725768] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Fix that "else should follow close brace '}'".
Signed-off-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Fix checkpatch error "space prohibited xxx".
Signed-off-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This patch clean up some checkpatch errors like this:
ERROR: "foo * bar" should be "foo *bar"
ERROR: "(foo*)" should be "(foo *)"
Signed-off-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This patch cleanup some space errors.
Signed-off-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
In order to check against valid IPcomp spi range, export verify_userspi_info
for both pfkey and netlink interface.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
IPComp connection between two hosts is broken if given spi bigger
than 0xffff.
OUTSPI=0x87
INSPI=0x11112
ip xfrm policy update dst 192.168.1.101 src 192.168.1.109 dir out action allow \
tmpl dst 192.168.1.101 src 192.168.1.109 proto comp spi $OUTSPI
ip xfrm policy update src 192.168.1.101 dst 192.168.1.109 dir in action allow \
tmpl src 192.168.1.101 dst 192.168.1.109 proto comp spi $INSPI
ip xfrm state add src 192.168.1.101 dst 192.168.1.109 proto comp spi $INSPI \
comp deflate
ip xfrm state add dst 192.168.1.101 src 192.168.1.109 proto comp spi $OUTSPI \
comp deflate
tcpdump can capture outbound ping packet, but inbound packet is
dropped with XfrmOutNoStates errors. It looks like spi value used
for IPComp is expected to be 16bits wide only.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
We now queue packets to the policy if the states are not yet resolved,
this replaces the ancient sleeping code. Also the sleeping can cause
indefinite task hangs if the needed state does not get resolved.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
By semantics, xfrm layer is fully name space aware,
so will the locks, e.g. xfrm_state/pocliy_lock.
Ensure exclusive access into state/policy link list
for different name space with one global lock is not
right in terms of semantics aspect at first place,
as they are indeed mutually independent with each
other, but also more seriously causes scalability
problem.
One practical scenario is on a Open Network Stack,
more than hundreds of lxc tenants acts as routers
within one host, a global xfrm_state/policy_lock
becomes the bottleneck. But onces those locks are
decoupled in a per-namespace fashion, locks contend
is just with in specific name space scope, without
causing additional SPD/SAD access delay for other
name space.
Also this patch improve scalability while as without
changing original xfrm behavior.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
because the home agent could surely be run on a different
net namespace other than init_net. The original behavior
could lead into inconsistent of key info.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
xfrm code always searches for unused policy index for
newly created policy regardless whether or not user
space policy index hint supplied.
This patch enables such feature so that using
"ip xfrm ... index=xxx" can be used by user to set
specific policy index.
Currently this beahvior is broken, so this patch make
it happen as expected.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This function has usage beside IPsec so move it to the core skbuff code.
While doing so, give it some documentation and change its return type to
'unsigned char *' to be in line with skb_put().
Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Conflicts:
drivers/net/ethernet/emulex/benet/be.h
drivers/net/netconsole.c
net/bridge/br_private.h
Three mostly trivial conflicts.
The net/bridge/br_private.h conflict was a function signature (argument
addition) change overlapping with the extern removals from Joe Perches.
In drivers/net/netconsole.c we had one change adjusting a printk message
whilst another changed "printk(KERN_INFO" into "pr_info(".
Lastly, the emulex change was a new inline function addition overlapping
with Joe Perches's extern removals.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Conflicts:
net/xfrm/xfrm_policy.c
Minor merge conflict in xfrm_policy.c, consisting of overlapping
changes which were trivial to resolve.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Conflicts:
drivers/net/usb/qmi_wwan.c
include/net/dst.h
Trivial merge conflicts, both were overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It does not make sense to queue retransmitted packets if the
original packet is still in some queue of this host. So add
a check to xdst_queue_output() and drop the packet if the
original packet is not yet sent.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
|
|
scratches are per cpu, we can use vmalloc_node() for proper
NUMA affinity.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.
Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In ipcomp_compress(), sortirq is enabled too early, allowing the
per-cpu scratch buffer to be rewritten by ipcomp_decompress()
(called on the same CPU in softirq context) between populating
the buffer and copying the compressed data to the skb.
v2: as pointed out by Steffen Klassert, if we also move the
local_bh_disable() before reading the per-cpu pointers, we can
get rid of get_cpu()/put_cpu().
v3: removed ipcomp_decompress part (as explained by Herbert Xu,
it cannot be called from process context), get rid of cpu
variable (thanks to Eric Dumazet)
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
We might dreference a NULL pointer if the hold_queue is empty,
so add a check to avoid this.
Bug was introduced with git commit a0073fe18 ("xfrm: Add a state
resolution packet queue")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
We need to ensure that policies can't go away as long as the hold timer
is armed, so take a refcont when we arm the timer and drop one if we
delete it.
Bug was introduced with git commit a0073fe18 ("xfrm: Add a state
resolution packet queue")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
__xfrm4/6_state_addr_check is a four steps check, all we need to do
is checking whether the destination address match when looking SA
using wildcard source address. Passing saddr from flow is worst option,
as the checking needs to reach the fourth step while actually only
one time checking will do the work.
So, simplify this process by only checking destination address when
using wildcard source address for looking up SAs.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
If SA is in the process of acquiring, which indicates this SA is more
promising and precise than the fall back option, i.e. using wild card
source address for searching less suitable SA.
So, here bail out, and try again.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Conflicts:
include/net/xfrm.h
Simple conflict between Joe Perches "extern" removal for function
declarations in header files and the changes in Steffen's tree.
Steffen Klassert says:
====================
Two patches that are left from the last development cycle.
Manual merging of include/net/xfrm.h is needed. The conflict
can be solved as it is currently done in linux-next.
1) We announce the creation of temporary acquire state via an asyc event,
so the deletion should be annunced too. From Nicolas Dichtel.
2) The VTI tunnels do not real tunning, they just provide a routable
IPsec tunnel interface. So introduce and use xfrm_tunnel_notifier
instead of xfrm_tunnel for xfrm tunnel mode callback. From Fan Du.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If asynchronous events are enabled for a particular netlink socket,
the notify function is called by the advance function. The notify
function creates and dispatches a km_event if a replay timeout occurred,
or at least replay_maxdiff packets have been received since the last
asynchronous event has been sent. The function is supposed to return if
neither of the two events were detected for a state, or replay_maxdiff
is equal to zero.
Replay_maxdiff is initialized in xfrm_state_construct to the value of
the xfrm.sysctl_aevent_rseqth (2 by default), and updated if for a state
if the netlink attribute XFRMA_REPLAY_THRESH is set.
If, however, replay_maxdiff is set to zero, then all of the three notify
implementations perform a break from the switch statement instead of
checking whether a timeout occurred, and -- if not -- return. As a
result an asynchronous event is generated for every replay update of a
state that has a zero replay_maxdiff value.
This patch modifies the notify functions such that they immediately
return if replay_maxdiff has the value zero, unless a timeout occurred.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
For legacy IPsec anti replay mechanism:
bitmap in struct xfrm_replay_state could only provide a 32 bits
window size limit in current design, thus user level parameter
sadb_sa_replay should honor this limit, otherwise misleading
outputs("replay=244") by setkey -D will be:
192.168.25.2 192.168.22.2
esp mode=transport spi=147561170(0x08cb9ad2) reqid=0(0x00000000)
E: aes-cbc 9a8d7468 7655cf0b 719d27be b0ddaac2
A: hmac-sha1 2d2115c2 ebf7c126 1c54f186 3b139b58 264a7331
seq=0x00000000 replay=244 flags=0x00000000 state=mature
created: Sep 17 14:00:00 2013 current: Sep 17 14:00:22 2013
diff: 22(s) hard: 30(s) soft: 26(s)
last: Sep 17 14:00:00 2013 hard: 0(s) soft: 0(s)
current: 1408(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 22 hard: 0 soft: 0
sadb_seq=1 pid=4854 refcnt=0
192.168.22.2 192.168.25.2
esp mode=transport spi=255302123(0x0f3799eb) reqid=0(0x00000000)
E: aes-cbc 6485d990 f61a6bd5 e5660252 608ad282
A: hmac-sha1 0cca811a eb4fa893 c47ae56c 98f6e413 87379a88
seq=0x00000000 replay=244 flags=0x00000000 state=mature
created: Sep 17 14:00:00 2013 current: Sep 17 14:00:22 2013
diff: 22(s) hard: 30(s) soft: 26(s)
last: Sep 17 14:00:00 2013 hard: 0(s) soft: 0(s)
current: 1408(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 22 hard: 0 soft: 0
sadb_seq=0 pid=4854 refcnt=0
And also, optimizing xfrm_replay_check window checking by setting the
desirable x->props.replay_window with only doing the comparison once
for all when xfrm_state is first born.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
We pass the wrong netlink attribute to xfrm_replay_verify_len().
It should be XFRMA_REPLAY_ESN_VAL and not XFRMA_REPLAY_VAL as
we currently doing. This causes memory corruptions if the
replay esn attribute has incorrect length. Fix this by passing
the right attribute to xfrm_replay_verify_len().
Reported-by: Michael Rossberg <michael.rossberg@tu-ilmenau.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Conflicts:
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
net/bridge/br_multicast.c
net/ipv6/sit.c
The conflicts were minor:
1) sit.c changes overlap with change to ip_tunnel_xmit() signature.
2) br_multicast.c had an overlap between computing max_delay using
msecs_to_jiffies and turning MLDV2_MRC() into an inline function
with a name using lowercase instead of uppercase letters.
3) stmmac had two overlapping changes, one which conditionally allocated
and hooked up a dma_cfg based upon the presence of the pbl OF property,
and another one handling store-and-forward DMA made. The latter of
which should not go into the new of_find_property() basic block.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The net_device might be not set on the skb when we try refcounting.
This leads to a null pointer dereference in xdst_queue_output().
It turned out that the refcount to the net_device is not needed
after all. The dst_entry has a refcount to the net_device before
we queue the skb, so it can't go away. Therefore we can remove the
refcount on queueing to fix the null pointer dereference.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Creation of temporary SA are announced by netlink, but there is no notification
for the deletion.
This patch fix this asymmetric situation.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This patch removes a comment in xfrm_input() which became irrelevant
due to commit 2774c13, "xfrm: Handle blackhole route creation via afinfo".
That commit removed returning -EREMOTE in the xfrm_lookup() method when the
packet should be discarded and also removed the correspoinding -EREMOTE
handlers. This was replaced by calling the make_blackhole() method. Therefore
the comment about -EREMOTE is not relevant anymore.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
We need to choose the protocol family by skb->protocol. Otherwise we
call the wrong xfrm{4,6}_local_error handler in case an ipv6 sockets is
used in ipv4 mode, in which case we should call down to xfrm4_local_error
(ip6 sockets are a superset of ip4 ones).
We are called before before ip_output functions, so skb->protocol is
not reset.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
xfrm_state timer should be independent of system clock change,
so switch to CLOCK_BOOTTIME base which is not only monotonic but
also counting suspend time.
Thus issue reported in commit: 9e0d57fd6dad37d72a3ca6db00ca8c76f2215454
("xfrm: SAD entries do not expire correctly after suspend-resume")
could ALSO be avoided.
v2: Use CLOCK_BOOTTIME to count suspend time, but still monotonic.
Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
In xfrm4 and xfrm6 we need to take care about sockets of the other
address family. This could happen because a 6in4 or 4in6 tunnel could
get protected by ipsec.
Because we don't want to have a run-time dependency on ipv6 when only
using ipv4 xfrm we have to embed a pointer to the correct local_error
function in xfrm_state_afinet and look it up when returning an error
depending on the socket address family.
Thanks to vi0ss for the great bug report:
<https://bugzilla.kernel.org/show_bug.cgi?id=58691>
v2:
a) fix two more unsafe interpretations of skb->sk as ipv6 socket
(xfrm6_local_dontfrag and __xfrm6_output)
v3:
a) add an EXPORT_SYMBOL_GPL(xfrm_local_error) to fix a link error when
building ipv6 as a module (thanks to Steffen Klassert)
Reported-by: <vi0oss@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Both policy timer and hold_timer need to be deleted when destroy policy
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
The mark argument is read only, so constify it. Also make dummy_mark in
af_key const -- only used as dummy argument for this very function.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Current net name space has only one genid for both IPv4 and IPv6, it has below
drawbacks:
- Add/delete an IPv4 address will invalidate all IPv6 routing table entries.
- Insert/remove XFRM policy will also invalidate both IPv4/IPv6 routing table
entries even when the policy is only applied for one address family.
Thus, this patch attempt to split one genid for two to cater for IPv4 and IPv6
separately in a fine granularity.
Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
Just one patch this time.
1) Drop packets when the matching SA is in larval state and add a
statistic counter for that. From Fan Du.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When host ping its peer, ICMP echo request packet triggers IPsec
policy, then host negotiates SA secret with its peer. After IKE
installed SA for OUT direction, but before SA for IN direction
installed, host get ICMP echo reply from its peer. At the time
being, the SA state for IN direction could be XFRM_STATE_ACQ,
then the received packet will be dropped after adding
LINUX_MIB_XFRMINSTATEINVALID statistic.
Adding a LINUX_MIB_XFRMACQUIREERROR statistic counter for such
scenario when SA in larval state is much clearer for user than
LINUX_MIB_XFRMINSTATEINVALID which indicates the SA is totally
bad.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Merge 'net' bug fixes into 'net-next' as we have patches
that will build on top of them.
This merge commit includes a change from Emil Goode
(emilgoode@gmail.com) that fixes a warning that would
have been introduced by this merge. Specifically it
fixes the pingv6_ops method ipv6_chk_addr() to add a
"const" to the "struct net_device *dev" argument and
likewise update the dummy_ipv6_chk_addr() declaration.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Clean up unnecessary assignment and jump. While there, fix up the label
name.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|