summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2015-07-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds12-32/+33
Pull networking fixes from David Miller: 1) mlx4 driver bug fixes (TX queue wakeups, csum complete indications) from Ido Shamay, Eran Ben Elisha, and Or Gerlitz. 2) Missing unlock in error path of PTP support in renesas driver, from Dan Carpenter. 3) Add Vitesse 8641 phy IDs to vitesse PHY driver, from Shaohui Xie. 4) Bnx2x driver bug fixes (linearization of encap packets, scratchpad parity error notifications, flow-control and speed settings) from Yuval Mintz, Manish Chopra, Shahed Shaikh, and Ariel Elior. 5) ipv6 extension header parsing in the igb chip has a HW errata, disable it. Frm Todd Fujinaka. 6) Fix PCI link state locking issue in e1000e driver, from Yanir Lubetkin. 7) Cure panics during MTU change in i40e, from Mitch Williams. 8) Don't leak promisc refs in DSA slave driver, from Gilad Ben-Yossef. 9) Add missing HAS_DMA dep to VIA Rhine driver, from Geery Uytterhoeven. 10) Make sure DMA map/unmap calls are symmetric in bnx2x driver, from Michal Schmidt. 11) Workaround for MDIO access problems in bcm7xxx devices, from FLorian Fainelli. 12) Fix races in SCTP protocol between OTTB responses and route removals, from Alexander Sverdlin. 13) Fix jumbo frame checksum issue with some mvneta devices, from Simon Guinot. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (58 commits) sock_diag: don't broadcast kernel sockets net: mvneta: disable IP checksum with jumbo frames for Armada 370 ARM: mvebu: update Ethernet compatible string for Armada XP net: mvneta: introduce compatible string "marvell, armada-xp-neta" api: fix compatibility of linux/in.h with netinet/in.h net: icplus: fix typo in constant name sis900: Trivial: Fix typos in enums stmmac: Trivial: fix typo in constant name sctp: Fix race between OOTB responce and route removal net-Liquidio: Delete unnecessary checks before the function call "vfree" vmxnet3: Bump up driver version number amd-xgbe: Add the __GFP_NOWARN flag to Rx buffer allocation net: phy: mdio-bcm-unimac: workaround initial read failures for integrated PHYs net: bcmgenet: workaround initial read failures for integrated PHYs net: phy: bcm7xxx: workaround MDIO management controller initial read bnx2x: fix DMA API usage net: via: VIA_RHINE and VIA_VELOCITY should depend on HAS_DMA net/phy: tune get_phy_c45_ids to support more c45 phy bnx2x: fix lockdep splat net: fec: don't access RACC register when not available ...
2015-07-01Merge tag 'modules-next-for-linus' of ↵Linus Torvalds3-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell: "Main excitement here is Peter Zijlstra's lockless rbtree optimization to speed module address lookup. He found some abusers of the module lock doing that too. A little bit of parameter work here too; including Dan Streetman's breaking up the big param mutex so writing a parameter can load another module (yeah, really). Unfortunately that broke the usual suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were appended too" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits) modules: only use mod->param_lock if CONFIG_MODULES param: fix module param locks when !CONFIG_SYSFS. rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE() module: add per-module param_lock module: make perm const params: suppress unused variable error, warn once just in case code changes. modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'. kernel/module.c: avoid ifdefs for sig_enforce declaration kernel/workqueue.c: remove ifdefs over wq_power_efficient kernel/params.c: export param_ops_bool_enable_only kernel/params.c: generalize bool_enable_only kernel/module.c: use generic module param operaters for sig_enforce kernel/params: constify struct kernel_param_ops uses sysfs: tightened sysfs permission checks module: Rework module_addr_{min,max} module: Use __module_address() for module_address_lookup() module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING module: Optimize __module_address() using a latched RB-tree rbtree: Implement generic latch_tree seqlock: Introduce raw_read_seqcount_latch() ...
2015-06-30sock_diag: don't broadcast kernel socketsCraig Gallek1-1/+1
Kernel sockets do not hold a reference for the network namespace to which they point. Socket destruction broadcasting relies on the network namespace and will cause the splat below when a kernel socket is destroyed. This fix simply ignores kernel sockets when they are destroyed. Reported as: general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC CPU: 1 PID: 9130 Comm: kworker/1:1 Not tainted 4.1.0-gelk-debug+ #1 Workqueue: sock_diag_events sock_diag_broadcast_destroy_work Stack: ffff8800b9c586c0 ffff8800b9c586c0 ffff8800ac4692c0 ffff8800936d4a90 ffff8800352efd38 ffffffff8469a93e ffff8800352efd98 ffffffffc09b9b90 ffff8800352efd78 ffff8800ac4692c0 ffff8800b9c586c0 ffff8800831b6ab8 Call Trace: [<ffffffff8469a93e>] ? mutex_unlock+0xe/0x10 [<ffffffffc09b9b90>] ? inet_diag_handler_get_info+0x110/0x1fb [inet_diag] [<ffffffff845c868d>] netlink_broadcast+0x1d/0x20 [<ffffffff8469a93e>] ? mutex_unlock+0xe/0x10 [<ffffffff845b2bf5>] sock_diag_broadcast_destroy_work+0xd5/0x160 [<ffffffff8408ea97>] process_one_work+0x147/0x420 [<ffffffff8408f0f9>] worker_thread+0x69/0x470 [<ffffffff8409fda3>] ? preempt_count_sub+0xa3/0xf0 [<ffffffff8408f090>] ? rescuer_thread+0x320/0x320 [<ffffffff84093cd7>] kthread+0x107/0x120 [<ffffffff84093bd0>] ? kthread_create_on_node+0x1b0/0x1b0 [<ffffffff8469d31f>] ret_from_fork+0x3f/0x70 [<ffffffff84093bd0>] ? kthread_create_on_node+0x1b0/0x1b0 Tested: Using a debug kernel while 'ss -E' is running: ip netns add test-ns ip netns delete test-ns Fixes: eb4cb008529c sock_diag: define destruction multicast groups Fixes: 26abe14379f8 net: Modify sk_alloc to not reference count the netns of kernel sockets. Reported-by: Dave Jones <davej@codemonkey.org.uk> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-29sctp: Fix race between OOTB responce and route removalAlexander Sverdlin1-1/+3
There is NULL pointer dereference possible during statistics update if the route used for OOTB responce is removed at unfortunate time. If the route exists when we receive OOTB packet and we finally jump into sctp_packet_transmit() to send ABORT, but in the meantime route is removed under our feet, we take "no_route" path and try to update stats with IP_INC_STATS(sock_net(asoc->base.sk), ...). But sctp_ootb_pkt_new() used to prepare responce packet doesn't call sctp_transport_set_owner() and therefore there is no asoc associated with this packet. Probably temporary asoc just for OOTB responces is overkill, so just introduce a check like in all other places in sctp_packet_transmit(), where "asoc" is dereferenced. To reproduce this, one needs to 0. ensure that sctp module is loaded (otherwise ABORT is not generated) 1. remove default route on the machine 2. while true; do ip route del [interface-specific route] ip route add [interface-specific route] done 3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT responce On x86_64 the crash looks like this: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp] PGD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: ... CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.0.5-1-ARCH #1 Hardware name: ... task: ffffffff818124c0 ti: ffffffff81800000 task.ti: ffffffff81800000 RIP: 0010:[<ffffffffa05ec9ac>] [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp] RSP: 0018:ffff880127c037b8 EFLAGS: 00010296 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000015ff66b480 RDX: 00000015ff66b400 RSI: ffff880127c17200 RDI: ffff880123403700 RBP: ffff880127c03888 R08: 0000000000017200 R09: ffffffff814625af R10: ffffea00047e4680 R11: 00000000ffffff80 R12: ffff8800b0d38a28 R13: ffff8800b0d38a28 R14: ffff8800b3e88000 R15: ffffffffa05f24e0 FS: 0000000000000000(0000) GS:ffff880127c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000020 CR3: 00000000c855b000 CR4: 00000000000007f0 Stack: ffff880127c03910 ffff8800b0d38a28 ffffffff8189d240 ffff88011f91b400 ffff880127c03828 ffffffffa05c94c5 0000000000000000 ffff8800baa1c520 0000000000000000 0000000000000001 0000000000000000 0000000000000000 Call Trace: <IRQ> [<ffffffffa05c94c5>] ? sctp_sf_tabort_8_4_8.isra.20+0x85/0x140 [sctp] [<ffffffffa05d6b42>] ? sctp_transport_put+0x52/0x80 [sctp] [<ffffffffa05d0bfc>] sctp_do_sm+0xb8c/0x19a0 [sctp] [<ffffffff810b0e00>] ? trigger_load_balance+0x90/0x210 [<ffffffff810e0329>] ? update_process_times+0x59/0x60 [<ffffffff812c7a40>] ? timerqueue_add+0x60/0xb0 [<ffffffff810e0549>] ? enqueue_hrtimer+0x29/0xa0 [<ffffffff8101f599>] ? read_tsc+0x9/0x10 [<ffffffff8116d4b5>] ? put_page+0x55/0x60 [<ffffffff810ee1ad>] ? clockevents_program_event+0x6d/0x100 [<ffffffff81462b68>] ? skb_free_head+0x58/0x80 [<ffffffffa029a10b>] ? chksum_update+0x1b/0x27 [crc32c_generic] [<ffffffff81283f3e>] ? crypto_shash_update+0xce/0xf0 [<ffffffffa05d3993>] sctp_endpoint_bh_rcv+0x113/0x280 [sctp] [<ffffffffa05dd4e6>] sctp_inq_push+0x46/0x60 [sctp] [<ffffffffa05ed7a0>] sctp_rcv+0x880/0x910 [sctp] [<ffffffffa05ecb50>] ? sctp_packet_transmit_chunk+0xb0/0xb0 [sctp] [<ffffffffa05ecb70>] ? sctp_csum_update+0x20/0x20 [sctp] [<ffffffff814b05a5>] ? ip_route_input_noref+0x235/0xd30 [<ffffffff81051d6b>] ? ack_ioapic_level+0x7b/0x150 [<ffffffff814b27be>] ip_local_deliver_finish+0xae/0x210 [<ffffffff814b2e15>] ip_local_deliver+0x35/0x90 [<ffffffff814b2a15>] ip_rcv_finish+0xf5/0x370 [<ffffffff814b3128>] ip_rcv+0x2b8/0x3a0 [<ffffffff81474193>] __netif_receive_skb_core+0x763/0xa50 [<ffffffff81476c28>] __netif_receive_skb+0x18/0x60 [<ffffffff81476cb0>] netif_receive_skb_internal+0x40/0xd0 [<ffffffff814776c8>] napi_gro_receive+0xe8/0x120 [<ffffffffa03946aa>] rtl8169_poll+0x2da/0x660 [r8169] [<ffffffff8147896a>] net_rx_action+0x21a/0x360 [<ffffffff81078dc1>] __do_softirq+0xe1/0x2d0 [<ffffffff8107912d>] irq_exit+0xad/0xb0 [<ffffffff8157d158>] do_IRQ+0x58/0xf0 [<ffffffff8157b06d>] common_interrupt+0x6d/0x6d <EOI> [<ffffffff810e1218>] ? hrtimer_start+0x18/0x20 [<ffffffffa05d65f9>] ? sctp_transport_destroy_rcu+0x29/0x30 [sctp] [<ffffffff81020c50>] ? mwait_idle+0x60/0xa0 [<ffffffff810216ef>] arch_cpu_idle+0xf/0x20 [<ffffffff810b731c>] cpu_startup_entry+0x3ec/0x480 [<ffffffff8156b365>] rest_init+0x85/0x90 [<ffffffff818eb035>] start_kernel+0x48b/0x4ac [<ffffffff818ea120>] ? early_idt_handlers+0x120/0x120 [<ffffffff818ea339>] x86_64_start_reservations+0x2a/0x2c [<ffffffff818ea49c>] x86_64_start_kernel+0x161/0x184 Code: 90 48 8b 80 b8 00 00 00 48 89 85 70 ff ff ff 48 83 bd 70 ff ff ff 00 0f 85 cd fa ff ff 48 89 df 31 db e8 18 63 e7 e0 48 8b 45 80 <48> 8b 40 20 48 8b 40 30 48 8b 80 68 01 00 00 65 48 ff 40 78 e9 RIP [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp] RSP <ffff880127c037b8> CR2: 0000000000000020 ---[ end trace 5aec7fd2dc983574 ]--- Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff) drm_kms_helper: panic occurred, switching back to text console ---[ end Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28dsa: fix promiscuity leak on slave dev open errorGilad Ben-Yossef1-1/+1
DSA master netdev promiscuity counter was not being properly decremented on slave device open error path. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> CC: Gilad Ben-Yossef <giladb@ezchip.com> CC: David S. Miller <davem@davemloft.net> CC: Florian Fainelli <f.fainelli@gmail.com> CC: Guenter Roeck <linux@roeck-us.net> CC: Andrew Lunn <andrew@lunn.ch> CC: Scott Feldman <sfeldma@gmail.com> Acked-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28net: Kill sock->sk_protinfoDavid Miller2-7/+0
No more users, so it can now be removed. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28ax25: Stop using sock->sk_protinfo.David Miller2-16/+16
Just make a ax25_sock structure that provides the ax25_cb pointer. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28flow_dissector: Pre-initialize ip_proto in __skb_flow_dissect()Geert Uytterhoeven1-1/+1
net/core/flow_dissector.c: In function ‘__skb_flow_dissect’: net/core/flow_dissector.c:132: warning: ‘ip_proto’ may be used uninitialized in this function Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28ipv4: fix RCU lockdep warning from linkdown changesAndy Gospodarek1-2/+2
The following lockdep splat was seen due to the wrong context for grabbing in_dev. =============================== [ INFO: suspicious RCU usage. ] 4.1.0-next-20150626-dbg-00020-g54a6d91-dirty #244 Not tainted ------------------------------- include/linux/inetdevice.h:205 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ip/403: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81453305>] rtnl_lock+0x17/0x19 #1: ((inetaddr_chain).rwsem){.+.+.+}, at: [<ffffffff8105c327>] __blocking_notifier_call_chain+0x35/0x6a stack backtrace: CPU: 2 PID: 403 Comm: ip Not tainted 4.1.0-next-20150626-dbg-00020-g54a6d91-dirty #244 0000000000000001 ffff8800b189b728 ffffffff8150a542 ffffffff8107a8b3 ffff880037bbea40 ffff8800b189b758 ffffffff8107cb74 ffff8800379dbd00 ffff8800bec85800 ffff8800bf9e13c0 00000000000000ff ffff8800b189b7d8 Call Trace: [<ffffffff8150a542>] dump_stack+0x4c/0x6e [<ffffffff8107a8b3>] ? up+0x39/0x3e [<ffffffff8107cb74>] lockdep_rcu_suspicious+0xf7/0x100 [<ffffffff814b63c3>] fib_dump_info+0x227/0x3e2 [<ffffffff814b6624>] rtmsg_fib+0xa6/0x116 [<ffffffff814b978f>] fib_table_insert+0x316/0x355 [<ffffffff814b362e>] fib_magic+0xb7/0xc7 [<ffffffff814b4803>] fib_add_ifaddr+0xb1/0x13b [<ffffffff814b4d09>] fib_inetaddr_event+0x36/0x90 [<ffffffff8105c086>] notifier_call_chain+0x4c/0x71 [<ffffffff8105c340>] __blocking_notifier_call_chain+0x4e/0x6a [<ffffffff8105c370>] blocking_notifier_call_chain+0x14/0x16 [<ffffffff814a7f50>] __inet_insert_ifa+0x1a5/0x1b3 [<ffffffff814a894d>] inet_rtm_newaddr+0x350/0x35f [<ffffffff81457866>] rtnetlink_rcv_msg+0x17b/0x18a [<ffffffff8107e7c3>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8146965f>] ? netlink_deliver_tap+0x1cb/0x1f7 [<ffffffff814576eb>] ? rtnl_newlink+0x72a/0x72a ... This patch resolves that splat. Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-28tipc: purge backlog queue counters when broadcast link is resetJon Paul Maloy3-1/+7
In commit 1f66d161ab3d8b518903fa6c3f9c1f48d6919e74 ("tipc: introduce starvation free send algorithm") we introduced a counter per priority level for buffers in the link backlog queue. We also introduced a new function tipc_link_purge_backlog(), to reset these counters to zero when the link is reset. Unfortunately, we missed to call this function when the broadcast link is reset, with the result that the values of these counters might be permanently skewed when new nodes are attached. This may in the worst case lead to permananent, but spurious, broadcast link congestion, where no broadcast packets can be sent at all. We fix this bug with this commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-27Merge branch 'for-4.2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds13-198/+129
Pull nfsd updates from Bruce Fields: "A relatively quiet cycle, with a mix of cleanup and smaller bugfixes" * 'for-4.2' of git://linux-nfs.org/~bfields/linux: (24 commits) sunrpc: use sg_init_one() in krb5_rc4_setup_enc/seq_key() nfsd: wrap too long lines in nfsd4_encode_read nfsd: fput rd_file from XDR encode context nfsd: take struct file setup fully into nfs4_preprocess_stateid_op nfsd: refactor nfs4_preprocess_stateid_op nfsd: clean up raparams handling nfsd: use swap() in sort_pacl_range() rpcrdma: Merge svcrdma and xprtrdma modules into one svcrdma: Add a separate "max data segs macro for svcrdma svcrdma: Replace GFP_KERNEL in a loop with GFP_NOFAIL svcrdma: Keep rpcrdma_msg fields in network byte-order svcrdma: Fix byte-swapping in svc_rdma_sendto.c nfsd: Update callback sequnce id only CB_SEQUENCE success nfsd: Reset cb_status in nfsd4_cb_prepare() at retrying svcrdma: Remove svc_rdma_xdr_decode_deferred_req() SUNRPC: Move EXPORT_SYMBOL for svc_process uapi/nfs: Add NFSv4.1 ACL definitions nfsd: Remove dead declarations nfsd: work around a gcc-5.1 warning nfsd: Checking for acl support does not require fetching any acls ...
2015-06-25net: sched: flower fix typoJamal Hadi Salim1-2/+2
Fix typo in the validation rules for flower's attributes Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds390-6682/+14698
Pull networking updates from David Miller: 1) Add TX fast path in mac80211, from Johannes Berg. 2) Add TSO/GRO support to ibmveth, from Thomas Falcon 3) Move away from cached routes in ipv6, just like ipv4, from Martin KaFai Lau. 4) Lots of new rhashtable tests, from Thomas Graf. 5) Run ingress qdisc lockless, from Alexei Starovoitov. 6) Allow servers to fetch TCP packet headers for SYN packets of new connections, for fingerprinting. From Eric Dumazet. 7) Add mode parameter to pktgen, for testing receive. From Alexei Starovoitov. 8) Cache access optimizations via simplifications of build_skb(), from Alexander Duyck. 9) Move page frag allocator under mm/, also from Alexander. 10) Add xmit_more support to hv_netvsc, from KY Srinivasan. 11) Add a counter guard in case we try to perform endless reclassify loops in the packet scheduler. 12) Extern flow dissector to be programmable and use it in new "Flower" classifier. From Jiri Pirko. 13) AF_PACKET fanout rollover fixes, performance improvements, and new statistics. From Willem de Bruijn. 14) Add netdev driver for GENEVE tunnels, from John W Linville. 15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso. 16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet. 17) Add an ECN retry fallback for the initial TCP handshake, from Daniel Borkmann. 18) Add tail call support to BPF, from Alexei Starovoitov. 19) Add several pktgen helper scripts, from Jesper Dangaard Brouer. 20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa. 21) Favor even port numbers for allocation to connect() requests, and odd port numbers for bind(0), in an effort to help avoid ip_local_port_range exhaustion. From Eric Dumazet. 22) Add Cavium ThunderX driver, from Sunil Goutham. 23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata, from Alexei Starovoitov. 24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai. 25) Double TCP Small Queues default to 256K to accomodate situations like the XEN driver and wireless aggregation. From Wei Liu. 26) Add more entropy inputs to flow dissector, from Tom Herbert. 27) Add CDG congestion control algorithm to TCP, from Kenneth Klette Jonassen. 28) Convert ipset over to RCU locking, from Jozsef Kadlecsik. 29) Track and act upon link status of ipv4 route nexthops, from Andy Gospodarek. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits) bridge: vlan: flush the dynamically learned entries on port vlan delete bridge: multicast: add a comment to br_port_state_selection about blocking state net: inet_diag: export IPV6_V6ONLY sockopt stmmac: troubleshoot unexpected bits in des0 & des1 net: ipv4 sysctl option to ignore routes when nexthop link is down net: track link-status of ipv4 nexthops net: switchdev: ignore unsupported bridge flags net: Cavium: Fix MAC address setting in shutdown state drivers: net: xgene: fix for ACPI support without ACPI ip: report the original address of ICMP messages net/mlx5e: Prefetch skb data on RX net/mlx5e: Pop cq outside mlx5e_get_cqe net/mlx5e: Remove mlx5e_cq.sqrq back-pointer net/mlx5e: Remove extra spaces net/mlx5e: Avoid TX CQE generation if more xmit packets expected net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq() net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device ...
2015-06-24bridge: vlan: flush the dynamically learned entries on port vlan deleteNikolay Aleksandrov6-7/+11
Add a new argument to br_fdb_delete_by_port which allows to specify a vid to match when flushing entries and use it in nbp_vlan_delete() to flush the dynamically learned entries of the vlan/port pair when removing a vlan from a port. Before this patch only the local mac was being removed and the dynamically learned ones were left to expire. Note that the do_all argument is still respected and if specified, the vid will be ignored. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24bridge: multicast: add a comment to br_port_state_selection about blocking stateNikolay Aleksandrov1-0/+4
Add a comment to explain why we're not disabling port's multicast when it goes in blocking state. Since there's a check in the timer's function which bypasses the timer if the port's in blocking/disabled state, the timer will simply expire and stop without sending more queries. Suggested-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller14-42/+100
Conflicts: drivers/net/ethernet/mellanox/mlx4/main.c net/packet/af_packet.c Both conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24net: inet_diag: export IPV6_V6ONLY sockoptPhil Sutter1-0/+4
For AF_INET6 sockets, the value of struct ipv6_pinfo.ipv6only is exported to userspace. It indicates whether a socket bound to in6addr_any listens on IPv4 as well as IPv6. Since the socket is natively IPv6, it is not listed by e.g. 'ss -l -4'. This patch is accompanied by an appropriate one for iproute2 to enable the additional information in 'ss -e'. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24net: ipv4 sysctl option to ignore routes when nexthop link is downAndy Gospodarek7-16/+48
This feature is only enabled with the new per-interface or ipv4 global sysctls called 'ignore_routes_with_linkdown'. net.ipv4.conf.all.ignore_routes_with_linkdown = 0 net.ipv4.conf.default.ignore_routes_with_linkdown = 0 net.ipv4.conf.lo.ignore_routes_with_linkdown = 0 ... When the above sysctls are set, will report to userspace that a route is dead and will no longer resolve to this nexthop when performing a fib lookup. This will signal to userspace that the route will not be selected. The signalling of a RTNH_F_DEAD is only passed to userspace if the sysctl is enabled and link is down. This was done as without it the netlink listeners would have no idea whether or not a nexthop would be selected. The kernel only sets RTNH_F_DEAD internally if the interface has IFF_UP cleared. With the new sysctl set, the following behavior can be observed (interface p8p1 is link-down): default via 10.0.5.2 dev p9p1 10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 dead linkdown 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 dead linkdown 90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2 90.0.0.1 via 70.0.0.2 dev p7p1 src 70.0.0.1 cache local 80.0.0.1 dev lo src 80.0.0.1 cache <local> 80.0.0.2 via 10.0.5.2 dev p9p1 src 10.0.5.15 cache While the route does remain in the table (so it can be modified if needed rather than being wiped away as it would be if IFF_UP was cleared), the proper next-hop is chosen automatically when the link is down. Now interface p8p1 is linked-up: default via 10.0.5.2 dev p9p1 10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2 192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2 90.0.0.1 via 80.0.0.2 dev p8p1 src 80.0.0.1 cache local 80.0.0.1 dev lo src 80.0.0.1 cache <local> 80.0.0.2 dev p8p1 src 80.0.0.1 cache and the output changes to what one would expect. If the sysctl is not set, the following output would be expected when p8p1 is down: default via 10.0.5.2 dev p9p1 10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 linkdown 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 linkdown 90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2 Since the dead flag does not appear, there should be no expectation that the kernel would skip using this route due to link being down. v2: Split kernel changes into 2 patches, this actually makes a behavioral change if the sysctl is set. Also took suggestion from Alex to simplify code by only checking sysctl during fib lookup and suggestion from Scott to add a per-interface sysctl. v3: Code clean-ups to make it more readable and efficient as well as a reverse path check fix. v4: Drop binary sysctl v5: Whitespace fixups from Dave v6: Style changes from Dave and checkpatch suggestions v7: One more checkpatch fixup Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24net: track link-status of ipv4 nexthopsAndy Gospodarek2-21/+62
Add a fib flag called RTNH_F_LINKDOWN to any ipv4 nexthops that are reachable via an interface where carrier is off. No action is taken, but additional flags are passed to userspace to indicate carrier status. This also includes a cleanup to fib_disable_ip to more clearly indicate what event made the function call to replace the more cryptic force option previously used. v2: Split out kernel functionality into 2 patches, this patch simply sets and clears new nexthop flag RTNH_F_LINKDOWN. v3: Cleanups suggested by Alex as well as a bug noticed in fib_sync_down_dev and fib_sync_up when multipath was not enabled. v5: Whitespace and variable declaration fixups suggested by Dave. v6: Style fixups noticed by Dave; ran checkpatch to be sure I got them all. Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24net: switchdev: ignore unsupported bridge flagsVivien Didelot1-1/+1
switchdev_port_bridge_getlink() queries SWITCHDEV_ATTR_PORT_BRIDGE_FLAGS attributes, but a driver doesn't need to implement this in order to get bridge link information. So error out only on errors different than -EOPNOTSUPP. (This is a follow-up patch for 7d4f8d8.) Fixes: 8793d0a664a8 ("switchdev: add new switchdev_port_bridge_getlink") Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24ip: report the original address of ICMP messagesJulian Anastasov2-2/+21
ICMP messages can trigger ICMP and local errors. In this case serr->port is 0 and starting from Linux 4.0 we do not return the original target address to the error queue readers. Add function to define which errors provide addr_offset. With this fix my ping command is not silent anymore. Fixes: c247f0534cc5 ("ip: fix error queue empty skb handling") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23Merge tag 'for-linus' of ↵Linus Torvalds14-266/+102
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: - a large cleanup of how device capabilities are checked for various features - additional cleanups in the MAD processing - update to the srp driver - creation and use of centralized log message helpers - add const to a number of args to calls and clean up call chain - add support for extended cq create verb - add support for timestamps on cq completion - add support for processing OPA MAD packets * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (92 commits) IB/mad: Add final OPA MAD processing IB/mad: Add partial Intel OPA MAD support IB/mad: Add partial Intel OPA MAD support IB/core: Add OPA MAD core capability flag IB/mad: Add support for additional MAD info to/from drivers IB/mad: Convert allocations from kmem_cache to kzalloc IB/core: Add ability for drivers to report an alternate MAD size. IB/mad: Support alternate Base Versions when creating MADs IB/mad: Create a generic helper for DR forwarding checks IB/mad: Create a generic helper for DR SMP Recv processing IB/mad: Create a generic helper for DR SMP Send processing IB/mad: Split IB SMI handling from MAD Recv handler IB/mad cleanup: Generalize processing of MAD data IB/mad cleanup: Clean up function params -- find_mad_agent IB/mlx4: Add support for CQ time-stamping IB/mlx4: Add mmap call to map the hardware clock IB/core: Pass hardware specific data in query_device IB/core: Add timestamp_mask and hca_core_clock to query_device IB/core: Extend ib_uverbs_create_cq IB/core: Add CQ creation time-stamping flag ...
2015-06-23Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "As usual, mostly comment, kerneldoc and printk() fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: lpfc: Grammar s/an negative/a negative/ ARM: lib/lib1funcs.S: fix typo s/substractions/subtractions/ cx25821: cx25821-medusa-reg.h: fix 0x0x prefix lib: crc-itu-t.[ch] fix 0x0x prefix in integer constants rapidio: Fix kerneldoc and comment qla4xxx: Fix printk() in qla4_83xx_read_reset_template() and qla4_83xx_pre_loopback_config() treewide: Kconfig: fix wording / spelling usb/serial: fix grammar in Kconfig help text for FTDI_SIO megaraid_sas: fix kerneldoc netfilter: ebtables: fix comment grammar drm/radeon: fix comment isdn: fix grammar in comment ARM: KVM: fix comment
2015-06-23switchdev: change BUG_ON to WARN for attr set failure caseScott Feldman1-1/+2
This particular BUG_ON condition was checking for attr set err in the COMMIT phase, which isn't expected (it's a driver bug if PREPARE phase is OK but COMMIT fails). But BUG_ON() is too strong for this case, so change to WARN(). BUG_ON() would be warranted if the system was corrupted beyond repair, but this is not the case here. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23switchdev; add VLAN support for port's bridge_getlinkScott Feldman2-4/+137
One more missing piece of the puzzle. Add vlan dump support to switchdev port's bridge_getlink. iproute2 "bridge vlan show" cmd already knows how to show the vlans installed on the bridge and the device , but (until now) no one implemented the port vlan part of the netlink PF_BRIDGE:RTM_GETLINK msg. Before this patch, "bridge vlan show": $ bridge -c vlan show port vlan ids sw1p1 30-34 << bridge side vlans 57 sw1p1 << device side vlans (missing) sw1p2 57 sw1p2 sw1p3 sw1p4 br0 None (When the port is bridged, the output repeats the vlan list for the vlans on the bridge side of the port and the vlans on the device side of the port. The listing above show no vlans for the device side even though they are installed). After this patch: $ bridge -c vlan show port vlan ids sw1p1 30-34 << bridge side vlan 57 sw1p1 30-34 << device side vlans 57 3840 PVID sw1p2 57 sw1p2 57 3840 PVID sw1p3 3842 PVID sw1p4 3843 PVID br0 None I re-used ndo_dflt_bridge_getlink to add vlan fill call-back func. switchdev support adds an obj dump for VLAN objects, using the same call-back scheme as FDB dump. Support included for both compressed and un-compressed vlan dumps. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23switchdev: rename vlan vid_start to vid_beginScott Feldman2-8/+8
Use vid_begin/end to be consistent with BRIDGE_VLAN_INFO_RANGE_BEGIN/END. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23packet: remove handling of tx_ringManinder Singh1-9/+5
Remove handling of tx_ring in prb_setup_retire_blk_timer for TPACKET_V3 because init_prb_bdqc is called only for zero tx_ring and thus prb_setup_retire_blk_timer for zero tx_ring only. And also in functon init_prb_bdqc there is no usage of tx_ring. Thus removing tx_ring from init_prb_bdqc. Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Suggested-by: Frans Klaver <fransklaver@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23Merge tag 'linux-can-fixes-for-4.1-20150621' of ↵David S. Miller1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== Oliver Hartkopp fixed a bug in the generic CAN frame handling code, which may lead to loss of CAN frames. It was introduced during v4.1 development. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> gpg: Signature made Sun 21 Jun 2015 09:59:36 AM PDT using RSA key ID C9B5CFC7
2015-06-23netfilter: nf_qeueue: Drop queue entries on nf_unregister_hookEric W. Biederman4-1/+42
Add code to nf_unregister_hook to flush the nf_queue when a hook is unregistered. This guarantees that the pointer that the nf_queue code retains into the nf_hook list will remain valid while a packet is queued. I tested what would happen if we do not flush queued packets and was trivially able to obtain the oops below. All that was required was to stop the nf_queue listening process, to delete all of the nf_tables, and to awaken the nf_queue listening process. > BUG: unable to handle kernel paging request at 0000000100000001 > IP: [<0000000100000001>] 0x100000001 > PGD b9c35067 PUD 0 > Oops: 0010 [#1] SMP > Modules linked in: > CPU: 0 PID: 519 Comm: lt-nfqnl_test Not tainted > task: ffff8800b9c8c050 ti: ffff8800ba9d8000 task.ti: ffff8800ba9d8000 > RIP: 0010:[<0000000100000001>] [<0000000100000001>] 0x100000001 > RSP: 0018:ffff8800ba9dba40 EFLAGS: 00010a16 > RAX: ffff8800bab48a00 RBX: ffff8800ba9dba90 RCX: ffff8800ba9dba90 > RDX: ffff8800b9c10128 RSI: ffff8800ba940900 RDI: ffff8800bab48a00 > RBP: ffff8800b9c10128 R08: ffffffff82976660 R09: ffff8800ba9dbb28 > R10: dead000000100100 R11: dead000000200200 R12: ffff8800ba940900 > R13: ffffffff8313fd50 R14: ffff8800b9c95200 R15: 0000000000000000 > FS: 00007fb91fc34700(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000100000001 CR3: 00000000babfb000 CR4: 00000000000007f0 > Stack: > ffffffff8206ab0f ffffffff82982240 ffff8800bab48a00 ffff8800b9c100a8 > ffff8800b9c10100 0000000000000001 ffff8800ba940900 ffff8800b9c10128 > ffffffff8206bd65 ffff8800bfb0d5e0 ffff8800bab48a00 0000000000014dc0 > Call Trace: > [<ffffffff8206ab0f>] ? nf_iterate+0x4f/0xa0 > [<ffffffff8206bd65>] ? nf_reinject+0x125/0x190 > [<ffffffff8206dee5>] ? nfqnl_recv_verdict+0x255/0x360 > [<ffffffff81386290>] ? nla_parse+0x80/0xf0 > [<ffffffff8206c42c>] ? nfnetlink_rcv_msg+0x13c/0x240 > [<ffffffff811b2fec>] ? __memcg_kmem_get_cache+0x4c/0x150 > [<ffffffff8206c2f0>] ? nfnl_lock+0x20/0x20 > [<ffffffff82068159>] ? netlink_rcv_skb+0xa9/0xc0 > [<ffffffff820677bf>] ? netlink_unicast+0x12f/0x1c0 > [<ffffffff82067ade>] ? netlink_sendmsg+0x28e/0x650 > [<ffffffff81fdd814>] ? sock_sendmsg+0x44/0x50 > [<ffffffff81fde07b>] ? ___sys_sendmsg+0x2ab/0x2c0 > [<ffffffff810e8f73>] ? __wake_up+0x43/0x70 > [<ffffffff8141a134>] ? tty_write+0x1c4/0x2a0 > [<ffffffff81fde9f4>] ? __sys_sendmsg+0x44/0x80 > [<ffffffff823ff8d7>] ? system_call_fastpath+0x12/0x6a > Code: Bad RIP value. > RIP [<0000000100000001>] 0x100000001 > RSP <ffff8800ba9dba40> > CR2: 0000000100000001 > ---[ end trace 08eb65d42362793f ]--- Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23netfilter: nftables: Do not run chains in the wrong network namespaceEric W. Biederman1-1/+6
Currenlty nf_tables chains added in one network namespace are being run in all network namespace. The issues are myriad with the simplest being an unprivileged user can cause any network packets to be dropped. Address this by simply not running nf_tables chains in the wrong network namespace. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23bridge: multicast: restore router configuration on port link down/upSatish Ashok1-0/+4
When a port goes through a link down/up the multicast router configuration is not restored. Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries") Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23bridge: multicast: start querier timer when running user-space stpNikolay Aleksandrov1-1/+2
When STP is running in user-space and querier is configured, the querier timer is not started when a port goes to a non-blocking state. This patch unifies the user- and kernel-space stp multicast port enable path and enables it in all states different from blocking. Note that when a port goes in BR_STATE_DISABLED it's not enabled because that is handled in the beginning of the port list loop. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23NET: ROSE: Don't dereference NULL neighbour pointer.Ralf Baechle1-1/+2
A ROSE socket doesn't necessarily always have a neighbour pointer so check if the neighbour pointer is valid before dereferencing it. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Tested-by: Bernard Pidoux <f6bvp@free.fr> Cc: stable@vger.kernel.org #2.6.11+ Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23Merge tag 'nfc-next-4.2-2' of ↵David S. Miller1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next NFC 4.2 2nd pull request This one only contains a one liner fix for a typo that I introduced while cleaning some of the nfcmrvl patches that were part of the 1st 4.2 pull request.
2015-06-23Merge branch 'for-upstream' of ↵David S. Miller22-330/+1301
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-06-18 Here's the final bluetooth-next pull request for 4.2. - Cleanups & fixes to 802.15.4 code and related drivers - Fix btusb driver memory leak - New USB IDs for Atheros controllers - Support for BCM4324B3 UART based Broadcom controller - Fix for Bluetooth encryption key size handling - Broadcom controller initialization fixes - Support for Intel controller DDC parameters - Support for multiple Bluetooth LE advertising instances - Fix for HCI user channel cleanup path Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23tcp: Do not call tcp_fastopen_reset_cipher from interrupt contextChristoph Paasch3-4/+7
tcp_fastopen_reset_cipher really cannot be called from interrupt context. It allocates the tcp_fastopen_context with GFP_KERNEL and calls crypto_alloc_cipher, which allocates all kind of stuff with GFP_KERNEL. Thus, we might sleep when the key-generation is triggered by an incoming TFO cookie-request which would then happen in interrupt- context, as shown by enabling CONFIG_DEBUG_ATOMIC_SLEEP: [ 36.001813] BUG: sleeping function called from invalid context at mm/slub.c:1266 [ 36.003624] in_atomic(): 1, irqs_disabled(): 0, pid: 1016, name: packetdrill [ 36.004859] CPU: 1 PID: 1016 Comm: packetdrill Not tainted 4.1.0-rc7 #14 [ 36.006085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 36.008250] 00000000000004f2 ffff88007f8838a8 ffffffff8171d53a ffff880075a084a8 [ 36.009630] ffff880075a08000 ffff88007f8838c8 ffffffff810967d3 ffff88007f883928 [ 36.011076] 0000000000000000 ffff88007f8838f8 ffffffff81096892 ffff88007f89be00 [ 36.012494] Call Trace: [ 36.012953] <IRQ> [<ffffffff8171d53a>] dump_stack+0x4f/0x6d [ 36.014085] [<ffffffff810967d3>] ___might_sleep+0x103/0x170 [ 36.015117] [<ffffffff81096892>] __might_sleep+0x52/0x90 [ 36.016117] [<ffffffff8118e887>] kmem_cache_alloc_trace+0x47/0x190 [ 36.017266] [<ffffffff81680d82>] ? tcp_fastopen_reset_cipher+0x42/0x130 [ 36.018485] [<ffffffff81680d82>] tcp_fastopen_reset_cipher+0x42/0x130 [ 36.019679] [<ffffffff81680f01>] tcp_fastopen_init_key_once+0x61/0x70 [ 36.020884] [<ffffffff81680f2c>] __tcp_fastopen_cookie_gen+0x1c/0x60 [ 36.022058] [<ffffffff816814ff>] tcp_try_fastopen+0x58f/0x730 [ 36.023118] [<ffffffff81671788>] tcp_conn_request+0x3e8/0x7b0 [ 36.024185] [<ffffffff810e3872>] ? __module_text_address+0x12/0x60 [ 36.025327] [<ffffffff8167b2e1>] tcp_v4_conn_request+0x51/0x60 [ 36.026410] [<ffffffff816727e0>] tcp_rcv_state_process+0x190/0xda0 [ 36.027556] [<ffffffff81661f97>] ? __inet_lookup_established+0x47/0x170 [ 36.028784] [<ffffffff8167c2ad>] tcp_v4_do_rcv+0x16d/0x3d0 [ 36.029832] [<ffffffff812e6806>] ? security_sock_rcv_skb+0x16/0x20 [ 36.030936] [<ffffffff8167cc8a>] tcp_v4_rcv+0x77a/0x7b0 [ 36.031875] [<ffffffff816af8c3>] ? iptable_filter_hook+0x33/0x70 [ 36.032953] [<ffffffff81657d22>] ip_local_deliver_finish+0x92/0x1f0 [ 36.034065] [<ffffffff81657f1a>] ip_local_deliver+0x9a/0xb0 [ 36.035069] [<ffffffff81657c90>] ? ip_rcv+0x3d0/0x3d0 [ 36.035963] [<ffffffff81657569>] ip_rcv_finish+0x119/0x330 [ 36.036950] [<ffffffff81657ba7>] ip_rcv+0x2e7/0x3d0 [ 36.037847] [<ffffffff81610652>] __netif_receive_skb_core+0x552/0x930 [ 36.038994] [<ffffffff81610a57>] __netif_receive_skb+0x27/0x70 [ 36.040033] [<ffffffff81610b72>] process_backlog+0xd2/0x1f0 [ 36.041025] [<ffffffff81611482>] net_rx_action+0x122/0x310 [ 36.042007] [<ffffffff81076743>] __do_softirq+0x103/0x2f0 [ 36.042978] [<ffffffff81723e3c>] do_softirq_own_stack+0x1c/0x30 This patch moves the call to tcp_fastopen_init_key_once to the places where a listener socket creates its TFO-state, which always happens in user-context (either from the setsockopt, or implicitly during the listen()-call) Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Fixes: 222e83d2e0ae ("tcp: switch tcp_fastopen key generation to net_get_random_once") Signed-off-by: Christoph Paasch <cpaasch@apple.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23inet_diag: Remove _bh suffix in inet_diag_dump_reqs().Hiroaki SHIMODA1-2/+2
inet_diag_dump_reqs() is called from inet_diag_dump_icsk() with BH disabled. So no need to disable BH in inet_diag_dump_reqs(). Signed-off-by: Hiroaki Shimoda <shimoda.hiroaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23switchdev: fdb filter_dev is always NULL for self (device), so remove checkScott Feldman1-6/+0
Remove the filter_dev check when dumping fdb entries, otherwise dump returns empty list. filter_dev is always passed as NULL when dumping fdbs on SELF. We want the fdbs installed on the device to be listed in the dump. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Fixes: 45d4122c ("switchdev: add support for fdb add/del/dump via switchdev_port_obj ops") Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23module: add per-module param_lockDan Streetman1-2/+2
Add a "param_lock" mutex to each module, and update params.c to use the correct built-in or module mutex while locking kernel params. Remove the kparam_block_sysfs_r/w() macros, replace them with direct calls to kernel_param_[un]lock(module). The kernel param code currently uses a single mutex to protect modification of any and all kernel params. While this generally works, there is one specific problem with it; a module callback function cannot safely load another module, i.e. with request_module() or even with indirect calls such as crypto_has_alg(). If the module to be loaded has any of its params configured (e.g. with a /etc/modprobe.d/* config file), then the attempt will result in a deadlock between the first module param callback waiting for modprobe, and modprobe trying to lock the single kernel param mutex to set the new module's param. This fixes that by using per-module mutexes, so that each individual module is protected against concurrent changes in its own kernel params, but is not blocked by changes to other module params. All built-in modules continue to use the built-in mutex, since they will always be loaded at runtime and references (e.g. request_module(), crypto_has_alg()) to them will never cause load-time param changing. This also simplifies the interface used by modules to block sysfs access to their params; while there are currently functions to block and unblock sysfs param access which are split up by read and write and expect a single kernel param to be passed, their actual operation is identical and applies to all params, not just the one passed to them; they simply lock and unlock the global param mutex. They are replaced with direct calls to kernel_param_[un]lock(THIS_MODULE), which locks THIS_MODULE's param_lock, or if the module is built-in, it locks the built-in mutex. Suggested-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-06-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds9-239/+356
Pull crypto update from Herbert Xu: "Here is the crypto update for 4.2: API: - Convert RNG interface to new style. - New AEAD interface with one SG list for AD and plain/cipher text. All external AEAD users have been converted. - New asymmetric key interface (akcipher). Algorithms: - Chacha20, Poly1305 and RFC7539 support. - New RSA implementation. - Jitter RNG. - DRBG is now seeded with both /dev/random and Jitter RNG. If kernel pool isn't ready then DRBG will be reseeded when it is. - DRBG is now the default crypto API RNG, replacing krng. - 842 compression (previously part of powerpc nx driver). Drivers: - Accelerated SHA-512 for arm64. - New Marvell CESA driver that supports DMA and more algorithms. - Updated powerpc nx 842 support. - Added support for SEC1 hardware to talitos" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (292 commits) crypto: marvell/cesa - remove COMPILE_TEST dependency crypto: algif_aead - Temporarily disable all AEAD algorithms crypto: af_alg - Forbid the use internal algorithms crypto: echainiv - Only hold RNG during initialisation crypto: seqiv - Add compatibility support without RNG crypto: eseqiv - Offer normal cipher functionality without RNG crypto: chainiv - Offer normal cipher functionality without RNG crypto: user - Add CRYPTO_MSG_DELRNG crypto: user - Move cryptouser.h to uapi crypto: rng - Do not free default RNG when it becomes unused crypto: skcipher - Allow givencrypt to be NULL crypto: sahara - propagate the error on clk_disable_unprepare() failure crypto: rsa - fix invalid select for AKCIPHER crypto: picoxcell - Update to the current clk API crypto: nx - Check for bogus firmware properties crypto: marvell/cesa - add DT bindings documentation crypto: marvell/cesa - add support for Kirkwood and Dove SoCs crypto: marvell/cesa - add support for Orion SoCs crypto: marvell/cesa - add allhwsupport module parameter crypto: marvell/cesa - add support for all armada SoCs ...
2015-06-22Merge branch 'timers-core-for-linus' of ↵Linus Torvalds2-6/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "A rather largish update for everything time and timer related: - Cache footprint optimizations for both hrtimers and timer wheel - Lower the NOHZ impact on systems which have NOHZ or timer migration disabled at runtime. - Optimize run time overhead of hrtimer interrupt by making the clock offset updates smarter - hrtimer cleanups and removal of restrictions to tackle some problems in sched/perf - Some more leap second tweaks - Another round of changes addressing the 2038 problem - First step to change the internals of clock event devices by introducing the necessary infrastructure - Allow constant folding for usecs/msecs_to_jiffies() - The usual pile of clockevent/clocksource driver updates The hrtimer changes contain updates to sched, perf and x86 as they depend on them plus changes all over the tree to cleanup API changes and redundant code, which got copied all over the place. The y2038 changes touch s390 to remove the last non 2038 safe code related to boot/persistant clock" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits) clocksource: Increase dependencies of timer-stm32 to limit build wreckage timer: Minimize nohz off overhead timer: Reduce timer migration overhead if disabled timer: Stats: Simplify the flags handling timer: Replace timer base by a cpu index timer: Use hlist for the timer wheel hash buckets timer: Remove FIFO "guarantee" timers: Sanitize catchup_timer_jiffies() usage hrtimer: Allow hrtimer::function() to free the timer seqcount: Introduce raw_write_seqcount_barrier() seqcount: Rename write_seqcount_barrier() hrtimer: Fix hrtimer_is_queued() hole hrtimer: Remove HRTIMER_STATE_MIGRATE selftest: Timers: Avoid signal deadlock in leap-a-day timekeeping: Copy the shadow-timekeeper over the real timekeeper last clockevents: Check state instead of mode in suspend/resume path selftests: timers: Add leap-second timer edge testing to leap-a-day.c ntp: Do leapsecond adjustment in adjtimex read path time: Prevent early expiry of hrtimers[CLOCK_REALTIME] at the leap second edge ntp: Introduce and use SECS_PER_DAY macro instead of 86400 ...
2015-06-22sunrpc: use sg_init_one() in krb5_rc4_setup_enc/seq_key()Fabian Frederick1-6/+2
Don't opencode sg_init_one() Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-21packet: avoid out of bounds read in round robin fanoutWillem de Bruijn1-16/+2
PACKET_FANOUT_LB computes f->rr_cur such that it is modulo f->num_members. It returns the old value unconditionally, but f->num_members may have changed since the last store. Ensure that the return value is always < num. When modifying the logic, simplify it further by replacing the loop with an unconditional atomic increment. Fixes: dc99f600698d ("packet: Add fanout support.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-21ipv4: include NLM_F_APPEND flag in append route notificationsRoopa Prabhu1-2/+5
This patch adds NLM_F_APPEND flag to struct nlmsg_hdr->nlmsg_flags in newroute notifications if the route add was an append. (This is similar to how NLM_F_REPLACE is already part of new route replace notifications today) This helps userspace determine if the route add operation was an append. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-21netlink: add API to retrieve all group membershipsDavid Herrmann1-0/+22
This patch adds getsockopt(SOL_NETLINK, NETLINK_LIST_MEMBERSHIPS) to retrieve all groups a socket is a member of. Currently, we have to use getsockname() and look at the nl.nl_groups bitmask. However, this mask is limited to 32 groups. Hence, similar to NETLINK_ADD_MEMBERSHIP and NETLINK_DROP_MEMBERSHIP, this adds a separate sockopt to manager higher groups IDs than 32. This new NETLINK_LIST_MEMBERSHIPS option takes a pointer to __u32 and the size of the array. The array is filled with the full membership-set of the socket, and the required array size is returned in optlen. Hence, user-space can retry with a properly sized array in case it was too small. Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-21sock_diag: fetch source port from inet_sockCraig Gallek1-0/+2
When an inet_sock is destroyed, its source port (sk_num) is set to zero as part of the unhash procedure. In order to supply a source port as part of the NETLINK_SOCK_DIAG socket destruction broadcasts, the source port number must be read from inet_sport instead. Tested: ss -E Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-21mac80211: fix locking in update_vlan_tailroom_need_count()Johannes Berg1-3/+10
Unfortunately, Michal's change to fix AP_VLAN crypto tailroom caused a locking issue that was reported by lockdep, but only in a few cases - the issue was a classic ABBA deadlock caused by taking the mtx after the key_mtx, where normally they're taken the other way around. As the key mutex protects the field in question (I'm adding a few annotations to make that clear) only the iteration needs to be protected, but we can also iterate the interface list with just RCU protection while holding the key mutex. Fixes: f9dca80b98ca ("mac80211: fix AP_VLAN crypto tailroom calculation") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-21can: fix loss of CAN frames in raw_rcvOliver Hartkopp1-1/+5
As reported by Manfred Schlaegl here http://marc.info/?l=linux-netdev&m=143482089824232&w=2 commit 514ac99c64b "can: fix multiple delivery of a single CAN frame for overlapping CAN filters" requires the skb->tstamp to be set to check for identical CAN skbs. As net timestamping is influenced by several players (netstamp_needed and netdev_tstamp_prequeue) Manfred missed a proper timestamp which leads to CAN frame loss. As skb timestamping became now mandatory for CAN related skbs this patch makes sure that received CAN skbs always have a proper timestamp set. Maybe there's a better solution in the future but this patch fixes the CAN frame loss so far. Reported-by: Manfred Schlaegl <manfred.schlaegl@gmx.at> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-06-21pkt_sched: sch_qfq: remove redundant -if- control statementAndrea Parri1-2/+1
The control !hlist_unhashed() in qfq_destroy_agg() is unnecessary because already performed in hlist_del_init(), so remove it. Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-21neigh: do not modify unlinked entriesJulian Anastasov1-0/+13
The lockless lookups can return entry that is unlinked. Sometimes they get reference before last neigh_cleanup_and_release, sometimes they do not need reference. Later, any modification attempts may result in the following problems: 1. entry is not destroyed immediately because neigh_update can start the timer for dead entry, eg. on change to NUD_REACHABLE state. As result, entry lives for some time but is invisible and out of control. 2. __neigh_event_send can run in parallel with neigh_destroy while refcnt=0 but if timer is started and expired refcnt can reach 0 for second time leading to second neigh_destroy and possible crash. Thanks to Eric Dumazet and Ying Xue for their work and analyze on the __neigh_event_send change. Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour") Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.") Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().") Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>