summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2017-12-15bpf: add test case for ld_abs and helper changing pkt dataDaniel Borkmann2-0/+67
Add a test that i) uses LD_ABS, ii) zeroing R6 before call, iii) calls a helper that triggers reload of cached skb data, iv) uses LD_ABS again. It's added for test_bpf in order to do runtime testing after JITing as well as test_verifier to test that the sequence is allowed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15bpf, sparc: fix usage of wrong reg for load_skb_regs after callDaniel Borkmann1-2/+4
When LD_ABS/IND is used in the program, and we have a BPF helper call that changes packet data (bpf_helper_changes_pkt_data() returns true), then in case of sparc JIT, we try to reload cached skb data from bpf2sparc[BPF_REG_6]. However, there is no such guarantee or assumption that skb sits in R6 at this point, all helpers changing skb data only have a guarantee that skb sits in R1. Therefore, store BPF R1 in L7 temporarily and after procedure call use L7 to reload cached skb data. skb sitting in R6 is only true at the time when LD_ABS/IND is executed. Fixes: 7a12b5031c6b ("sparc64: Add eBPF JIT.") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15bpf: guarantee r1 to be ctx in case of bpf_helper_changes_pkt_dataDaniel Borkmann1-0/+6
Some JITs don't cache skb context on stack in prologue, so when LD_ABS/IND is used and helper calls yield bpf_helper_changes_pkt_data() as true, then they temporarily save/restore skb pointer. However, the assumption that skb always has to be in r1 is a bit of a gamble. Right now it turned out to be true for all helpers listed in bpf_helper_changes_pkt_data(), but lets enforce that from verifier side, so that we make this a guarantee and bail out if the func proto is misconfigured in future helpers. In case of BPF helper calls from cBPF, bpf_helper_changes_pkt_data() is completely unrelevant here (since cBPF is context read-only) and therefore always false. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15bpf, ppc64: do not reload skb pointers in non-skb contextDaniel Borkmann1-2/+4
The assumption of unconditionally reloading skb pointers on BPF helper calls where bpf_helper_changes_pkt_data() holds true is wrong. There can be different contexts where the helper would enforce a reload such as in case of XDP. Here, we do have a struct xdp_buff instead of struct sk_buff as context, thus this will access garbage. JITs only ever need to deal with cached skb pointer reload when ld_abs/ind was seen, therefore guard the reload behind SEEN_SKB. Fixes: 156d0e290e96 ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15bpf, s390x: do not reload skb pointers in non-skb contextDaniel Borkmann1-6/+5
The assumption of unconditionally reloading skb pointers on BPF helper calls where bpf_helper_changes_pkt_data() holds true is wrong. There can be different contexts where the BPF helper would enforce a reload such as in case of XDP. Here, we do have a struct xdp_buff instead of struct sk_buff as context, thus this will access garbage. JITs only ever need to deal with cached skb pointer reload when ld_abs/ind was seen, therefore guard the reload behind SEEN_SKB only. Tested on s390x. Fixes: 9db7f2b81880 ("s390/bpf: recache skb->data/hlen for skb_vlan_push/pop") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15xdp: linearize skb in netif_receive_generic_xdp()Song Liu1-1/+1
In netif_receive_generic_xdp(), it is necessary to linearize all nonlinear skb. However, in current implementation, skb with troom <= 0 are not linearized. This patch fixes this by calling skb_linearize() for all nonlinear skb. Fixes: de8f3a83b0a0 ("bpf: add meta pointer for direct access") Signed-off-by: Song Liu <songliubraving@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller5-19/+23
Daniel Borkmann says: ==================== pull-request: bpf 2017-12-13 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Addition of explicit scheduling points to map alloc/free in order to avoid having to hold the CPU for too long, from Eric. 2) Fixing of a corruption in overlapping perf_event_output calls from different BPF prog types on the same CPU out of different contexts, from Daniel. 3) Fallout fixes for recent correction of broken uapi for BPF_PROG_TYPE_PERF_EVENT. um had a missing asm header that needed to be pulled in from asm-generic and for BPF selftests the asm-generic include did not work, so similar asm include scheme was adapted for that problematic header that perf is having with other header files under tools, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13Merge branch 'mlx4-misc-fixes'David S. Miller4-28/+35
Tariq Toukan says: ==================== mlx4 misc fixes This patchset contains misc bug fixes from the team to the mlx4 Core and Eth drivers. Patch 1 by Eugenia fixes an MTU issue in selftest. Patch 2 by Eran fixes an accounting issue in the resource tracker. Patch 3 by Eran fixes a race condition that causes counter inconsistency. Series generated against net commit: 200809716aed fou: fix some member types in guehdr v2: Patch 2: Add reviewer credit, rephrase commit message. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net/mlx4_en: Fill all counters under one call of stats lockEran Ben Elisha1-26/+31
Before this patch, the stats_lock was acquired twice. In between the locks Driver sent command to gather some more statistics (per priority and counter statistics). If the stats lock was acquired by get statistics NDO in between we would have report out of sync counters. Fix this by collecting all stats from Firmware in advance and then fill the Software structs under one lock. Fixes: 0b131561a7d6 ("net/mlx4_en: Add Flow control statistics display via ethtool") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net/mlx4_core: Fix wrong calculation of free countersEran Ben Elisha1-1/+0
The field res_free indicates the total number of counters which are available for allocation (reserved and unreserved). Fixed a bug where the reserved counters were subtracted from res_free before any allocation was performed. Before this fix, free counters which were not reserved could not be allocated. Fixes: 9de92c60beaa ("net/mlx4_core: Adjust counter grant policy in the resource tracker") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net/mlx4_en: Fix selftest for small MTUsEugenia Emantayev2-1/+4
Set the minimal MTU threshold for running loopback selftest. MTU should be big enough to include packet payload, NET_IP_ALIGN, Ethernet headers and preamble length. Fixes: e7c1c2c46201 ("mlx4_en: Added self diagnostics test implementation") Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net: phy: marvell: avoid configuring fiber page for SGMII-to-CopperRussell King1-0/+4
When in SGMII-to-Copper mode, the fiber page is used for the MAC facing link, and does not require configuration of the fiber auto-negotiation settings. Avoid trying. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13dwc-xlgmac: Add co-maintainerJie Deng1-0/+1
Jose Abreu will join to maintain dwc-xlgmac. He will help with new feature development for this driver. Thanks Jose and welcome on board! Signed-off-by: Jie Deng <jiedeng@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13tcp: refresh tcp_mstamp from timers callbacksEric Dumazet1-0/+2
Only the retransmit timer currently refreshes tcp_mstamp We should do the same for delayed acks and keepalives. Even if RFC 7323 does not request it, this is consistent to what linux did in the past, when TS values were based on jiffies. Fixes: 385e20706fac ("tcp: use tp->tcp_mstamp in output path") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Mike Maloney <maloney@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Mike Maloney <maloney@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13tcp: fix potential underestimation on rcv_rttWei Wang1-4/+6
When ms timestamp is used, current logic uses 1us in tcp_rcv_rtt_update() when the real rcv_rtt is within 1 - 999us. This could cause rcv_rtt underestimation. Fix it by always using a min value of 1ms if ms timestamp is used. Fixes: 645f4c6f2ebd ("tcp: switch rcv_rtt_est and rcvq_space to high resolution timestamps") Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13skge: remove redundunt free_irq under spinlockStephen Hemminger1-1/+0
The code to handle multi-port SKGE boards was freeing IRQ twice. The first one was under lock and might sleep. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net: phy: meson-gxl: make function meson_gxl_read_status staticColin Ian King1-1/+1
The function meson_gxl_read_status is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: symbol 'meson_gxl_read_status' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13of_mdio / mdiobus: ensure mdio devices have fwnode correctly populatedRussell King2-0/+4
Ensure that all mdio devices populate the struct device fwnode pointer as well as the of_node pointer to allow drivers that wish to use fwnode APIs to work. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net: phy: fix resume handlingRussell King3-14/+9
When a PHY has the BMCR_PDOWN bit set, it may decide to ignore writes to other registers, or reset the registers to power-on defaults. Micrel PHYs do this for their interrupt registers. The current structure of phylib tries to enable interrupts before resuming (and releasing) the BMCR_PDOWN bit. This fails, causing Micrel PHYs to stop working after a suspend/resume sequence if they are using interrupts. Fix this by ensuring that the PHY driver resume methods do not take the phydev->lock mutex themselves, but the callers of phy_resume() take that lock. This then allows us to move the call to phy_resume() before we enable interrupts in phy_start(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13ARM: dts: vf610-zii-dev: use XAUI for DSA link portsRussell King1-2/+2
Use XAUI rather than XGMII for DSA link ports, as this is the interface mode that the switches actually use. XAUI is the 4 lane bus with clock per direction, whereas XGMII is a 32 bit bus with clock. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net: dsa: allow XAUI phy interface modeRussell King1-0/+1
XGMII is a 32-bit bus plus two clock signals per direction. XAUI is four serial lanes per direction. The 88e6190 supports XAUI but not XGMII as it doesn't have enough pins. The same is true of 88e6176. Match on PHY_INTERFACE_MODE_XAUI for the XAUI port type, but keep accepting XGMII for backwards compatibility. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13hippi: Fix a Fix a possible sleep-in-atomic bug in rr_closeJia-Ju Bai1-1/+1
The driver may sleep under a spinlock. The function call path is: rr_close (acquire the spinlock) free_irq --> may sleep To fix it, free_irq is moved to the place without holding the spinlock. This bug is found by my static analysis tool(DSAC) and checked by my code review. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller16-38/+170
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The follow patchset contains Netfilter fixes for your net tree, they are: 1) Fix compilation warning in x_tables with clang due to useless redundant reassignment, from Colin Ian King. 2) Add bugtrap to net_exit to catch uninitialized lists, patch from Vasily Averin. 3) Fix out of bounds memory reads in H323 conntrack helper, this comes with an initial patch to remove replace the obscure CHECK_BOUND macro as a dependency. From Eric Sesterhenn. 4) Reduce retransmission timeout when window is 0 in TCP conntrack, from Florian Westphal. 6) ctnetlink clamp timeout to INT_MAX if timeout is too large, otherwise timeout wraps around and it results in killing the entry that is being added immediately. 7) Missing CAP_NET_ADMIN checks in cthelper and xt_osf, due to no netns support. From Kevin Cernekee. 8) Missing maximum number of instructions checks in xt_bpf, patch from Jann Horn. 9) With no CONFIG_PROC_FS ipt_CLUSTERIP compilation breaks, patch from Arnd Bergmann. 10) Missing netlink attribute policy in nftables exthdr, from Florian Westphal. 11) Enable conntrack with IPv6 MASQUERADE rules, as a357b3f80bc8 should have done in first place, from Konstantin Khlebnikov. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net: ethernet: arc: fix error handling in emac_rockchip_probeBranislav Radocaj1-3/+7
If clk_set_rate() fails, we should disable clk before return. Found by Linux Driver Verification project (linuxtesting.org). Changes since v2 [1]: * Merged with latest code changes Changes since v1: Update made thanks to David's review, much appreciated David. * Improved inconsistent failure handling of clock rate setting * For completeness of usecase, added arc_emac_probe error handling Signed-off-by: Branislav Radocaj <branislav@radocaj.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net: qmi_wwan: add Sierra EM7565 1199:9091Sebastian Sjoholm1-0/+1
Sierra Wireless EM7565 is an Qualcomm MDM9x50 based M.2 modem. The USB id is added to qmi_wwan.c to allow QMI communication with the EM7565. Signed-off-by: Sebastian Sjoholm <ssjoholm@mac.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net: igmp: Use correct source address on IGMPv3 reportsKevin Cernekee1-1/+19
Closing a multicast socket after the final IPv4 address is deleted from an interface can generate a membership report that uses the source IP from a different interface. The following test script, run from an isolated netns, reproduces the issue: #!/bin/bash ip link add dummy0 type dummy ip link add dummy1 type dummy ip link set dummy0 up ip link set dummy1 up ip addr add 10.1.1.1/24 dev dummy0 ip addr add 192.168.99.99/24 dev dummy1 tcpdump -U -i dummy0 & socat EXEC:"sleep 2" \ UDP4-DATAGRAM:239.101.1.68:8889,ip-add-membership=239.0.1.68:10.1.1.1 & sleep 1 ip addr del 10.1.1.1/24 dev dummy0 sleep 5 kill %tcpdump RFC 3376 specifies that the report must be sent with a valid IP source address from the destination subnet, or from address 0.0.0.0. Add an extra check to make sure this is the case. Signed-off-by: Kevin Cernekee <cernekee@chromium.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13tipc: eliminate potential memory leakJon Maloy1-1/+1
In the function tipc_sk_mcast_rcv() we call refcount_dec(&skb->users) on received sk_buffers. Since the reference counter might hit zero at this point, we have a potential memory leak. We fix this by replacing refcount_dec() with kfree_skb(). Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net: remove duplicate includesPravin Shedge7-7/+0
These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13ipv4: igmp: guard against silly MTU valuesEric Dumazet4-12/+19
IPv4 stack reacts to changes to small MTU, by disabling itself under RTNL. But there is a window where threads not using RTNL can see a wrong device mtu. This can lead to surprises, in igmp code where it is assumed the mtu is suitable. Fix this by reading device mtu once and checking IPv4 minimal MTU. This patch adds missing IPV4_MIN_MTU define, to not abuse ETH_MIN_MTU anymore. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13ipv6: mcast: better catch silly mtu valuesEric Dumazet1-10/+15
syzkaller reported crashes in IPv6 stack [1] Xin Long found that lo MTU was set to silly values. IPv6 stack reacts to changes to small MTU, by disabling itself under RTNL. But there is a window where threads not using RTNL can see a wrong device mtu. This can lead to surprises, in mld code where it is assumed the mtu is suitable. Fix this by reading device mtu once and checking IPv6 minimal MTU. [1] skbuff: skb_over_panic: text:0000000010b86b8d len:196 put:20 head:000000003b477e60 data:000000000e85441e tail:0xd4 end:0xc0 dev:lo ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:104! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.15.0-rc2-mm1+ #39 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:skb_panic+0x15c/0x1f0 net/core/skbuff.c:100 RSP: 0018:ffff8801db307508 EFLAGS: 00010286 RAX: 0000000000000082 RBX: ffff8801c517e840 RCX: 0000000000000000 RDX: 0000000000000082 RSI: 1ffff1003b660e61 RDI: ffffed003b660e95 RBP: ffff8801db307570 R08: 1ffff1003b660e23 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff85bd4020 R13: ffffffff84754ed2 R14: 0000000000000014 R15: ffff8801c4e26540 FS: 0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000463610 CR3: 00000001c6698000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> skb_over_panic net/core/skbuff.c:109 [inline] skb_put+0x181/0x1c0 net/core/skbuff.c:1694 add_grhead.isra.24+0x42/0x3b0 net/ipv6/mcast.c:1695 add_grec+0xa55/0x1060 net/ipv6/mcast.c:1817 mld_send_cr net/ipv6/mcast.c:1903 [inline] mld_ifc_timer_expire+0x4d2/0x770 net/ipv6/mcast.c:2448 call_timer_fn+0x23b/0x840 kernel/time/timer.c:1320 expire_timers kernel/time/timer.c:1357 [inline] __run_timers+0x7e1/0xb60 kernel/time/timer.c:1660 run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686 __do_softirq+0x29d/0xbb2 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1d3/0x210 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:540 [inline] smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:920 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13Revert "ravb: add workaround for clock when resuming with WoL enabled"Geert Uytterhoeven1-25/+2
This reverts commit fbf3d034f2ff6264183cfa6845770e8cc2a986c8. As of commit 560869100b99a3da ("clk: renesas: cpg-mssr: Restore module clocks during resume"), the workaround is no longer needed. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-12bpf: add schedule points to map alloc/freeEric Dumazet1-0/+2
While using large percpu maps, htab_map_alloc() can hold cpu for hundreds of ms. This patch adds cond_resched() calls to percpu alloc/free call sites, all running in process context. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12Merge branch 'bpf-misc-fixes'Alexei Starovoitov4-19/+21
Daniel Borkmann says: ==================== Couple of outstanding fixes for BPF tree: 1) fixes a perf RB corruption, 2) and 3) fixes a few build issues from the recent bpf_perf_event.h uapi corrections. Thanks! ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12bpf: fix broken BPF selftest buildDaniel Borkmann2-12/+8
At least on x86_64, the kernel's BPF selftests seemed to have stopped to build due to 618e165b2a8e ("selftests/bpf: sync kernel headers and introduce arch support in Makefile"): [...] In file included from test_verifier.c:29:0: ../../../include/uapi/linux/bpf_perf_event.h:11:32: fatal error: asm/bpf_perf_event.h: No such file or directory #include <asm/bpf_perf_event.h> ^ compilation terminated. [...] While pulling in tools/arch/*/include/uapi/asm/bpf_perf_event.h seems to work fine, there's no automated fall-back logic right now that would do the same out of tools/include/uapi/asm-generic/bpf_perf_event.h. The usual convention today is to add a include/[uapi/]asm/ equivalent that would pull in the correct arch header or generic one as fall-back, all ifdef'ed based on compiler target definition. It's similarly done also in other cases such as tools/include/asm/barrier.h, thus adapt the same here. Fixes: 618e165b2a8e ("selftests/bpf: sync kernel headers and introduce arch support in Makefile") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12bpf: fix build issues on um due to mising bpf_perf_event.hDaniel Borkmann1-0/+1
Since c895f6f703ad ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type") um (uml) won't build on i386 or x86_64: [...] CC init/main.o In file included from ../include/linux/perf_event.h:18:0, from ../include/linux/trace_events.h:10, from ../include/trace/syscall.h:7, from ../include/linux/syscalls.h:82, from ../init/main.c:20: ../include/uapi/linux/bpf_perf_event.h:11:32: fatal error: asm/bpf_perf_event.h: No such file or directory #include <asm/bpf_perf_event.h> [...] Lets add missing bpf_perf_event.h also to um arch. This seems to be the only one still missing. Fixes: c895f6f703ad ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type") Reported-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Richard Weinberger <richard@sigma-star.at> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Randy Dunlap <rdunlap@infradead.org> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Richard Weinberger <richard@sigma-star.at> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12bpf: fix corruption on concurrent perf_event_output callsDaniel Borkmann1-7/+12
When tracing and networking programs are both attached in the system and both use event-output helpers that eventually call into perf_event_output(), then we could end up in a situation where the tracing attached program runs in user context while a cls_bpf program is triggered on that same CPU out of softirq context. Since both rely on the same per-cpu perf_sample_data, we could potentially corrupt it. This can only ever happen in a combination of the two types; all tracing programs use a bpf_prog_active counter to bail out in case a program is already running on that CPU out of a different context. XDP and cls_bpf programs by themselves don't have this issue as they run in the same context only. Therefore, split both perf_sample_data so they cannot be accessed from each other. Fixes: 20b9d7ac4852 ("bpf: avoid excessive stack usage for perf_sample_data") Reported-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Song Liu <songliubraving@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12tcp md5sig: Use skb's saddr when replying to an incoming segmentChristoph Paasch2-2/+2
The MD5-key that belongs to a connection is identified by the peer's IP-address. When we are in tcp_v4(6)_reqsk_send_ack(), we are replying to an incoming segment from tcp_check_req() that failed the seq-number checks. Thus, to find the correct key, we need to use the skb's saddr and not the daddr. This bug seems to have been there since quite a while, but probably got unnoticed because the consequences are not catastrophic. We will call tcp_v4_reqsk_send_ack only to send a challenge-ACK back to the peer, thus the connection doesn't really fail. Fixes: 9501f9722922 ("tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash().") Signed-off-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11fou: fix some member types in guehdrXin Long1-9/+9
guehdr struct is used to build or parse gue packets, which are always in big endian. It's better to define all guehdr members as __beXX types. Also, in validate_gue_flags it's not good to use a __be32 variable for both Standard flags(__be16) and Private flags (__be32), and pass it to other funcions. This patch could fix a bunch of sparse warnings from fou. Fixes: 5024c33ac354 ("gue: Add infrastructure for flags and options") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: make sure stream nums can match optlen in sctp_setsockopt_reset_streamsXin Long1-1/+5
Now in sctp_setsockopt_reset_streams, it only does the check optlen < sizeof(*params) for optlen. But it's not enough, as params->srs_number_streams should also match optlen. If the streams in params->srs_stream_list are less than stream nums in params->srs_number_streams, later when dereferencing the stream list, it could cause a slab-out-of-bounds crash, as reported by syzbot. This patch is to fix it by also checking the stream numbers in sctp_setsockopt_reset_streams to make sure at least it's not greater than the streams in the list. Fixes: 7f9d68ac944e ("sctp: implement sender-side procedures for SSN Reset Request Parameter") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11net: ipv4: fix for a race condition in raw_sendmsgMohamed Ghannam1-5/+10
inet->hdrincl is racy, and could lead to uninitialized stack pointer usage, so its value should be read only once. Fixes: c008ba5bdc9f ("ipv4: Avoid reading user iov twice after raw_probe_proto_opt") Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11netlink: Add netns check on tapsKevin Cernekee1-0/+3
Currently, a nlmon link inside a child namespace can observe systemwide netlink activity. Filter the traffic so that nlmon can only sniff netlink messages from its own netns. Test case: vpnns -- bash -c "ip link add nlmon0 type nlmon; \ ip link set nlmon0 up; \ tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" & sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \ spi 0x1 mode transport \ auth sha1 0x6162633132330000000000000000000000000000 \ enc aes 0x00000000000000000000000000000000 grep --binary abc123 /tmp/nlmon.pcap Signed-off-by: Kevin Cernekee <cernekee@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11net: sh_eth: do not advertise Gigabit capabilities when not availableThomas Petazzoni1-0/+10
Not all variants of the sh_eth hardware have Gigabit support. Unfortunately, the current driver doesn't tell the PHY about the limited MAC capabilities. Due to this, if you have a Gigabit capable PHY, the PHY will advertise its Gigabit capability and establish a link at 1Gbit/s, even though the MAC doesn't support it. In order to avoid this, we use the recently introduced phy_set_max_speed() to tell the PHY to not advertise speed higher than 100 MBit/s. Tested on a SH7786 platform, with a Gigabit PHY. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11net: phy: meson-gxl: detect LPA corruptionJerome Brunet1-1/+73
The purpose of this change is to fix the incorrect detection of the link partner (LP) advertised capabilities which sometimes happens with this PHY (roughly 1 time in a dozen) This issue may cause the link to be negotiated at 10Mbps/Full or 10Mbps/Half when 100MBps/Full is actually possible. In some case, the link is even completely broken and no communication is possible. To detect the corruption, we must look for a magic undocumented bit in the WOL bank (hint given by the SoC vendor kernel) but this is not enough to cover all cases. We also have to look at the LPA ack. If the LP supports Aneg but did not ack our base code when aneg is completed, we assume something went wrong. The detection of a corrupted LPA triggers a restart of the aneg process. This solves the problem but may take up to 6 retries to complete. Fixes: 7334b3e47aee ("net: phy: Add Meson GXL Internal PHY driver") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11netfilter: ip6t_MASQUERADE: add dependency on conntrack moduleKonstantin Khlebnikov1-1/+7
After commit 4d3a57f23dec ("netfilter: conntrack: do not enable connection tracking unless needed") conntrack is disabled by default unless some module explicitly declares dependency in particular network namespace. Fixes: a357b3f80bc8 ("netfilter: nat: add dependencies on conntrack module") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-12-11ptr_ring: add barriersMichael S. Tsirkin1-0/+9
Users of ptr_ring expect that it's safe to give the data structure a pointer and have it be available to consumers, but that actually requires an smb_wmb or a stronger barrier. In absence of such barriers and on architectures that reorder writes, consumer might read an un=initialized value from an skb pointer stored in the skb array. This was observed causing crashes. To fix, add memory barriers. The barrier we use is a wmb, the assumption being that producers do not need to read the value so we do not need to order these reads. Reported-by: George Cherian <george.cherian@cavium.com> Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11Merge tag 'mac80211-for-davem-2017-12-11' of ↵David S. Miller2-13/+40
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Three fixes: * for certificate C file generation, don't use hexdump as it's not always installed by default, use pure posix instead (od/sed) * for certificate C file generation, don't write the file if anything fails, so the build abort will not cause a bad build upon a second attempt * fix locking in ieee80211_sta_tear_down_BA_sessions() which had been causing lots of locking warnings ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11netfilter: exthdr: add missign attributes to policyFlorian Westphal1-0/+2
Add missing netlink attribute policy. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-12-11mac80211: fix locking in ieee80211_sta_tear_down_BA_sessionsJohannes Berg1-3/+2
Due to overlap between commit 1281103770e9 ("mac80211: Simplify locking in ieee80211_sta_tear_down_BA_sessions()") and the way that Luca modified commit 72e2c3438ba3 ("mac80211: tear down RX aggregations first") when sending it upstream from Intel's internal tree, we get the following warning: WARNING: CPU: 0 PID: 5472 at net/mac80211/agg-tx.c:315 ___ieee80211_stop_tx_ba_session+0x158/0x1f0 since there's no appropriate locking around the call to ___ieee80211_stop_tx_ba_session; Sara's original just had a call to the locked __ieee80211_stop_tx_ba_session (one less underscore) but it looks like Luca modified both of the calls when fixing it up for upstream, leading to the problem at hand. Move the locking appropriately to fix this problem. Reported-by: Kalle Valo <kvalo@codeaurora.org> Reported-by: Pavel Machek <pavel@ucw.cz> Tested-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-12-08kmemcheck: rip it out for realMichal Hocko13-13/+0
Commit 4675ff05de2d ("kmemcheck: rip it out") has removed the code but for some reason SPDX header stayed in place. This looks like a rebase mistake in the mmotm tree or the merge mistake. Let's drop those leftovers as well. Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds113-367/+1052
Pull networking fixes from David Miller: 1) CAN fixes from Martin Kelly (cancel URBs properly in all the CAN usb drivers). 2) Revert returning -EEXIST from __dev_alloc_name() as this propagates to userspace and broke some apps. From Johannes Berg. 3) Fix conn memory leaks and crashes in TIPC, from Jon Malloc and Cong Wang. 4) Gianfar MAC can't do EEE so don't advertise it by default, from Claudiu Manoil. 5) Relax strict netlink attribute validation, but emit a warning. From David Ahern. 6) Fix regression in checksum offload of thunderx driver, from Florian Westphal. 7) Fix UAPI bpf issues on s390, from Hendrik Brueckner. 8) New card support in iwlwifi, from Ihab Zhaika. 9) BBR congestion control bug fixes from Neal Cardwell. 10) Fix port stats in nfp driver, from Pieter Jansen van Vuuren. 11) Fix leaks in qualcomm rmnet, from Subash Abhinov Kasiviswanathan. 12) Fix DMA API handling in sh_eth driver, from Thomas Petazzoni. 13) Fix spurious netpoll warnings in bnxt_en, from Calvin Owens. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits) net: mvpp2: fix the RSS table entry offset tcp: evaluate packet losses upon RTT change tcp: fix off-by-one bug in RACK tcp: always evaluate losses in RACK upon undo tcp: correctly test congestion state in RACK bnxt_en: Fix sources of spurious netpoll warnings tcp_bbr: reset long-term bandwidth sampling on loss recovery undo tcp_bbr: reset full pipe detection on loss recovery undo tcp_bbr: record "full bw reached" decision in new full_bw_reached bit sfc: pass valid pointers from efx_enqueue_unwind gianfar: Disable EEE autoneg by default tcp: invalidate rate samples during SACK reneging can: peak/pcie_fd: fix potential bug in restarting tx queue can: usb_8dev: cancel urb on -EPIPE and -EPROTO can: kvaser_usb: cancel urb on -EPIPE and -EPROTO can: esd_usb2: cancel urb on -EPIPE and -EPROTO can: ems_usb: cancel urb on -EPIPE and -EPROTO can: mcba_usb: cancel urb on -EPROTO usbnet: fix alignment for frames with no ethernet header tcp: use current time in tcp_rcv_space_adjust() ...