summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2018-04-02net: socket: add __compat_sys_...msg() helpers; remove in-kernel calls to ↵Dominik Brodowski1-7/+30
compat syscalls Using the net-internal helpers __compat_sys_...msg() allows us to avoid the internal calls to the compat_sys_...msg() syscalls. compat_sys_recvmmsg() is handled in a different patch. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __compat_sys_recvmmsg() helper; remove in-kernel call to ↵Dominik Brodowski1-5/+12
compat syscall Using the net-internal helper __compat_sys_recvmmsg() allows us to avoid the internal calls to the compat_sys_recvmmsg() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __compat_sys_getsockopt() helper; remove in-kernel call to ↵Dominik Brodowski1-4/+12
compat syscall Using the net-internal helper __compat_sys_getsockopt() allows us to avoid the internal calls to the compat_sys_getsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __compat_sys_setsockopt() helper; remove in-kernel call to ↵Dominik Brodowski1-4/+10
compat syscall Using the net-internal helper __compat_sys_setsockopt() allows us to avoid the internal calls to the compat_sys_setsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __compat_sys_recvfrom() helper; remove in-kernel call to ↵Dominik Brodowski1-7/+16
compat syscall Using the net-internal helper __compat_sys_recvfrom() allows us to avoid the internal calls to the compat_sys_recvfrom() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: replace call to sys_recv() with __sys_recvfrom()Dominik Brodowski2-2/+4
sys_recv() merely expands the parameters to __sys_recvfrom() by NULL and NULL. Open-code this in the two places which used sys_recv() as a wrapper to __sys_recvfrom(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: replace calls to sys_send() with __sys_sendto()Dominik Brodowski2-2/+3
sys_send() merely expands the parameters to __sys_sendto() by NULL and 0. Open-code this in the two places which used sys_send() as a wrapper to __sys_sendto(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: move check for forbid_cmsg_compat to __sys_...msg()Dominik Brodowski2-18/+28
The non-compat codepaths for sys_...msg() verify that MSG_CMSG_COMPAT is not set. By moving this check to the __sys_...msg() functions (and making it dependent on a static flag passed to this function), we can call the __sys...msg() functions instead of the syscall functions in all cases. __sys_recvmmsg() does not need this trickery, as the check is handled within the do_sys_recvmmsg() function internal to net/socket.c. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add do_sys_recvmmsg() helper; remove in-kernel call to syscallDominik Brodowski1-5/+12
Using the net-internal helper do_sys_recvmmsg() allows us to avoid the internal calls to the sys_getsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_getsockopt() helper; remove in-kernel call to syscallDominik Brodowski1-4/+10
Using the net-internal helper __sys_getsockopt() allows us to avoid the internal calls to the sys_getsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_setsockopt() helper; remove in-kernel call to syscallDominik Brodowski1-3/+10
Using the net-internal helper __sys_setsockopt() allows us to avoid the internal calls to the sys_setsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_shutdown() helper; remove in-kernel call to syscallDominik Brodowski2-3/+8
Using the net-internal helper __sys_shutdown() allows us to avoid the internal calls to the sys_shutdown() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_socketpair() helper; remove in-kernel call to syscallDominik Brodowski2-4/+9
Using the net-internal helper __sys_socketpair() allows us to avoid the internal calls to the sys_socketpair() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_getpeername() helper; remove in-kernel call to syscallDominik Brodowski2-5/+11
Using the net-internal helper __sys_getpeername() allows us to avoid the internal calls to the sys_getpeername() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_getsockname() helper; remove in-kernel call to syscallDominik Brodowski2-5/+11
Using the net-internal helper __sys_getsockname() allows us to avoid the internal calls to the sys_getsockname() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_listen() helper; remove in-kernel call to syscallDominik Brodowski2-3/+8
Using the net-internal helper __sys_listen() allows us to avoid the internal calls to the sys_listen() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_connect() helper; remove in-kernel call to syscallDominik Brodowski2-4/+9
Using the net-internal helper __sys_connect() allows us to avoid the internal calls to the sys_connect() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_bind() helper; remove in-kernel call to syscallDominik Brodowski2-3/+8
Using the net-internal helper __sys_bind() allows us to avoid the internal calls to the sys_bind() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_socket() helper; remove in-kernel call to syscallDominik Brodowski2-3/+8
Using the net-internal helper __sys_socket() allows us to avoid the internal calls to the sys_socket() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_accept4() helper; remove in-kernel call to syscallDominik Brodowski2-9/+15
Using the net-internal helper __sys_accept4() allows us to avoid the internal calls to the sys_accept4() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_sendto() helper; remove in-kernel call to syscallDominik Brodowski2-8/+14
Using the net-internal helper __sys_sendto() allows us to avoid the internal calls to the sys_sendto() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_recvfrom() helper; remove in-kernel call to syscallDominik Brodowski2-9/+15
Using the net-internal helper __sys_recvfrom() allows us to avoid the internal calls to the sys_recvfrom() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-03-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds58-357/+433
Pull networking fixes from David Miller: 1) Use an appropriate TSQ pacing shift in mac80211, from Toke Høiland-Jørgensen. 2) Just like ipv4's ip_route_me_harder(), we have to use skb_to_full_sk in ip6_route_me_harder, from Eric Dumazet. 3) Fix several shutdown races and similar other problems in l2tp, from James Chapman. 4) Handle missing XDP flush properly in tuntap, for real this time. From Jason Wang. 5) Out-of-bounds access in powerpc ebpf tailcalls, from Daniel Borkmann. 6) Fix phy_resume() locking, from Andrew Lunn. 7) IFLA_MTU values are ignored on newlink for some tunnel types, fix from Xin Long. 8) Revert F-RTO middle box workarounds, they only handle one dimension of the problem. From Yuchung Cheng. 9) Fix socket refcounting in RDS, from Ka-Cheong Poon. 10) Don't allow ppp unit registration to an unregistered channel, from Guillaume Nault. 11) Various hv_netvsc fixes from Stephen Hemminger. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (98 commits) hv_netvsc: propagate rx filters to VF hv_netvsc: filter multicast/broadcast hv_netvsc: defer queue selection to VF hv_netvsc: use napi_schedule_irqoff hv_netvsc: fix race in napi poll when rescheduling hv_netvsc: cancel subchannel setup before halting device hv_netvsc: fix error unwind handling if vmbus_open fails hv_netvsc: only wake transmit queue if link is up hv_netvsc: avoid retry on send during shutdown virtio-net: re enable XDP_REDIRECT for mergeable buffer ppp: prevent unregistered channels from connecting to PPP units tc-testing: skbmod: fix match value of ethertype mlxsw: spectrum_switchdev: Check success of FDB add operation net: make skb_gso_*_seglen functions private net: xfrm: use skb_gso_validate_network_len() to check gso sizes net: sched: tbf: handle GSO_BY_FRAGS case in enqueue net: rename skb_gso_validate_mtu -> skb_gso_validate_network_len rds: Incorrect reference counting in TCP socket creation net: ethtool: don't ignore return from driver get_fecparam method vrf: check forwarding on the original netdevice when generating ICMP dest unreachable ...
2018-03-04net: make skb_gso_*_seglen functions privateDaniel Axtens1-2/+35
They're very hard to use properly as they do not consider the GSO_BY_FRAGS case. Code should use skb_gso_validate_network_len and skb_gso_validate_mac_len as they do consider this case. Make the seglen functions static, which stops people using them outside of skbuff.c Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04net: xfrm: use skb_gso_validate_network_len() to check gso sizesDaniel Axtens2-2/+3
Replace skb_gso_network_seglen() with skb_gso_validate_network_len(), as it considers the GSO_BY_FRAGS case. Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04net: sched: tbf: handle GSO_BY_FRAGS case in enqueueDaniel Axtens1-1/+2
tbf_enqueue() checks the size of a packet before enqueuing it. However, the GSO size check does not consider the GSO_BY_FRAGS case, and so will drop GSO SCTP packets, causing a massive drop in throughput. Use skb_gso_validate_mac_len() instead, as it does consider that case. Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04net: rename skb_gso_validate_mtu -> skb_gso_validate_network_lenDaniel Axtens8-12/+13
If you take a GSO skb, and split it into packets, will the network length (L3 headers + L4 headers + payload) of those packets be small enough to fit within a given MTU? skb_gso_validate_mtu gives you the answer to that question. However, we recently added to add a way to validate the MAC length of a split GSO skb (L2+L3+L4+payload), and the names get confusing, so rename skb_gso_validate_mtu to skb_gso_validate_network_len Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-03Merge tag 'batadv-net-for-davem-20180302' of git://git.open-mesh.org/linux-mergeDavid S. Miller9-39/+50
Simon Wunderlich says: ==================== Here are some batman-adv bugfixes: - fix skb checksum issues, by Matthias Schiffer (2 patches) - fix exception handling when dumping data objects through netlink, by Sven Eckelmann (4 patches) - fix handling of interface indices, by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller11-39/+98
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Put back reference on CLUSTERIP configuration structure from the error path, patch from Florian Westphal. 2) Put reference on CLUSTERIP configuration instead of freeing it, another cpu may still be walking over it, also from Florian. 3) Refetch pointer to IPv6 header from nf_nat_ipv6_manip_pkt() given packet manipulation may reallocation the skbuff header, from Florian. 4) Missing match size sanity checks in ebt_among, from Florian. 5) Convert BUG_ON to WARN_ON in ebtables, from Florian. 6) Sanity check userspace offsets from ebtables kernel, from Florian. 7) Missing checksum replace call in flowtable IPv4 DNAT, from Felix Fietkau. 8) Bump the right stats on checksum error from bridge netfilter, from Taehee Yoo. 9) Unset interface flag in IPv6 fib lookups otherwise we get misleading routing lookup results, from Florian. 10) Missing sk_to_full_sk() in ip6_route_me_harder() from Eric Dumazet. 11) Don't allow devices to be part of multiple flowtables at the same time, this may break setups. 12) Missing netlink attribute validation in flowtable deletion. 13) Wrong array index in nf_unregister_net_hook() call from error path in flowtable addition path. 14) Fix FTP IPVS helper when NAT mangling is in place, patch from Julian Anastasov. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-02Merge tag 'mac80211-for-davem-2018-03-02' of ↵David S. Miller3-9/+14
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Three more patches: * fix for a regression in 4-addr mode with fast-RX * fix for a Kconfig problem with the new regdb * fix for the long-standing TCP performance issue in wifi using the new sk_pacing_shift_update() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-02rds: Incorrect reference counting in TCP socket creationKa-Cheong Poon1-3/+11
Commit 0933a578cd55 ("rds: tcp: use sock_create_lite() to create the accept socket") has a reference counting issue in TCP socket creation when accepting a new connection. The code uses sock_create_lite() to create a kernel socket. But it does not do __module_get() on the socket owner. When the connection is shutdown and sock_release() is called to free the socket, the owner's reference count is decremented and becomes incorrect. Note that this bug only shows up when the socket owner is configured as a kernel module. v2: Update comments Fixes: 0933a578cd55 ("rds: tcp: use sock_create_lite() to create the accept socket") Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01net: ethtool: don't ignore return from driver get_fecparam methodEdward Cree1-1/+4
If ethtool_ops->get_fecparam returns an error, pass that error on to the user, rather than ignoring it. Fixes: 1a5f3da20bd9 ("net: ethtool: add support for forward error correction modes") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01vrf: check forwarding on the original netdevice when generating ICMP dest ↵Stephen Suryaputra1-1/+10
unreachable When ip_error() is called the device is the l3mdev master instead of the original device. So the forwarding check should be on the original one. Changes from v2: - Handle the original device disappearing (per David Ahern) - Minimize the change in code order Changes from v1: - Only need to reset the device on which __in_dev_get_rcu() is done (per David Ahern). Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01net: allow interface to be set into VRF if VLAN interface in same VRFMike Manning1-5/+9
Setting an interface into a VRF fails with 'RTNETLINK answers: File exists' if one of its VLAN interfaces is already in the same VRF. As the VRF is an upper device of the VLAN interface, it is also showing up as an upper device of the interface itself. The solution is to restrict this check to devices other than master. As only one master device can be linked to a device, the check in this case is that the upper device (VRF) being linked to is not the same as the master device instead of it not being any one of the upper devices. The following example shows an interface ens12 (with a VLAN interface ens12.10) being set into VRF green, which behaves as expected: # ip link add link ens12 ens12.10 type vlan id 10 # ip link set dev ens12 master vrfgreen # ip link show dev ens12 3: ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master vrfgreen state UP mode DEFAULT group default qlen 1000 link/ether 52:54:00:4c:a0:45 brd ff:ff:ff:ff:ff:ff But if the VLAN interface has previously been set into the same VRF, then setting the interface into the VRF fails: # ip link set dev ens12 nomaster # ip link set dev ens12.10 master vrfgreen # ip link show dev ens12.10 39: ens12.10@ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vrfgreen state UP mode DEFAULT group default qlen 1000 link/ether 52:54:00:4c:a0:45 brd ff:ff:ff:ff:ff:ff # ip link set dev ens12 master vrfgreen RTNETLINK answers: File exists The workaround is to move the VLAN interface back into the default VRF beforehand, but it has to be shut first so as to avoid the risk of traffic leaking from the VRF. This fix avoids needing this workaround. Signed-off-by: Mike Manning <mmanning@att.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01net: ipv4: avoid unused variable warning for sysctlArnd Bergmann1-2/+1
The newly introudced ip_min_valid_pmtu variable is only used when CONFIG_SYSCTL is set: net/ipv4/route.c:135:12: error: 'ip_min_valid_pmtu' defined but not used [-Werror=unused-variable] This moves it to the other variables like it, to avoid the harmless warning. Fixes: c7272c2f1229 ("net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28ipvs: remove IPS_NAT_MASK check to fix passive FTPJulian Anastasov1-1/+1
The IPS_NAT_MASK check in 4.12 replaced previous check for nfct_nat() which was needed to fix a crash in 2.6.36-rc, see commit 7bcbf81a2296 ("ipvs: avoid oops for passive FTP"). But as IPVS does not set the IPS_SRC_NAT and IPS_DST_NAT bits, checking for IPS_NAT_MASK prevents PASV response to be properly mangled and blocks the transfer. Remove the check as it is not needed after 3.12 commit 41d73ec053d2 ("netfilter: nf_conntrack: make sequence number adjustments usuable without NAT") which changes nfct_nat() with nfct_seqadj() and especially after 3.13 commit b25adce16064 ("ipvs: correct usage/allocation of seqadj ext in ipvs"). Thanks to Li Shuang and Florian Westphal for reporting the problem! Reported-by: Li Shuang <shuali@redhat.com> Fixes: be7be6e161a2 ("netfilter: ipvs: fix incorrect conflict resolution") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-28mlxsw: spectrum: Fix handling of resource_size_paramJiri Pirko1-3/+4
Current code uses global variables, adjusts them and passes pointer down to devlink. With every other mlxsw_core instance, the previously passed pointer values are rewritten. Fix this by de-globalize the variables and also memcpy size_params during devlink resource registration. Also, introduce a convenient size_param_init helper. Fixes: ef3116e5403e ("mlxsw: spectrum: Register KVD resources with devlink") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28net/smc: fix NULL pointer dereference on sock_create_kern() error pathDavide Caratti1-1/+3
when sock_create_kern(..., a) returns an error, 'a' might not be a valid pointer, so it shouldn't be dereferenced to read a->sk->sk_sndbuf and and a->sk->sk_rcvbuf; not doing that caused the following crash: general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 4254 Comm: syzkaller919713 Not tainted 4.16.0-rc1+ #18 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:smc_create+0x14e/0x300 net/smc/af_smc.c:1410 RSP: 0018:ffff8801b06afbc8 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff8801b63457c0 RCX: ffffffff85a3e746 RDX: 0000000000000004 RSI: 00000000ffffffff RDI: 0000000000000020 RBP: ffff8801b06afbf0 R08: 00000000000007c0 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff8801b6345c08 R14: 00000000ffffffe9 R15: ffffffff8695ced0 FS: 0000000001afb880(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000040 CR3: 00000001b0721004 CR4: 00000000001606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __sock_create+0x4d4/0x850 net/socket.c:1285 sock_create net/socket.c:1325 [inline] SYSC_socketpair net/socket.c:1409 [inline] SyS_socketpair+0x1c0/0x6f0 net/socket.c:1366 do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x26/0x9b RIP: 0033:0x4404b9 RSP: 002b:00007fff44ab6908 EFLAGS: 00000246 ORIG_RAX: 0000000000000035 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004404b9 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000000000002b RBP: 00007fff44ab6910 R08: 0000000000000002 R09: 00007fff44003031 R10: 0000000020000040 R11: 0000000000000246 R12: ffffffffffffffff R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000000 Code: 48 c1 ea 03 80 3c 02 00 0f 85 b3 01 00 00 4c 8b a3 48 04 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 20 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 82 01 00 00 4d 8b 7c 24 20 48 b8 00 00 00 00 RIP: smc_create+0x14e/0x300 net/smc/af_smc.c:1410 RSP: ffff8801b06afbc8 Fixes: cd6851f30386 smc: remote memory buffers (RMBs) Reported-and-tested-by: syzbot+aa0227369be2dcc26ebe@syzkaller.appspotmail.com Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28net/smc: use link_id of server in confirm link replyKarsten Graul2-1/+2
The CONFIRM LINK reply message must contain the link_id sent by the server. And set the link_id explicitly when initializing the link. Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28net/smc: use a constant for control message lengthKarsten Graul2-2/+2
The sizeof(struct smc_cdc_msg) evaluates to 48 bytes instead of the required 44 bytes. We need to use the constant value of SMC_WR_TX_SIZE to set and check the control message length. Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28net/tcp/illinois: replace broken algorithm reference linkJoey Pabalinas1-1/+1
The link to the pdf containing the algorithm description is now a dead link; it seems http://www.ifp.illinois.edu/~srikant/ has been moved to https://sites.google.com/a/illinois.edu/srikant/ and none of the original papers can be found there... I have replaced it with the only working copy I was able to find. n.b. there is also a copy available at: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.296.6350&rep=rep1&type=pdf However, this seems to only be a *cached* version, so I am unsure exactly how reliable that link can be expected to remain over time and have decided against using that one. Signed-off-by: Joey Pabalinas <joeypabalinas@gmail.com> 1 file changed, 1 insertion(+), 1 deletion(-) Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28tcp: purge write queue upon RSTSoheil Hassas Yeganeh1-0/+1
When the connection is reset, there is no point in keeping the packets on the write queue until the connection is closed. RFC 793 (page 70) and RFC 793-bis (page 64) both suggest purging the write queue upon RST: https://tools.ietf.org/html/draft-ietf-tcpm-rfc793bis-07 Moreover, this is essential for a correct MSG_ZEROCOPY implementation, because userspace cannot call close(fd) before receiving zerocopy signals even when the connection is reset. Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY") Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28tcp: revert F-RTO extension to detect more spurious timeoutsYuchung Cheng1-18/+12
This reverts commit 89fe18e44f7ee5ab1c90d0dff5835acee7751427. While the patch could detect more spurious timeouts, it could cause poor TCP performance on broken middle-boxes that modifies TCP packets (e.g. receive window, SACK options). Since the performance gain is much smaller compared to the potential loss. The best solution is to fully revert the change. Fixes: 89fe18e44f7e ("tcp: extend F-RTO to catch more spurious timeouts") Reported-by: Teodor Milkov <tm@del.bg> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28tcp: revert F-RTO middle-box workaroundYuchung Cheng1-10/+7
This reverts commit cc663f4d4c97b7297fb45135ab23cfd508b35a77. While fixing some broken middle-boxes that modifies receive window fields, it does not address middle-boxes that strip off SACK options. The best solution is to fully revert this patch and the root F-RTO enhancement. Fixes: cc663f4d4c97 ("tcp: restrict F-RTO to work-around broken middle-boxes") Reported-by: Teodor Milkov <tm@del.bg> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27tls: Use correct sk->sk_prot for IPV6Boris Pismenny1-15/+37
The tls ulp overrides sk->prot with a new tls specific proto structs. The tls specific structs were previously based on the ipv4 specific tcp_prot sturct. As a result, attaching the tls ulp to an ipv6 tcp socket replaced some ipv6 callback with the ipv4 equivalents. This patch adds ipv6 tls proto structs and uses them when attached to ipv6 sockets. Fixes: 3c4d7559159b ('tls: kernel TLS support') Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27sit: fix IFLA_MTU ignored on NEWLINKXin Long1-0/+7
Commit 128bb975dc3c ("ip6_gre: init dev->mtu and dev->hard_header_len correctly") fixed IFLA_MTU ignored on NEWLINK for ip6_gre. The same mtu fix is also needed for sit. Note that dev->hard_header_len setting for sit works fine, no need to fix it. sit is actually ipv4 tunnel, it can't call ip6_tnl_change_mtu to set mtu. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27ip6_tunnel: fix IFLA_MTU ignored on NEWLINKXin Long1-4/+8
Commit 128bb975dc3c ("ip6_gre: init dev->mtu and dev->hard_header_len correctly") fixed IFLA_MTU ignored on NEWLINK for ip6_gre. The same mtu fix is also needed for ip6_tunnel. Note that dev->hard_header_len setting for ip6_tunnel works fine, no need to fix it. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27ip_gre: fix IFLA_MTU ignored on NEWLINKXin Long1-5/+0
It's safe to remove the setting of dev's needed_headroom and mtu in __gre_tunnel_init, as discussed in [1], ip_tunnel_newlink can do it properly. Now Eric noticed that it could cover the mtu value set in do_setlink when creating a ip_gre dev. It makes IFLA_MTU param not take effect. So this patch is to remove them to make IFLA_MTU work, as in other ipv4 tunnels. [1]: https://patchwork.ozlabs.org/patch/823504/ Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Reported-by: Eric Garver <e@erig.me> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27netfilter: nf_tables: use the right index from flowtable error pathPablo Neira Ayuso1-1/+1
Use the right loop index, not the number of devices in the array that we need to remove, the following message uncovered the problem: [ 5437.044119] hook not found, pf 5 num 0 [ 5437.044140] WARNING: CPU: 2 PID: 24983 at net/netfilter/core.c:376 __nf_unregister_net_hook+0x250/0x280 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-27tipc: correct initial value for group congestion flagJon Maloy2-0/+2
In commit 60c253069632 ("tipc: fix race between poll() and setsockopt()") we introduced a pointer from struct tipc_group to the 'group_is_connected' flag in struct tipc_sock, so that this field can be checked without dereferencing the group pointer of the latter struct. The initial value for this flag is correctly set to 'false' when a group is created, but we miss the case when no group is created at all, in which case the initial value should be 'true'. This has the effect that SOCK_RDM/DGRAM sockets sending datagrams never receive POLLOUT if they request so. This commit corrects this bug. Fixes: 60c253069632 ("tipc: fix race between poll() and setsockopt()") Reported-by: Hoang Le <hoang.h.le@dektek.com.au> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>