summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2014-03-14xfrm: Introduce xfrm_input_afinfo to access the the callbacks properlySteffen Klassert3-1/+88
IPv6 can be build as a module, so we need mechanism to access the address family dependent callback functions properly. Therefore we introduce xfrm_input_afinfo, similar to that what we have for the address family dependent part of policies and states. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-03-07xfrm: rename struct xfrm_filterNicolas Dichtel3-7/+7
iproute2 already defines a structure with that name, let's use another one to avoid any conflict. CC: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-03-04cfg80211: add MPLS and 802.21 classificationSimon Wunderlich1-0/+16
MPLS labels may contain traffic control information, which should be evaluated and used by the wireless subsystem if present. Also check for IEEE 802.21 which is always network control traffic. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03tcp: snmp stats for Fast Open, SYN rtx, and data pktsYuchung Cheng6-4/+21
Add the following snmp stats: TCPFastOpenActiveFail: Fast Open attempts (SYN/data) failed beacuse the remote does not accept it or the attempts timed out. TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down retransmissions into SYN, fast-retransmits, timeout retransmits, etc. TCPOrigDataSent: number of outgoing packets with original data (excluding retransmission but including data-in-SYN). This counter is different from TcpOutSegs because TcpOutSegs also tracks pure ACKs. TCPOrigDataSent is more useful to track the TCP retransmission rate. Change TCPFastOpenActive to track only successful Fast Opens to be symmetric to TCPFastOpenPassive. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: Lawrence Brakmo <brakmo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03sch_tbf: Remove holes in struct tbf_sched_data.Hiroaki SHIMODA1-11/+12
On x86_64 we have 3 holes in struct tbf_sched_data. The member peak_present can be replaced with peak.rate_bytes_ps, because peak.rate_bytes_ps is set only when peak is specified in tbf_change(). tbf_peak_present() is introduced to test peak.rate_bytes_ps. The member max_size is moved to fill 32bit hole. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-026lowpan: use memcpy to set tag value in fraghdrAlexander Aring1-2/+1
Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-026lowpan: remove initialization of tag valueAlexander Aring1-1/+0
The initialization of the tag value doesn't matter at begin of fragmentation. This patch removes the initialization to zero. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-026lowpan: fix type of datagram size parameterAlexander Aring1-1/+1
Datagram size value is u16 because we convert it to host byte order and we need to read it. Only the tag value belongs to __be16 type. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-286lowpan: handling 6lowpan fragmentation via inet_frag apiAlexander Aring4-208/+680
This patch drops the current way of 6lowpan fragmentation on receiving side and replace it with a implementation which use the inet_frag api. The old fragmentation handling has some race conditions and isn't rfc4944 compatible. Also adding support to match fragments on destination address, source address, tag value and datagram_size which is missing in the current implementation. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-286lowpan: fix some checkpatch issuesAlexander Aring1-18/+9
Detected with: ./scripts/checkpatch.pl --strict -f net/ieee802154/6lowpan_rtnl.c Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-286lowpan: move 6lowpan.c to 6lowpan_rtnl.cAlexander Aring2-0/+1
We have a 6lowpan.c file and 6lowpan.ko file. To avoid confusing we should move 6lowpan.c to 6lowpan_rtnl.c. Then we can support multiple source files for 6lowpan module. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-286lowpan: change tag type to __be16Alexander Aring1-2/+3
Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-286lowpan: fix fragmentation on sending sideAlexander Aring1-10/+27
This patch fix the fragmentation on sending side according to rfc4944. Also add improvement to use the full payload of a PDU which calculate the nearest divided to 8 payload length for the fragmentation datagram size attribute. The main issue is that the datagram size of fragmentation header use the ipv6 payload length, but rfc4944 says it's the ipv6 payload length inclusive network header size (and transport header size if compressed). Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-286lowpan: add uncompress header size functionAlexander Aring1-0/+113
This patch add a lookup function for uncompressed 6LoWPAN header size. This is needed to estimate the real size after uncompress the 6LoWPAN header. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28packet: allow to transmit +4 byte in TX_RING slot for VLAN caseDaniel Borkmann1-3/+13
Commit 57f89bfa2140 ("network: Allow af_packet to transmit +4 bytes for VLAN packets.") added the possibility for non-mmaped frames to send extra 4 byte for VLAN header so the MTU increases from 1500 to 1504 byte, for example. Commit cbd89acb9eb2 ("af_packet: fix for sending VLAN frames via packet_mmap") attempted to fix that for the mmap part but was reverted as it caused regressions while using eth_type_trans() on output path. Lets just act analogous to 57f89bfa2140 and add a similar logic to TX_RING. We presume size_max as overcharged with +4 bytes and later on after skb has been built by tpacket_fill_skb() check for ETH_P_8021Q header on packets larger than normal MTU. Can be easily reproduced with a slightly modified trafgen in mmap(2) mode, test cases: { fill(0xff, 12) const16(0x8100) fill(0xff, <1504|1505>) } { fill(0xff, 12) const16(0x0806) fill(0xff, <1500|1501>) } Note that we need to do the test right after tpacket_fill_skb() as sockets can have PACKET_LOSS set where we would not fail but instead just continue to traverse the ring. Reported-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Ben Greear <greearb@candelatech.com> Cc: Phil Sutter <phil@nwl.cc> Tested-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27neigh: directly goto out after setting nud_state to NUD_FAILEDDuan Jiong1-0/+1
Because those following if conditions will not be matched. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27Merge branch 'master' of ↵David S. Miller10-178/+591
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== This is the rework of the IPsec virtual tunnel interface for ipv4 to support inter address family tunneling and namespace crossing. The only change to the last RFC version is a compile fix for an odd configuration where CONFIG_XFRM is set but CONFIG_INET is not set. 1) Add and use a IPsec protocol multiplexer. 2) Add xfrm_tunnel_skb_cb to the skb common buffer to store a receive callback there. 3) Make vti work with i_key set by not including the i_key when comupting the hash for the tunnel lookup in case of vti tunnels. 4) Update ip_vti to use it's own receive hook. 5) Remove xfrm_tunnel_notifier, this is replaced by the IPsec protocol multiplexer. 6) We need to be protocol family indepenent, so use the on xfrm_lookup returned dst_entry instead of the ipv4 rtable in vti_tunnel_xmit(). 7) Add support for inter address family tunneling. 8) Check if the tunnel endpoints of the xfrm state and the vti interface are matching and return an error otherwise. 8) Enable namespace crossing tor vti devices. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26tcp: switch rtt estimations to usec resolutionEric Dumazet14-160/+158
Upcoming congestion controls for TCP require usec resolution for RTT estimations. Millisecond resolution is simply not enough these days. FQ/pacing in DC environments also require this change for finer control and removal of bimodal behavior due to the current hack in tcp_update_pacing_rate() for 'small rtt' TCP_CONG_RTT_STAMP is no longer needed. As Julian Anastasov pointed out, we need to keep user compatibility : tcp_metrics used to export RTT and RTTVAR in msec resolution, so we added RTT_US and RTTVAR_US. An iproute2 patch is needed to use the new attributes if provided by the kernel. In this example ss command displays a srtt of 32 usecs (10Gbit link) lpk51:~# ./ss -i dst lpk52 Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port tcp ESTAB 0 1 10.246.11.51:42959 10.246.11.52:64614 cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448 cwnd:10 send 3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559 Updated iproute2 ip command displays : lpk51:~# ./ip tcp_metrics | grep 10.246.11.52 10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source 10.246.11.51 Old binary displays : lpk51:~# ip tcp_metrics | grep 10.246.11.52 10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source 10.246.11.51 With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Yuchung Cheng <ycheng@google.com> Cc: Larry Brakmo <brakmo@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ipv6: yet another new IPV6_MTU_DISCOVER option IPV6_PMTUDISC_OMITHannes Frederic Sowa2-5/+6
This option has the same semantic as IP_PMTUDISC_OMIT for IPv4 which got recently introduced. It doesn't honor the path mtu discovered by the host but in contrary to IPV6_PMTUDISC_INTERFACE allows the generation of fragments if the packet size exceeds the MTU of the outgoing interface MTU. Fixes: 93b36cf3425b9b ("ipv6: support IPV6_PMTU_INTERFACE on sockets") Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ipv4: yet another new IP_MTU_DISCOVER option IP_PMTUDISC_OMITHannes Frederic Sowa2-7/+4
IP_PMTUDISC_INTERFACE has a design error: because it does not allow the generation of fragments if the interface mtu is exceeded, it is very hard to make use of this option in already deployed name server software for which I introduced this option. This patch adds yet another new IP_MTU_DISCOVER option to not honor any path mtu information and not accepting new icmp notifications destined for the socket this option is enabled on. But we allow outgoing fragmentation in case the packet size exceeds the outgoing interface mtu. As such this new option can be used as a drop-in replacement for IP_PMTUDISC_DONT, which is currently in use by most name server software making the adoption of this option very smooth and easy. The original advantage of IP_PMTUDISC_INTERFACE is still maintained: ignoring incoming path MTU updates and not honoring discovered path MTUs in the output path. Fixes: 482fc6094afad5 ("ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE") Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ipv4: use ip_skb_dst_mtu to determine mtu in ip_fragmentHannes Frederic Sowa1-2/+1
ip_skb_dst_mtu mostly falls back to ip_dst_mtu_maybe_forward if no socket is attached to the skb (in case of forwarding) or determines the mtu like we do in ip_finish_output, which actually checks if we should branch to ip_fragment. Thus use the same function to determine the mtu here, too. This is important for the introduction of IP_PMTUDISC_OMIT, where we want the packets getting cut in pieces of the size of the outgoing interface mtu. IPv6 already does this correctly. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26neigh: probe application via netlink in NUD_PROBETimo Teräs1-4/+4
iproute2 arpd seems to expect this as there's code and comments to handle netlink probes with NUD_PROBE set. It is used to flush the arpd cached mappings. opennhrp instead turns off unicast probes (so it can handle all neighbour discovery). Without this change it will not see NUD_PROBE probes and cannot reconfirm the mapping. Thus currently neigh entry will just fail and can cause few packets dropped until broadcast discovery is restarted. Earlier discussion on the subject: http://marc.info/?t=139305877100001&r=1&w=2 Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26ipv6: log src and dst along with "udp checksum is 0"Bjørn Mork1-1/+3
These info messages are rather pointless without any means to identify the source of the bogus packets. Logging the src and dst addresses and ports may help a bit. Cc: Joe Perches <joe@perches.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26net: Add sysfs file for port numberAmir Vadai1-0/+2
Add a sysfs file to enable user space to query the device port number used by a netdevice instance. This is needed for devices that have multiple ports on the same PCI function. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26net: tcp: add mib counters to track zero window transitionsFlorian Westphal2-1/+14
Three counters are added: - one to track when we went from non-zero to zero window - one to track the reverse - one counter incremented when we want to announce zero window, but can't because we would shrink current window. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-25vti4: Enable namespace changingSteffen Klassert1-1/+0
vti4 is now fully namespace aware, so allow namespace changing for vti devices Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25vti4: Check the tunnel endpoints of the xfrm state and the vti interfaceSteffen Klassert1-5/+24
The tunnel endpoints of the xfrm_state we got from the xfrm_lookup must match the tunnel endpoints of the vti interface. This patch ensures this matching. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25vti4: Support inter address family tunneling.Steffen Klassert1-14/+34
With this patch we can tunnel ipv6 traffic via a vti4 interface. A vti4 interface can now have an ipv6 address and ipv6 traffic can be routed via a vti4 interface. The resulting traffic is xfrm transformed and tunneled throuhg ipv4 if matching IPsec policies and states are present. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25vti4: Use the on xfrm_lookup returned dst_entry directlySteffen Klassert1-11/+11
We need to be protocol family indepenent to support inter addresss family tunneling with vti. So use a dst_entry instead of the ipv4 rtable in vti_tunnel_xmit. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25xfrm4: Remove xfrm_tunnel_notifierSteffen Klassert1-68/+0
This was used from vti and is replaced by the IPsec protocol multiplexer hooks. It is now unused, so remove it. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25vti: Update the ipv4 side to use it's own receive hook.Steffen Klassert1-47/+187
With this patch, vti uses the IPsec protocol multiplexer to register it's own receive side hooks for ESP, AH and IPCOMP. Vti now does the following on receive side: 1. Do an input policy check for the IPsec packet we received. This is required because this packet could be already prosecces by IPsec, so an inbuond policy check is needed. 2. Mark the packet with the i_key. The policy and the state must match this key now. Policy and state belong to the outer namespace and policy enforcement is done at the further layers. 3. Call the generic xfrm layer to do decryption and decapsulation. 4. Wait for a callback from the xfrm layer to properly clean the skb to not leak informations on namespace and to update the device statistics. On transmit side: 1. Mark the packet with the o_key. The policy and the state must match this key now. 2. Do a xfrm_lookup on the original packet with the mark applied. 3. Check if we got an IPsec route. 4. Clean the skb to not leak informations on namespace transitions. 5. Attach the dst_enty we got from the xfrm_lookup to the skb. 6. Call dst_output to do the IPsec processing. 7. Do the device statistics. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25ip_tunnel: Make vti work with i_key setSteffen Klassert1-1/+5
Vti uses the o_key to mark packets that were transmitted or received by a vti interface. Unfortunately we can't apply different marks to in and outbound packets with only one key availabe. Vti interfaces typically use wildcard selectors for vti IPsec policies. On forwarding, the same output policy will match for both directions. This generates a loop between the IPsec gateways until the ttl of the packet is exceeded. The gre i_key/o_key are usually there to find the right gre tunnel during a lookup. When vti uses the i_key to mark packets, the tunnel lookup does not work any more because vti does not use the gre keys as a hash key for the lookup. This patch workarounds this my not including the i_key when comupting the hash for the tunnel lookup in case of vti tunnels. With this we have separate keys available for the transmitting and receiving side of the vti interface. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25xfrm: Add xfrm_tunnel_skb_cb to the skb common bufferSteffen Klassert2-0/+12
IPsec vti_rcv needs to remind the tunnel pointer to check it later at the vti_rcv_cb callback. So add this pointer to the IPsec common buffer, initialize it and check it to avoid transport state matching of a tunneled packet. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25ipcomp4: Use the IPsec protocol multiplexer APISteffen Klassert1-9/+17
Switch ipcomp4 to use the new IPsec protocol multiplexer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25ah4: Use the IPsec protocol multiplexer APISteffen Klassert1-9/+16
Switch ah4 to use the new IPsec protocol multiplexer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25esp4: Use the IPsec protocol multiplexer APISteffen Klassert1-9/+17
Switch esp4 to use the new IPsec protocol multiplexer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25xfrm4: Add IPsec protocol multiplexerSteffen Klassert4-16/+280
This patch add an IPsec protocol multiplexer. With this it is possible to add alternative protocol handlers as needed for IPsec virtual tunnel interfaces. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-24bridge: netfilter: Use ether_addr_copyJoe Perches5-7/+7
Convert the uses of memcpy to ether_addr_copy because for some architectures it is smaller and faster. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24bridge: Use ether_addr_copy and ETH_ALENJoe Perches3-5/+5
Convert the more obvious uses of memcpy to ether_addr_copy. There are still uses of memcpy that could be converted but these addresses are __aligned(2). Convert a couple uses of 6 in gr_private.h to ETH_ALEN. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24pktgen: document all supported flagsMathias Krause1-1/+7
The documentation misses a few of the supported flags. Fix this. Also respect the dependency to CONFIG_XFRM for the IPSEC flag. Cc: Fan Du <fan.du@windriver.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24pktgen: simplify error handling in pgctrl_write()Mathias Krause1-13/+6
The 'out' label is just a relict from previous times as pgctrl_write() had multiple error paths. Get rid of it and simply return right away on errors. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24pktgen: fix out-of-bounds access in pgctrl_write()Mathias Krause1-1/+4
If a privileged user writes an empty string to /proc/net/pktgen/pgctrl the code for stripping the (then non-existent) '\n' actually writes the zero byte at index -1 of data[]. The then still uninitialized array will very likely fail the command matching tests and the pr_warning() at the end will therefore leak stack bytes to the kernel log. Fix those issues by simply ensuring we're passed a non-empty string as the user API apparently expects a trailing '\n' for all commands. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24Merge branch 'master' of ↵David S. Miller8-138/+307
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== 1) Introduce skb_to_sgvec_nomark function to add further data to the sg list without calling sg_unmark_end first. Needed to add extended sequence number informations. From Fan Du. 2) Add IPsec extended sequence numbers support to the Authentication Header protocol for ipv4 and ipv6. From Fan Du. 3) Make the IPsec flowcache namespace aware, from Fan Du. 4) Avoid creating temporary SA for every packet when no key manager is registered. From Horia Geanta. 5) Support filtering of SA dumps to show only the SAs that match a given filter. From Nicolas Dichtel. 6) Remove caching of xfrm_policy_sk_bundles. The cached socket policy bundles are never used, instead we create a new cache entry whenever xfrm_lookup() is called on a socket policy. Most protocols cache the used routes to the socket, so this caching is not needed. 7) Fix a forgotten SADB_X_EXT_FILTER length check in pfkey, from Nicolas Dichtel. 8) Cleanup error handling of xfrm_state_clone. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-21xfrm: Cleanup error handling of xfrm_state_cloneSteffen Klassert1-11/+5
The error pointer passed to xfrm_state_clone() is unchecked, so remove it and indicate an error by returning a null pointer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-21pfkey: fix SADB_X_EXT_FILTER length checkNicolas Dichtel1-0/+1
This patch fixes commit d3623099d350 ("ipsec: add support of limited SA dump"). sadb_ext_min_len array should be updated with the new type (SADB_X_EXT_FILTER). Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-20Merge branch 'master' of ↵John W. Linville39-422/+773
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
2014-02-19tcp: use zero-window when free_space is lowFlorian Westphal1-2/+15
Currently the kernel tries to announce a zero window when free_space is below the current receiver mss estimate. When a sender is transmitting small packets and reader consumes data slowly (or not at all), receiver might be unable to shrink the receive win because a) we cannot withdraw already-commited receive window, and, b) we have to round the current rwin up to a multiple of the wscale factor, else we would shrink the current window. This causes the receive buffer to fill up until the rmem limit is hit. When this happens, we start dropping packets. Moreover, tcp_clamp_window may continue to grow sk_rcvbuf towards rmem[2] even if socket is not being read from. As we cannot avoid the "current_win is rounded up to multiple of mss" issue [we would violate a) above] at least try to prevent the receive buf growth towards tcp_rmem[2] limit by attempting to move to zero-window announcement when free_space becomes less than 1/16 of the current allowed receive buffer maximum. If tcp_rmem[2] is large, this will increase our chances to get a zero-window announcement out in time. Reproducer: On server: $ nc -l -p 12345 <suspend it: CTRL-Z> Client: #!/usr/bin/env python import socket import time sock = socket.socket() sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1) sock.connect(("192.168.4.1", 12345)); while True: sock.send('A' * 23) time.sleep(0.005) socket buffer on server-side will grow until tcp_rmem[2] is hit, at which point the client rexmits data until -EDTIMEOUT: tcp_data_queue invokes tcp_try_rmem_schedule which will call tcp_prune_queue which calls tcp_clamp_window(). And that function will grow sk->sk_rcvbuf up until it eventually hits tcp_rmem[2]. Thanks to Eric Dumazet for running regression tests. Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19tipc: failed transmissions should return errorErik Hugne1-8/+6
When a message could not be sent out because the destination node or link could not be found, the full message size is returned from sendmsg() as if it had been sent successfully. An application will then get a false indication that it's making forward progress. This problem has existed since the initial commit in 2.6.16. We change this to return -ENETUNREACH if the message cannot be delivered due to the destination node/link being unavailable. We also get rid of the redundant tipc_reject_msg call since freeing the buffer and doing a tipc_port_iovec_reject accomplishes exactly the same thing. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19ipv6: honor IPV6_PKTINFO with v4 mapped addresses on sendmsgHannes Frederic Sowa4-4/+22
In case we decide in udp6_sendmsg to send the packet down the ipv4 udp_sendmsg path because the destination is either of family AF_INET or the destination is an ipv4 mapped ipv6 address, we don't honor the maybe specified ipv4 mapped ipv6 address in IPV6_PKTINFO. We simply can check for this option in ip_cmsg_send because no calls to ipv6 module functions are needed to do so. Reported-by: Gert Doering <gert@space.net> Cc: Tore Anderson <tore@fud.no> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19xfrm: Remove caching of xfrm_policy_sk_bundlesSteffen Klassert1-28/+0
We currently cache socket policy bundles at xfrm_policy_sk_bundles. These cached bundles are never used. Instead we create and cache a new one whenever xfrm_lookup() is called on a socket policy. Most protocols cache the used routes to the socket, so let's remove the unused caching of socket policy bundles in xfrm. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>