summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2017-07-01sctp: remove the typedef sctp_datahdr_tXin Long1-3/+3
This patch is to remove the typedef sctp_datahdr_t, and replace with struct sctp_datahdr in the places where it's using this typedef. It is also to use izeof(variable) instead of sizeof(type). Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01sctp: remove the typedef sctp_param_action_tXin Long1-2/+2
Remove this typedef, there is even no places using it. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01sctp: remove the typedef sctp_param_tXin Long2-10/+8
This patch is to remove the typedef sctp_param_t, and replace with struct sctp_paramhdr in the places where it's using this typedef. It is also to remove the useless declaration sctp_addip_addr_config and fix the lack of params for some other functions' declaration. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01sctp: remove the typedef sctp_paramhdr_tXin Long3-25/+26
This patch is to remove the typedef sctp_paramhdr_t, and replace with struct sctp_paramhdr in the places where it's using this typedef. It is also to fix some indents and use sizeof(variable) instead of sizeof(type). Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01sctp: remove the typedef sctp_cid_action_tXin Long1-2/+2
Remove this typedef, there is even no places using it. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01sctp: remove the typedef sctp_cid_tXin Long4-7/+9
This patch is to remove the typedef sctp_cid_t, and replace with struct sctp_cid in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01sctp: remove the typedef sctp_chunkhdr_tXin Long2-18/+18
This patch is to remove the typedef sctp_chunkhdr_t, and replace with struct sctp_chunkhdr in the places where it's using this typedef. It is also to fix some indents and use sizeof(variable) instead of sizeof(type)., especially in sctp_new. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01sctp: remove the typedef sctp_sctphdr_tXin Long1-2/+2
This patch is to remove the typedef sctp_sctphdr_t, and replace with struct sctphdr in the places where it's using this typedef. It is also to fix some indents and use sizeof(variable) instead of sizeof(type). Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: switchdev: add SET_SWITCHDEV_OPS helperSimon Horman1-0/+4
Add a helper to allow switchdev ops to be set if NET_SWITCHDEV is configured and do nothing otherwise. This allows for slightly cleaner code which uses switchdev but does not select NET_SWITCHDEV. Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_tReshetova, Elena1-4/+4
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert net.passive from atomic_t to refcount_tReshetova, Elena1-1/+2
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert inet_frag_queue.refcnt from atomic_t to refcount_tReshetova, Elena1-2/+2
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert fib_rule.refcnt from atomic_t to refcount_tReshetova, Elena1-3/+4
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert unix_address.refcnt from atomic_t to refcount_tReshetova, Elena1-1/+2
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert netpoll_info.refcnt from atomic_t to refcount_tReshetova, Elena1-1/+2
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert in_device.refcnt from atomic_t to refcount_tReshetova, Elena1-5/+6
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert ip_mc_list.refcnt from atomic_t to refcount_tReshetova, Elena1-1/+2
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert sock.sk_refcnt from atomic_t to refcount_tReshetova, Elena3-14/+16
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. This patch uses refcount_inc_not_zero() instead of atomic_inc_not_zero_hint() due to absense of a _hint() version of refcount API. If the hint() version must be used, we might need to revisit API. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert sock.sk_wmem_alloc from atomic_t to refcount_tReshetova, Elena2-5/+5
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_tReshetova, Elena1-2/+2
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert sk_buff.users from atomic_t to refcount_tReshetova, Elena1-5/+5
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert nf_bridge_info.use from atomic_t to refcount_tReshetova, Elena2-4/+4
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert neigh_params.refcnt from atomic_t to refcount_tReshetova, Elena1-3/+3
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert neighbour.refcnt from atomic_t to refcount_tReshetova, Elena3-6/+7
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert inet_peer.refcnt from atomic_t to refcount_tReshetova, Elena1-2/+2
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. This conversion requires overall +1 on the whole refcounting scheme. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller6-9/+10
A set of overlapping changes in macvlan and the rocker driver, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller6-18/+40
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. This batch contains connection tracking updates for the cleanup iteration path, patches from Florian Westphal: X) Skip unconfirmed conntracks in nf_ct_iterate_cleanup_net(), just set dying bit to let the CPU release them. X) Add nf_ct_iterate_destroy() to be used on module removal, to kill conntrack from all namespace. X) Restart iteration on hashtable resizing, since both may occur at the same time. X) Use the new nf_ct_iterate_destroy() to remove conntrack with NAT mapping on module removal. X) Use nf_ct_iterate_destroy() to remove conntrack entries helper module removal, from Liping Zhang. X) Use nf_ct_iterate_cleanup_net() to remove the timeout extension if user requests this, also from Liping. X) Add net_ns_barrier() and use it from FTP helper, so make sure no concurrent namespace removal happens at the same time while the helper module is being removed. X) Use NFPROTO_MAX in layer 3 conntrack protocol array, to reduce module size. Same thing in nf_tables. Updates for the nf_tables infrastructure: X) Prepare usage of the extended ACK reporting infrastructure for nf_tables. X) Remove unnecessary forward declaration in nf_tables hash set. X) Skip set size estimation if number of element is not specified. X) Changes to accomodate a (faster) unresizable hash set implementation, for anonymous sets and dynamic size fixed sets with no timeouts. X) Faster lookup function for unresizable hash table for 2 and 4 bytes key. And, finally, a bunch of asorted small updates and cleanups: X) Do not hold reference to netdev from ipt_CLUSTER, instead subscribe to device events and look up for index from the packet path, this is fixing an issue that is present since the very beginning, patch from Xin Long. X) Use nf_register_net_hook() in ipt_CLUSTER, from Florian Westphal. X) Use ebt_invalid_target() whenever possible in the ebtables tree, from Gao Feng. X) Calm down compilation warning in nf_dup infrastructure, patch from stephen hemminger. X) Statify functions in nftables rt expression, also from stephen. X) Update Makefile to use canonical method to specify nf_tables-objs. From Jike Song. X) Use nf_conntrack_helpers_register() in amanda and H323. X) Space cleanup for ctnetlink, from linzhang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30ieee80211: update public action codesPeter Oh1-1/+34
Update Public Action field values as updated in IEEE Std 802.11-2016, so that modules/drivers can refer it. Signed-off-by: Peter Oh <peter.oh@bowerswilkins.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-06-30nl80211: Don't verify owner_nlportid on NAN commandsAndrei Otcheretianski1-5/+4
If NAN interface is created with NL80211_ATTR_SOCKET_OWNER, the socket that is used to create the interface is used for all NAN operations and reporting NAN events. However, it turns out that sending commands and receiving events on the same socket is not possible in a completely race-free way: If the socket buffer is overflowed by the events, the command response will not be sent. In that case the caller will block forever on recv. Using non-blocking socket for commands is more complicated and still the command response or ack may not be received. So, keep unicasting NAN events to the interface creator, but allow using a different socket for commands. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-5/+2
Pull networking fixes from David Miller: 1) Need to access netdev->num_rx_queues behind an accessor in netvsc driver otherwise the build breaks with some configs, from Arnd Bergmann. 2) Add dummy xfrm_dev_event() so that build doesn't fail when CONFIG_XFRM_OFFLOAD is not set. From Hangbin Liu. 3) Don't OOPS when pfkey_msg2xfrm_state() signals an erros, from Dan Carpenter. 4) Fix MCDI command size for filter operations in sfc driver, from Martin Habets. 5) Fix UFO segmenting so that we don't calculate incorrect checksums, from Michal Kubecek. 6) When ipv6 datagram connects fail, reset destination address and port. From Wei Wang. 7) TCP disconnect must reset the cached receive DST, from WANG Cong. 8) Fix sign extension bug on 32-bit in dev_get_stats(), from Eric Dumazet. 9) fman driver has to depend on HAS_DMA, from Madalin Bucur. 10) Fix bpf pointer leak with xadd in verifier, from Daniel Borkmann. 11) Fix negative page counts with GFO, from Michal Kubecek. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) sfc: fix attempt to translate invalid filter ID net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish() bpf: prevent leaking pointer via xadd on unpriviledged arcnet: com20020-pci: add missing pdev setup in netdev structure arcnet: com20020-pci: fix dev_id calculation arcnet: com20020: remove needless base_addr assignment Trivial fix to spelling mistake in arc_printk message arcnet: change irq handler to lock irqsave rocker: move dereference before free mlxsw: spectrum_router: Fix NULL pointer dereference net: sched: Fix one possible panic when no destroy callback virtio-net: serialize tx routine during reset net: usb: asix88179_178a: Add support for the Belkin B2B128 fsl/fman: add dependency on HAS_DMA net: prevent sign extension in dev_get_stats() tcp: reset sk_rx_dst in tcp_disconnect() net: ipv6: reset daddr and dport in sk if connect() fails bnx2x: Don't log mc removal needlessly bnxt_en: Fix netpoll handling. bnxt_en: Add missing logic to handle TPA end error conditions. ...
2017-06-29bpf: Add syscall lookup support for fd array and htabMartin KaFai Lau1-0/+3
This patch allows userspace to do BPF_MAP_LOOKUP_ELEM on BPF_MAP_TYPE_PROG_ARRAY, BPF_MAP_TYPE_ARRAY_OF_MAPS and BPF_MAP_TYPE_HASH_OF_MAPS. The lookup returns a prog-id or map-id to the userspace. The userspace can then use the BPF_PROG_GET_FD_BY_ID or BPF_MAP_GET_FD_BY_ID to get a fd. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29Merge tag 'mlx5-updates-2017-06-27' of ↵David S. Miller5-4/+334
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-06-27 (Innova IPsec offload support) This patchset adds support for Innova IPSec network interface card. About Innova device: -------------------- Innova is a network card with a ConnectX chip and an FPGA chip as a bump-on-the-wire. Internal +----------+ Link +-----------------+ | +--------------+ FPGA | +------+ | ConnectX | | Shell +--+ QSFP | | +--------------+ +-------+ | | Port | +----------+ I2C | | SBU | | +------+ | +-------+ | +--+----------+---+ | | +--+--+ +---+---+ | DDR | | Flash | +-----+ +-------+ The FPGA synthesized logic is loaded from dedicated flash storage and has access to its own dedicated DDR RAM. The ConnectX chip firmware programs the FPGA by accessing its configuration space over either the slow internal I2C link or the high-speed internal link. The FPGA logic is divided into a "Shell" and a "Sandbox Unit" (SBU). mlx5_core driver (with CONFIG_MLX5_FPGA) handles all shell functionality, while other components may handle the various SBU functionalities. The driver opens high-speed reliable communication channels with the shell and the SBU over the internal link. These channels may be used for high-bandwidth configuration or for SBU-specific out-of-band data paths. About Innova IPSec device: -------------------------- Innova IPSec is a network card that allows offloading IPSec cryptography operations from the host CPU to the NIC. It is an Innova card with an IPSec SBU. The hardware keeps the database of IPSec Security Associations (SADB) in the FPGA's DDR memory. Internal +----------+ Link +-----------------+ | +--------------+ FPGA | +------+ | ConnectX | | Shell +--+ QSFP | | +--------------+ +-------+ | | Port | +----------+ Internal I2C | | IPSec | | +------+ | | SBU | | | +-------+ | +--+----------+---+ | | +--+--+ +---+---+ | DDR | | | | | | Flash | |SADB | | | +-----+ +-------+ Modes and ciphers: Currently the following modes and ciphers are supported: IPv4 and IPv6 ESP tunnel and transport modes AES 128 and 256 bit encryption, with GCM authentication (RFC4106) IV is generated using seqiv, in sync with Linux's geniv. More modes and ciphers may be added later. Notes: In the future similar functionality will be included in a single-chip NIC. About the driver: ----------------- Patches 1-4 prepare some existing driver code for the new feature: * Add support for reserved GIDs in the hardware GID table * Allow multiple modules to enable hardware RoCE support independently Patches 5-6 define structs and helper functions for QP work-queues. Patches 7-11 add various FPGA-related features required for Innova. IPSec. Patch 12 adds abstraction layer for Mellanox IPSec-offload capable devices. atches 13-16 add IPSec offload support to the mlx5 netdevice. This driver services the new IPSec offload API introduced in commit d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Configuration Path: If Innova IPSec device is detected, the mlx5e netdevice gets the new NETIF_F_HW_ESP feature and the xdo callbacks, indicating ESP offload capabilities, and also the matching TX checksum and GSO features. The driver configures offloaded Security Associations (SAs) by sending an ADD_SA or DEL_SA message to the IPSec SBU, which updates the SADB in DDR. These messages and their responses are sent over a high-speed channel. Counters for ethtool are retrieved by the driver from the SBU. Data path: On receive path, the SBU decrypts ESP packets which match the offloaded SADB, but keeps them encapsulated. The SBU injects metadata (Mellanox owned ethertype) indicating that crypto-offload has taken place, the SA with which it was done, and the authentication result. The ConnectX chip performs RX checksum offload on the packet, and RSS using the ESP SPI value. The driver detects the special ethertype, and attaches a struct secpath to the RX SKB, including flags to indicate that crypto offload took place, the authentication result, and which xfrm_state was used for decryption, in the olen and ovec members. The RX SKB may have useful CHECKSUM_COMPLETE. A separate patchset will add support for that in the xfrm stack. On transmit path, the stack encapsulates the packet but does not encrypt it, and indicates in the SKB's secpath that crypto offload is to be performed and the SA to use to do so. The driver avoids performing crypto-offload for ESP fragments, and packets with IP options, as the SBU cannot currently do that. For eligible packets, the driver prepends a special ethertype with metadata instructing the hardware to perform crypto offload. The stack builds regular (non-GSO) SKBs so that they contain a placeholder for the ESP trailer. The driver trims it off, because the SBU automatically appends the trailer for offloaded packets. The ConnectX chip performs TX checksum offload on inner UDP or TCP packets, and GSO for TCP packets (duplicating the prepended metadata). The segmented packets then undergo encryption in the SBU before going on the wire. Performance: We measure single stream of TCP on Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz Using AES-NI with ESP GSO we get constant 4.1 Gbps. Using crypto offload we get constant 18 Gbps. Note that these numbers require CHECKSUM_COMPLETE support in XFRM, which we submit separately. - Ilan Tayari ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-28block: provide bio_uninit() free freeing integrity/task associationsJens Axboe1-0/+1
Wen reports significant memory leaks with DIF and O_DIRECT: "With nvme devive + T10 enabled, On a system it has 256GB and started logging /proc/meminfo & /proc/slabinfo for every minute and in an hour it increased by 15968128 kB or ~15+GB.. Approximately 256 MB / minute leaking. /proc/meminfo | grep SUnreclaim... SUnreclaim: 6752128 kB SUnreclaim: 6874880 kB SUnreclaim: 7238080 kB .... SUnreclaim: 22307264 kB SUnreclaim: 22485888 kB SUnreclaim: 22720256 kB When testcases with T10 enabled call into __blkdev_direct_IO_simple, code doesn't free memory allocated by bio_integrity_alloc. The patch fixes the issue. HTX has been run with +60 hours without failure." Since __blkdev_direct_IO_simple() allocates the bio on the stack, it doesn't go through the regular bio free. This means that any ancillary data allocated with the bio through the stack is not freed. Hence, we can leak the integrity data associated with the bio, if the device is using DIF/DIX. Fix this by providing a bio_uninit() and export it, so that we can use it to free this data. Note that this is a minimal fix for this issue. Any current user of bio's that are allocated outside of bio_alloc_bioset() suffers from this issue, most notably some drivers. We will fix those in a more comprehensive patch for 4.13. This also means that the commit marked as being fixed by this isn't the real culprit, it's just the most obvious one out there. Fixes: 542ff7bf18c6 ("block: new direct I/O implementation") Reported-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-27udp: move scratch area helpers into the include filePaolo Abeni1-0/+61
So that they can be later used by the IPv6 code, too. Also lift the comments a bit. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-27net/mlx5e: IPSec, Innova IPSec offload infrastructureIlan Tayari2-4/+18
Add Innova IPSec ESP crypto offload configuration paths. Detect Innova IPSec device and set the NETIF_F_HW_ESP flag. Configure Security Associations using the API introduced in a previous patch. Add Software-parser hardware descriptor layout Software-Parser (swp) is a hardware feature in ConnectX which allows the host software to specify protocol header offsets in the TX path, thus overriding the hardware parser. This is useful for protocols that the ASIC may not be able to parse on its own. Note that due to inline metadata, XDP is not supported in Innova IPSec. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-27net/mlx5: Accel, Add IPSec acceleration interfaceIlan Tayari1-0/+67
Add routines for manipulating the hardware IPSec SA database (SADB). In Innova IPSec, a Security Association (SA) is added or deleted via a command message over the SBU connection. The HW then sends a response message over the same connection. Add implementation for Innova IPSec (FPGA-based) hardware. These routines will be used by the IPSec offload support in a later patch However they may also be used by others such as RDMA and RoCE IPSec. mlx5/accel is a middle acceleration layer to allow mlx5e and other ULPs to work directly with mlx5_core rather than Innova FPGA or other mlx5 acceleration providers. In this patchset we add Innova IPSec support and mlx5/accel delegates IPSec offloads to Innova routines. In the future, when IPSec/TLS or any other acceleration gets integrated into ConnectX chip, mlx5/accel layer will provide the integrated acceleration, rather than the Innova one. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-27net/mlx5: FPGA, Add SBU infrastructureIlan Tayari4-0/+18
Add interface to initialize and interact with Innova FPGA SBU connections. A client driver may use these functions to set up a high-speed DMA connection with its SBU hardware logic, and send/receive messages over this connection. A later patch in this patchset will make use of these functions for Innova IPSec offload in mlx5 Ethernet driver. Add commands to retrieve Innova FPGA SBU capabilities, and to read/write Innova FPGA configuration space registers and memory, over internal I2C. At high level, the FPGA configuration space is divided such: 0x00000000 - 0x007fffff is reserved for the SBU 0x00800000 - 0xffffffff is reserved for the Shell 0x400000000 - ... is DDR memory A later patchset will add support for accessing FPGA CrSpace and memory over a high-speed connection. This is the reason for the ACCESS_TYPE enumeration, which currently only supports I2C. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-27net/mlx5: FPGA, Add SBU bypass and reset flowsIlan Tayari1-0/+9
The Innova FPGA includes shell hardware and Sandbox-Unit (SBU) hardware. The shell hardware is handled by mlx5_core itself, while the SBU is handled by a client driver. Reset the SBU to a well-known initial state when initializing a new device, and set the FPGA to bypass mode when uninitializing a device. This allows the client driver to assume that its device has been reset when a new device is detected. During SBU reset, the FPGA is put into SBU-bypass mode. In this mode packets do not pass through the SBU, so it cannot affect the network data stream at all. A factory-image does not have an SBU, so skip these flows. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-27net/mlx5: FPGA, Add FW commands for FPGA QPsIlan Tayari2-0/+204
The FPGA QP is a high-bandwidth communication channel between the host CPU and the FPGA device. It allows performing DMA operations between host memory and the FPGA logic via the ConnectX chip. Add ConnectX FW commands which create and manipulate FPGA QPs. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-27net/mlx5: Add support for multiple RoCE enableIlan Tayari1-0/+1
Previously, only mlx5_ib enabled RoCE on the port, but FPGA needs it as well. Add support for counting number of enables, so that FPGA and IB can work in parallel and independently. Program the HW to enable RoCE on the first enable call, and program to disable RoCE on the last disable call. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-27net/mlx5: Add reserved-gids supportIlan Tayari1-0/+17
Reserved GIDs are entries in the GID table in use by the mlx5_core and its submodules (e.g. FPGA, SRIOV, E-Swtich, netdev). The entries are reserved at the high indexes of the GID table. A mlx5 submodule may reserve a certain amount of GIDs for its own use during the load sequence by calling mlx5_core_reserve_gids, and must also take care to un-reserve these GIDs when it closes. Reservation is only allowed during the load sequence and before any interfaces (e.g. mlx5_ib or mlx5_en) are up. After reservation, a submodule may call mlx5_core_reserved_gid_alloc/ free to allocate entries from the reserved GIDs pool. Reserve a GID table entry for every supported FPGA QP. A later patch in the patchset will remove them from being reported to IB core. Another such patch will make use of these for FPGA QPs in Innova NIC. Added lib/mlx5.h to serve as a library for mlx5 submodlues, and to expose only public mlx5 API, more mlx5 library files will be added in future submissions. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-27net/mlx5: Cancel delayed recovery work when unloading the driverMohamad Haj Yahia1-0/+1
Draining the health workqueue will ignore future health works including the one that report hardware failure and thus we can't enter error state Instead cancel the recovery flow and make sure only recovery flow won't be scheduled. Fixes: 5e44fca50470 ('net/mlx5: Only cancel recovery work when cleaning up device') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.slave_validateMatthias Schiffer1-1/+2
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.slave_changelinkMatthias Schiffer1-1/+2
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.validateMatthias Schiffer1-1/+2
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.changelinkMatthias Schiffer1-1/+2
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.newlinkMatthias Schiffer1-1/+2
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A few fixes for timekeeping and timers: - Plug a subtle race due to a missing READ_ONCE() in the timekeeping code where reloading of a pointer results in an inconsistent callback argument being supplied to the clocksource->read function. - Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the time keeping core code, to prevent a possible discontuity. - Apply a similar fix to the arm64 vdso clock_gettime() implementation - Add missing includes to clocksource drivers, which relied on indirect includes which fails in certain configs. - Use the proper iomem pointer for read/iounmap in a probe function" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting time: Fix clock->read(clock) race around clocksource changes clocksource: Explicitly include linux/clocksource.h when needed clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
2017-06-25Merge tag 'wireless-drivers-next-for-davem-2017-06-25' of ↵David S. Miller1-87/+0
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.13 New features and bug fixes to quite a few different drivers, but nothing really special standing out. What makes me happy that we have now more vendors actively contributing to upstream drivers. In this pull request we have patches from Broadcom, Intel, Qualcomm, Realtek and Redpine Signals, and I still have patches from Marvell and Quantenna pending in patchwork. Now that's something comparing to how things looked 11 years ago in Jeff Garzik's "State of the Union: Wireless" email: https://lkml.org/lkml/2006/1/5/671 Major changes: wil6210 * add low level RF sector interface via nl80211 vendor commands * add module parameter ftm_mode to load separate firmware for factory testing * support devices with different PCIe bar size * add support for PCIe D3hot in system suspend * remove ioctl interface which should not be in a wireless driver ath10k * go back to using dma_alloc_coherent() for firmware scratch memory * add per chain RSSI reporting brcmfmac * add support multi-scheduled scan * add scheduled scan support for specified BSSIDs * add support for brcm43430 revision 0 wlcore * add wil1285 compatible rsi * add RS9113 USB support iwlwifi * FW API documentation improvements (for tools and htmldoc) * continuing work for the new A000 family * bump the maximum supported FW API to 31 * improve the differentiation between 8000, 9000 and A000 families ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25net: Remove ndo_dfwd_start_xmitMintz, Yuval1-9/+0
Looks like commit f663dd9aaf9e ("net: core: explicitly select a txq before doing l2 forwarding") has removed the need for this dedicated xmit function [it even explicitly states so in its commit log message] but it hasn't removed the definition of the ndo. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> CC: Jason Wang <jasowang@redhat.com> CC: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>