summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2015-03-18inet: request sock should init IPv6/IPv4 addressesEric Dumazet1-0/+5
In order to be able to use sk_ehashfn() for request socks, we need to initialize their IPv6/IPv4 addresses. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18inet: get rid of last __inet_hash_connect() argumentEric Dumazet1-3/+1
We now always call __inet_hash_nolisten(), no need to pass it as an argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18ipv6: get rid of __inet6_hash()Eric Dumazet2-2/+1
We can now use inet_hash() and __inet_hash() instead of private functions. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18inet: add IPv6 support to sk_ehashfn()Eric Dumazet1-0/+19
Intent is to converge IPv4 & IPv6 inet_hash functions to factorize code. IPv4 sockets initialize sk_rcv_saddr and sk_v6_daddr in this patch, thanks to new sk_daddr_set() and sk_rcv_saddr_set() helpers. __inet6_hash can now use sk_ehashfn() instead of a private inet6_sk_ehashfn() and will simply use __inet_hash() in a following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18net: introduce sk_ehashfn() helperEric Dumazet1-0/+2
Goal is to unify IPv4/IPv6 inet_hash handling, and use common helpers for all kind of sockets (full sockets, timewait and request sockets) inet_sk_ehashfn() becomes sk_ehashfn() but still only copes with IPv4 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18netns: constify net_hash_mix() and various callersEric Dumazet3-6/+6
const qualifiers ease code review by making clear which objects are not written in a function. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18net/mlx4_core: Add basic support for QP max-rate limitingOr Gerlitz2-4/+27
Add the low-level device commands and definitions used for QP max-rate limiting. This is done through the following elements: - read rate-limit device caps in QUERY_DEV_CAP: number of different rates and the min/max rates in Kbs/Mbs/Gbs units - enhance the QP context struct to contain rate limit units and value - allow to do run time rate-limit setting to QPs through the update-qp firmware command - QP rate-limiting is disallowed for VFs Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18net: Add max rate tx queue attributeJohn Fastabend1-0/+8
This adds a tx_maxrate attribute to the tx queue sysfs entry allowing for max-rate limiting. Along with DCB-ETS and BQL this provides another knob to tune queue performance. The limit units are Mbps. By default it is disabled. To disable the rate limitation after it has been set for a queue, it should be set to zero. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18rhashtable: Remove max_shift and min_shiftHerbert Xu1-4/+0
Now that nobody uses max_shift and min_shift, we can safely remove them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18rhashtable: Introduce max_size/min_sizeHerbert Xu1-0/+4
This patch adds the parameters max_size and min_size which are meant to replace max_shift and min_shift. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18rhashtable: Remove shift from bucket_tableHerbert Xu1-2/+0
Keeping both size and shift is silly. We only need one. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17inet: fix request sock refcountingEric Dumazet3-6/+11
While testing last patch series, I found req sock refcounting was wrong. We must set skc_refcnt to 1 for all request socks added in hashes, but also on request sockets created by FastOpen or syncookies. It is tricky because we need to defer this initialization so that future RCU lookups do not try to take a refcount on a not yet fully initialized request socket. Also get rid of ireq_refcnt alias. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 13854e5a6046 ("inet: add proper refcounting to request sock") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17tcp: rename struct tcp_request_sock listenerEric Dumazet1-1/+1
The listener field in struct tcp_request_sock is a pointer back to the listener. We now have req->rsk_listener, so TCP only needs one boolean and not a full pointer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17inet: add rsk_listener field to struct request_sockEric Dumazet1-3/+9
Once we'll be able to lookup request sockets in ehash table, we'll need to get access to listener which created this request. This avoid doing a lookup to find the listener, which benefits for a more solid SO_REUSEPORT, and is needed once we no longer queue request sock into a listener private queue. Note that 'struct tcp_request_sock'->listener could be reduced to a single bit, as TFO listener should match req->rsk_listener. TFO will no longer need to hold a reference on the listener. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17inet: uninline inet_reqsk_alloc()Eric Dumazet1-22/+2
inet_reqsk_alloc() is becoming fat and should not be inlined. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17inet: add sk_listener argument to inet_reqsk_alloc()Eric Dumazet1-3/+6
listener socket can be used to set net pointer, and will be later used to hold a reference on listener. Add a const qualifier to first argument (struct request_sock_ops *), and factorize all write_pnet(&ireq->ireq_net, sock_net(sk)); Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17tcp: uninline tcp_oow_rate_limited()Eric Dumazet1-30/+2
tcp_oow_rate_limited() is hardly used in fast path, there is no point inlining it. Signed-of-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17tcp: move tcp_openreq_init() to tcp_input.cEric Dumazet1-25/+0
This big helper is called once from tcp_conn_request(), there is no point having it in an include. Compiler will inline it anyway. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17inet: move ir_mark to fill a holeEric Dumazet1-6/+5
On 64bit arches, we can save 8 bytes in inet_request_sock by moving ir_mark to fill a hole. While we are at it, inet_request_mark() can get a const qualifier for listener socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17bpf: allow BPF programs access 'protocol' and 'vlan_tci' fieldsAlexei Starovoitov1-0/+3
as a follow on to patch 70006af95515 ("bpf: allow eBPF access skb fields") this patch allows 'protocol' and 'vlan_tci' fields to be accessible from extended BPF programs. The usage of 'protocol', 'vlan_present' and 'vlan_tci' fields is the same as corresponding SKF_AD_PROTOCOL, SKF_AD_VLAN_TAG_PRESENT and SKF_AD_VLAN_TAG accesses in classic BPF. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-16inet: add proper refcounting to request sockEric Dumazet3-6/+17
reqsk_put() is the generic function that should be used to release a refcount (and automatically call reqsk_free()) reqsk_free() might be called if refcount is known to be 0 or undefined. refcnt is set to one in inet_csk_reqsk_queue_add() As request socks are not yet in global ehash table, I added temporary debugging checks in reqsk_put() and reqsk_free() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-16net: add sk_fullsock() helperEric Dumazet1-0/+9
We have many places where we want to check if a socket is not a timewait or request socket. Use a helper to avoid hard coding this. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-16netdev: remove ndo ops for switchdevScott Feldman1-38/+0
Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-16switchdev: add swdev opsScott Feldman2-0/+41
As discussed at netconf, introduce swdev_ops as first step to move switchdev ops from ndo to swdev. This will keep switchdev from cluttering up ndo ops space. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15bpf: allow extended BPF programs access skb fieldsAlexei Starovoitov2-1/+14
introduce user accessible mirror of in-kernel 'struct sk_buff': struct __sk_buff { __u32 len; __u32 pkt_type; __u32 mark; __u32 queue_mapping; }; bpf programs can do: int bpf_prog(struct __sk_buff *skb) { __u32 var = skb->pkt_type; which will be compiled to bpf assembler as: dst_reg = *(u32 *)(src_reg + 4) // 4 == offsetof(struct __sk_buff, pkt_type) bpf verifier will check validity of access and will convert it to: dst_reg = *(u8 *)(src_reg + offsetof(struct sk_buff, __pkt_type_offset)) dst_reg &= 7 since skb->pkt_type is a bitfield. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15ebpf: add helper for obtaining current processor idDaniel Borkmann2-0/+2
This patch adds the possibility to obtain raw_smp_processor_id() in eBPF. Currently, this is only possible in classic BPF where commit da2033c28226 ("filter: add SKF_AD_RXHASH and SKF_AD_CPU") has added facilities for this. Perhaps most importantly, this would also allow us to track per CPU statistics with eBPF maps, or to implement a poor-man's per CPU data structure through eBPF maps. Example function proto-type looks like: u32 (*smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15ebpf: add prandom helper for packet samplingDaniel Borkmann2-0/+3
This work is similar to commit 4cd3675ebf74 ("filter: added BPF random opcode") and adds a possibility for packet sampling in eBPF. Currently, this is only possible in classic BPF and useful to combine sampling with f.e. packet sockets, possible also with tc. Example function proto-type looks like: u32 (*prandom_u32)(void) = (void *)BPF_FUNC_get_prandom_u32; Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15rhashtable: Move future_tbl into struct bucket_tableHerbert Xu1-2/+3
This patch moves future_tbl to open up the possibility of having multiple rehashes on the same table. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15rhashtable: Add rehash counter to bucket_tableHerbert Xu1-1/+3
This patch adds a rehash counter to bucket_table to indicate the last bucket that has been rehashed. This serves two purposes: 1. Any bucket that has been rehashed can never gain a new object. 2. If the rehash counter reaches the size of the table, the table will forever remain empty. This patch also downsizes bucket_table->size to an unsigned int since we do not support sizes greater than 32 bits yet. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15rhashtable: Free bucket tables asynchronously after rehashHerbert Xu1-0/+2
There is in fact no need to wait for an RCU grace period in the rehash function, since all insertions are guaranteed to go into the new table through spin locks. This patch uses call_rcu to free the old/rehashed table at our leisure. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15rhashtable: Fix walker behaviour during rehashHerbert Xu1-4/+4
Previously whenever the walker encountered a resize it simply snaps back to the beginning and starts again. However, this only works if the rehash started and completed while the walker was idle. If the walker attempts to restart while the rehash is still ongoing, we may miss objects that we shouldn't have. This patch fixes this by making the walker walk the old table followed by the new table just like all other readers. If a rehash is detected we will still signal our caller of the fact so they can prepare for duplicates but we will simply continue the walk onto the new table after the old one is finished either by us or by the rehasher. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-14Merge branch 'for-upstream' of ↵David S. Miller5-24/+64
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== Here's another set of Bluetooth & ieee802154 patches intended for 4.1: - Added support for QCA ROME chipset family in the btusb driver - at86rf230 driver fixes & cleanups - ieee802154 cleanups - Refactoring of Bluetooth mgmt API to allow new users - New setting for static Bluetooth address exposed to user space - Refactoring of hci_dev flags to remove limit of 32 - Remove unnecessary fast-connectable setting usage restrictions - Fix behavior to be consistent when trying to pair already paired device - Service discovery corner-case fixes Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-14mac802154: correct max sifs size handlingAlexander Aring1-0/+2
This patch fix the max sifs size correction when the IEEE802154_HW_TX_OMIT_CKSUM flag is set. With this flag the sk_buff doesn't contain the CRC, because the transceiver will add the CRC while transmit. Also add some defines for the max sifs frame size value and frame check sequence according to 802.15.4 standard. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-03-13Bluetooth: Merge hdev->dbg_flags fields into hdev->dev_flagsMarcel Holtmann2-10/+5
With the extension of hdev->dev_flags utilizing a bitmap now, the space is no longer restricted. Merge the hdev->dbg_flags into hdev->dev_flags to save space on 64-bit architectures. On 32-bit architectures no size reduction happens. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13Bluetooth: Use DECLARE_BITMAP for hdev->dev_flags fieldMarcel Holtmann2-15/+16
The hdev->dev_flags field has outgrown itself on 32-bit systems. So instead of hacking around it, switch to using DECLARE_BITMAP. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13Bluetooth: Introduce hci_dev_test_and_set_flag helper macroMarcel Holtmann1-0/+1
Instead of manually coding test_and_set_bit on hdev->dev_flags all the time, use hci_dev_test_and_set_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13Bluetooth: Introduce hci_dev_test_and_clear_flag helper macroMarcel Holtmann1-0/+1
Instead of manually coding test_and_clear_bit on hdev->dev_flags all the time, use hci_dev_test_and_clear_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13Bluetooth: Introduce hci_dev_test_and_change_flag helper macroMarcel Holtmann1-0/+2
Instead of manually coding test_and_change_bit on hdev->dev_flags all the time, use hci_dev_test_and_change_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13Bluetooth: Introduce hci_dev_change_flag helper macroMarcel Holtmann1-0/+1
Instead of manually coding change_bit on hdev->dev_flags all the time, use hci_dev_change_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13Bluetooth: Introduce hci_dev_clear_flag helper macroMarcel Holtmann1-0/+1
Instead of manually coding clear_bit on hdev->dev_flags all the time, use hci_dev_clear_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13Bluetooth: Introduce hci_dev_set_flag helper macroMarcel Holtmann1-0/+1
Instead of manually coding set_bit on hdev->dev_flags all the time, use hci_dev_set_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13Bluetooth: Introduce hci_dev_test_flag helper macroMarcel Holtmann1-6/+8
Instead of manually coding test_bit on hdev->dev_flags all the time, use hci_dev_test_flag helper macro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13Bluetooth: Add support connectable advertising settingMarcel Holtmann1-0/+1
The patch adds a second advertising setting that allows switching of the controller into connectable mode independent of the global connectable setting. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-03-13tcp_metrics: Use a single hash table for all network namespaces.Eric W. Biederman1-2/+0
Now that all of the operations are safe on a single hash table accross network namespaces, allocate a single global hash table and update the code to use it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12rhashtable: kill ht->shift atomic operationsDaniel Borkmann1-3/+3
Commit c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker queue") changed ht->shift to be atomic, which is actually unnecessary. Instead of leaving the current shift in the core rhashtable structure, it can be cached inside the individual bucket tables. There, it will only be initialized once during a new table allocation in the shrink/expansion slow path, and from then onward it stays immutable for the rest of the bucket table liftime. That allows shift to be non-atomic. The patch also moves hash_rnd management into the table setup. The rhashtable structure now consumes 3 instead of 4 cachelines. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Ying Xue <ying.xue@windriver.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12inet: introduce ireq_familyEric Dumazet1-0/+1
Before inserting request socks into general hash table, fill their socket family. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12inet: prepare sock_edemux() & sock_gen_put() for new SYN_RECV stateEric Dumazet2-1/+6
sock_edemux() & sock_gen_put() should be ready to cope with request socks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12inet: add rsk_refcnt/ireq_refcnt to request socksEric Dumazet2-0/+9
When request socks will be in ehash, they'll need to be refcounted. This patch adds rsk_refcnt/ireq_refcnt macros, and adds reqsk_put() function, but nothing yet use them. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12inet: add ireq_state field to inet_request_sockEric Dumazet1-0/+3
We need to identify request sock when they'll be visible in global ehash table. ireq_state is an alias to req.__req_common.skc_state. Its value is set to TCP_NEW_SYN_RECV Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12inet: add TCP_NEW_SYN_RECV stateEric Dumazet1-1/+3
TCP_SYN_RECV state is currently used by fast open sockets. Initial TCP requests (the pseudo sockets created when a SYN is received) are not yet associated to a state. They are attached to their parent, and the parent is in TCP_LISTEN state. This commit adds TCP_NEW_SYN_RECV state, so that we can convert TCP stack to a different schem gradually. This state is not exported to user space. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>