summaryrefslogtreecommitdiffstats
path: root/include/net
AgeCommit message (Collapse)AuthorFilesLines
2008-07-21netfilter: accounting rework: ct_extend + 64bit counters (v4)Krzysztof Piotr Oledzki3-6/+53
Initially netfilter has had 64bit counters for conntrack-based accounting, but it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are still required, for example for "connbytes" extension. However, 64bit counters waste a lot of memory and it was not possible to enable/disable it runtime. This patch: - reimplements accounting with respect to the extension infrastructure, - makes one global version of seq_print_acct() instead of two seq_print_counters(), - makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n), - makes it possible to enable/disable it at runtime by sysctl or sysfs, - extends counters from 32bit to 64bit, - renames ip_conntrack_counter -> nf_conn_counter, - enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT), - set initial accounting enable state based on CONFIG_NF_CT_ACCT - removes buggy IPCT_COUNTER_FILLING event handling. If accounting is enabled newly created connections get additional acct extend. Old connections are not changed as it is not possible to add a ct_extend area to confirmed conntrack. Accounting is performed for all connections with acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct". Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21netlink: add NLA_PUT_BE64 macroKrzysztof Piotr Oledzki1-0/+3
Add NLA_PUT_BE64 macro required for 64bit counters in netfilter Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20pkt_sched: Fix build with NET_SCHED disabled.David S. Miller1-0/+2
The stab bits can't be referenced uniless the full packet scheduler layer is enabled. Reported by Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20net_sched: Add size table for qdiscsJussi Kivilinna2-1/+25
Add size table functions for qdiscs and calculate packet size in qdisc_enqueue(). Based on patch by Patrick McHardy http://marc.info/?l=linux-netdev&m=115201979221729&w=2 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20net_sched: Add accessor function for packet length for qdiscsJussi Kivilinna1-6/+11
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20net_sched: Add qdisc_enqueue wrapperJussi Kivilinna1-0/+10
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19net: Use standard structures for generic socket address structures.YOSHIFUJI Hideaki1-1/+1
Use sockaddr_storage{} for generic socket address storage and ensures proper alignment. Use sockaddr{} for pointers to omit several casts. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19Merge branch 'master' of ↵David S. Miller4-10/+80
git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6
2008-07-19ipv6: remove unused macros from net/ipv6.hDenis V. Lunev1-6/+0
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19ipv6: remove unused parameter from ip6_ra_controlDenis V. Lunev1-3/+1
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19tcp: options clean upAdam Langley1-0/+2
This should fix the following bugs: * Connections with MD5 signatures produce invalid packets whenever SACK options are included * MD5 signatures are counted twice in the MSS calculations Behaviour changes: * A SYN with MD5 + SACK + TS elicits a SYNACK with MD5 + SACK This is because we can't fit any SACK blocks in a packet with MD5 + TS options. There was discussion about disabling SACK rather than TS in order to fit in better with old, buggy kernels, but that was deemed to be unnecessary. * SYNs with MD5 don't include a TS option See above. Additionally, it removes a bunch of duplicated logic for calculating options, which should help avoid these sort of issues in the future. Signed-off-by: Adam Langley <agl@imperialviolet.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19tcp: Fix MD5 signatures for non-linear skbsAdam Langley1-17/+12
Currently, the MD5 code assumes that the SKBs are linear and, in the case that they aren't, happily goes off and hashes off the end of the SKB and into random memory. Reported by Stephen Hemminger in [1]. Advice thanks to Stephen and Evgeniy Polyakov. Also includes a couple of missed route_caps from Stephen's patch in [2]. [1] http://marc.info/?l=linux-netdev&m=121445989106145&w=2 [2] http://marc.info/?l=linux-netdev&m=121459157816964&w=2 Signed-off-by: Adam Langley <agl@imperialviolet.org> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18sctp: remove unnecessary byteshifting, calculate directly in big-endianHarvey Harrison1-9/+14
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18sctp: Support ipv6only AF_INET6 sockets.Vlad Yasevich1-0/+2
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18tcp: RTT metrics scalingStephen Hemminger1-0/+12
Some of the metrics (RTT, RTTVAR and RTAX_RTO_MIN) are stored in kernel units (jiffies) and this leaks out through the netlink API to user space where the units for jiffies are unknown. This patches changes the kernel to convert to/from milliseconds. This changes the ABI, but milliseconds seemed like the most natural unit for these parameters. Values available via syscall in /proc/net/rt_cache and netlink will be in milliseconds. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18pkt_sched: Get rid of u32_list.David S. Miller1-0/+2
The u32_list is just an indirect way of maintaining a reference to a U32 node on a per-qdisc basis. Just add an explicit node pointer for u32 to struct Qdisc an do away with this global list. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put icmpmsg statistics on struct netPavel Emelyanov2-3/+3
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put icmp statistics on struct netPavel Emelyanov2-3/+3
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put udplite statistics on struct netPavel Emelyanov2-3/+3
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put udp statistics on struct netPavel Emelyanov2-5/+5
Similar to... ouch, I repeat myself. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put net statistics on struct netPavel Emelyanov2-6/+6
Similar to ip and tcp ones :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put ip statistics on struct netPavel Emelyanov2-4/+4
Similar to tcp one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put tcp statistics on struct netPavel Emelyanov2-5/+5
Proc temporary uses stats from init_net. BTW, TCP_XXX_STATS are beautiful (w/o do { } while (0) facing) again :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: add netns/mib.h filePavel Emelyanov2-0/+11
The only structure declared within is the netns_mib, which will carry all our mibs within. I didn't put the mibs in the existing netns_xxx structures to make it possible to mark this one as properly aligned and get in a separate "read-mostly" cache-line. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17pkt_sched: Don't used locked skb_queue_purge() in __qdisc_reset_queue()David S. Miller1-1/+1
We have to have exclusive access to the given qdisc anyways, so doing even more locking is superfluous. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17pkt_sched: Kill netdev_queue lock.David S. Miller1-1/+6
We can simply use the qdisc->q.lock for all of the qdisc tree synchronization. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17pkt_sched: Kill qdisc_lock_tree and qdisc_unlock_tree.David S. Miller1-3/+0
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17pkt_sched: Rework {sch,tbf}_tree_lock().David S. Miller1-4/+12
Make sch_tree_lock() lock the qdisc's root. All of the users hold the RTNL semaphore and the root qdisc is not changing. Implement tbf_tree_{lock,unlock}() simply in terms of sch_tree_{lock,unlock}(). Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17pkt_sched: Schedule qdiscs instead of netdev_queue.David S. Miller2-7/+6
When we have shared qdiscs, packets come out of the qdiscs for multiple transmit queues. Therefore it doesn't make any sense to schedule the transmit queue when logically we cannot know ahead of time the TX queue of the SKB that the qdisc->dequeue() will give us. Just for sanity I added a BUG check to make sure we never get into a state where the noop_qdisc is scheduled. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17pkt_sched: Add and use qdisc_root() and qdisc_root_lock().David S. Miller1-0/+12
When code wants to lock the qdisc tree state, the logic operation it's doing is locking the top-level qdisc that sits of the root of the netdev_queue. Add qdisc_root_lock() to represent this and convert the easiest cases. In order for this to work out in all cases, we have to hook up the noop_qdisc to a dummy netdev_queue. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17pkt_sched: Make QDISC_RUNNING a qdisc state.David S. Miller2-1/+9
Currently it is associated with a netdev_queue, but when we have qdisc sharing that no longer makes any sense. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17pkt_sched: Move gso_skb into Qdisc.David S. Miller1-0/+1
We liberate any dangling gso_skb during qdisc destruction. It really only matters for the root qdisc. But when qdiscs can be shared by multiple netdev_queue objects, we can't have the gso_skb in the netdev_queue any more. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17mac80211: Reimplement WME using ->select_queue().David S. Miller1-8/+0
The only behavior change is that we do not drop packets under any circumstances. If that is absolutely needed, we could easily add it back. With cleanups and help from Johannes Berg. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17net: Use queue aware tests throughout.David S. Miller1-3/+1
This effectively "flips the switch" by making the core networking and multiqueue-aware drivers use the new TX multiqueue structures. Non-multiqueue drivers need no changes. The interfaces they use such as netif_stop_queue() degenerate into an operation on TX queue zero. So everything "just works" for them. Code that really wants to do "X" to all TX queues now invokes a routine that does so, such as netif_tx_wake_all_queues(), netif_tx_stop_all_queues(), etc. pktgen and netpoll required a little bit more surgery than the others. In particular the pktgen changes, whilst functional, could be largely improved. The initial check in pktgen_xmit() will sometimes check the wrong queue, which is mostly harmless. The thing to do is probably to invoke fill_packet() earlier. The bulk of the netpoll changes is to make the code operate solely on the TX queue indicated by by the SKB queue mapping. Setting of the SKB queue mapping is entirely confined inside of net/core/dev.c:dev_pick_tx(). If we end up needing any kind of special semantics (drops, for example) it will be implemented here. Finally, we now have a "real_num_tx_queues" which is where the driver indicates how many TX queues are actually active. With IGB changes from Jeff Kirsher. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17netdev: Allocate multiple queues for TX.David S. Miller1-11/+26
alloc_netdev_mq() now allocates an array of netdev_queue structures for TX, based upon the queue_count argument. Furthermore, all accesses to the TX queues are now vectored through the netdev_get_tx_queue() and netdev_for_each_tx_queue() interfaces. This makes it easy to grep the tree for all things that want to get to a TX queue of a net device. Problem spots which are not really multiqueue aware yet, and only work with one queue, can easily be spotted by grepping for all netdev_get_tx_queue() calls that pass in a zero index. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16core: add stat to track unresolved discards in neighbor cacheNeil Horman1-1/+3
in __neigh_event_send, if we have a neighbour entry which is in NUD_INCOMPLETE state, we enqueue any outbound frames to that neighbour to the neighbours arp_queue, which is default capped to a length of 3 skbs. If that queue exceeds its set length, it will drop an skb on the queue to enqueue the newly arrived skb. This results in a drop for which we have no statistics incremented. This patch adds an unresolved_discards stat to /proc/net/stat/ndisc_cache to track these lost frames. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to NET_ADD_STATS_USERPavel Emelyanov1-1/+1
Done with NET_XXX_STATS macros :) To be continued... Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to NET_ADD_STATS_BHPavel Emelyanov1-1/+1
This one is tricky. The thing is that this macro is only used when killing tw buckets, but since this killer is promiscuous wrt to which net each particular tw belongs to, I have to use it only when NET_NS is off. When the net namespaces are on, I use the INET_INC_STATS_BH for each bucket. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to NET_INC_STATS_USERPavel Emelyanov1-1/+1
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to NET_INC_STATS_BHPavel Emelyanov2-2/+2
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to NET_INC_STATSPavel Emelyanov1-1/+1
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16sock: add net to prot->enter_memory_pressure callbackPavel Emelyanov2-3/+3
The tcp_enter_memory_pressure calls NET_INC_STATS, but doesn't have where to get the net from. I decided to add a sk argument, not the net itself, only to factor all the required sock_net(sk) calls inside the enter_memory_pressure callback itself. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to TCP_ADD_STATS_USERPavel Emelyanov1-5/+5
Now we're done with the TCP_XXX_STATS macros. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to TCP_DEC_STATSPavel Emelyanov1-1/+1
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to TCP_INC_STATS_BHPavel Emelyanov1-1/+1
Same as before - the sock is always there to get the net from, but there are also some places with the net already saved on the stack. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to TCP_INC_STATSPavel Emelyanov1-1/+1
Fortunately (almost) all the TCP code has a sock to get the net from :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16tcp: add net to tcp_mib_initPavel Emelyanov1-1/+1
This one sets TCP MIBs after zeroing them, and thus requires the net. The existing single caller can use init_net (temporarily). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: drop unused TCP_XXX_STATS macrosPavel Emelyanov1-2/+0
TCP_INC_STATS_USER and TCP_ADD_STATS_BH are currently unused. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to IP_ADD_STATS_BHPavel Emelyanov1-1/+1
Very simple - only ip_evictor (fragments) requires such. This patch ends up the IP_XXX_STATS patching. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to IP_INC_STATS_BHPavel Emelyanov1-1/+1
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>