summaryrefslogtreecommitdiffstats
path: root/net/openvswitch/datapath.c
AgeCommit message (Collapse)AuthorFilesLines
2013-11-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-31/+28
Pull networking fixes from David Miller: "Mostly these are fixes for fallout due to merge window changes, as well as cures for problems that have been with us for a much longer period of time" 1) Johannes Berg noticed two major deficiencies in our genetlink registration. Some genetlink protocols we passing in constant counts for their ops array rather than something like ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were using fixed IDs for their multicast groups. We have to retain these fixed IDs to keep existing userland tools working, but reserve them so that other multicast groups used by other protocols can not possibly conflict. In dealing with these two problems, we actually now use less state management for genetlink operations and multicast groups. 2) When configuring interface hardware timestamping, fix several drivers that simply do not validate that the hwtstamp_config value is one the driver actually supports. From Ben Hutchings. 3) Invalid memory references in mwifiex driver, from Amitkumar Karwar. 4) In dev_forward_skb(), set the skb->protocol in the right order relative to skb_scrub_packet(). From Alexei Starovoitov. 5) Bridge erroneously fails to use the proper wrapper functions to make calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki Makita. 6) When detaching a bridge port, make sure to flush all VLAN IDs to prevent them from leaking, also from Toshiaki Makita. 7) Put in a compromise for TCP Small Queues so that deep queued devices that delay TX reclaim non-trivially don't have such a performance decrease. One particularly problematic area is 802.11 AMPDU in wireless. From Eric Dumazet. 8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts here. Fix from Eric Dumzaet, reported by Dave Jones. 9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn. 10) When computing mergeable buffer sizes, virtio-net fails to take the virtio-net header into account. From Michael Dalton. 11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic bumping, this one has been with us for a while. From Eric Dumazet. 12) Fix NULL deref in the new TIPC fragmentation handling, from Erik Hugne. 13) 6lowpan bit used for traffic classification was wrong, from Jukka Rissanen. 14) macvlan has the same issue as normal vlans did wrt. propagating LRO disabling down to the real device, fix it the same way. From Michal Kubecek. 15) CPSW driver needs to soft reset all slaves during suspend, from Daniel Mack. 16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet. 17) The xen-netfront RX buffer refill timer isn't properly scheduled on partial RX allocation success, from Ma JieYue. 18) When ipv6 ping protocol support was added, the AF_INET6 protocol initialization cleanup path on failure was borked a little. Fix from Vlad Yasevich. 19) If a socket disconnects during a read/recvmsg/recvfrom/etc that blocks we can do the wrong thing with the msg_name we write back to userspace. From Hannes Frederic Sowa. There is another fix in the works from Hannes which will prevent future problems of this nature. 20) Fix route leak in VTI tunnel transmit, from Fan Du. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits) genetlink: make multicast groups const, prevent abuse genetlink: pass family to functions using groups genetlink: add and use genl_set_err() genetlink: remove family pointer from genl_multicast_group genetlink: remove genl_unregister_mc_group() hsr: don't call genl_unregister_mc_group() quota/genetlink: use proper genetlink multicast APIs drop_monitor/genetlink: use proper genetlink multicast APIs genetlink: only pass array to genl_register_family_with_ops() tcp: don't update snd_nxt, when a socket is switched from repair mode atm: idt77252: fix dev refcnt leak xfrm: Release dst if this dst is improper for vti tunnel netlink: fix documentation typo in netlink_set_err() be2net: Delete secondary unicast MAC addresses during be_close be2net: Fix unconditional enabling of Rx interface options net, virtio_net: replace the magic value ping: prevent NULL pointer dereference on write to msg_name bnx2x: Prevent "timeout waiting for state X" bnx2x: prevent CFC attention bnx2x: Prevent panic during DMAE timeout ...
2013-11-19genetlink: make multicast groups const, prevent abuseJohannes Berg1-28/+15
Register generic netlink multicast groups as an array with the family and give them contiguous group IDs. Then instead of passing the global group ID to the various functions that send messages, pass the ID relative to the family - for most families that's just 0 because the only have one group. This avoids the list_head and ID in each group, adding a new field for the mcast group ID offset to the family. At the same time, this allows us to prevent abusing groups again like the quota and dropmon code did, since we can now check that a family only uses a group it owns. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-19genetlink: pass family to functions using groupsJohannes Berg1-13/+22
This doesn't really change anything, but prepares for the next patch that will change the APIs to pass the group ID within the family, rather than the global group ID. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-19genetlink: add and use genl_set_err()Johannes Berg1-4/+4
Add a static inline to generic netlink to wrap netlink_set_err() to make it easier to use here - use it in openvswitch (the only generic netlink user of netlink_set_err()). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-19genetlink: only pass array to genl_register_family_with_ops()Johannes Berg1-2/+3
As suggested by David Miller, make genl_register_family_with_ops() a macro and pass only the array, evaluating ARRAY_SIZE() in the macro, this is a little safer. The openvswitch has some indirection, assing ops/n_ops directly in that code. This might ultimately just assign the pointers in the family initializations, saving the struct genl_family_and_ops and code (once mcast groups are handled differently.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-14genetlink: make all genl_ops users constJohannes Berg1-5/+5
Now that genl_ops are no longer modified in place when registering, they can be made const. This patch was done mostly with spatch: @@ identifier ops; @@ +const struct genl_ops ops[] = { ... }; (except the struct thing in net/openvswitch/datapath.c) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-14Merge branch 'core-locking-for-linus' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking changes from Ingo Molnar: "The biggest changes: - add lockdep support for seqcount/seqlocks structures, this unearthed both bugs and required extra annotation. - move the various kernel locking primitives to the new kernel/locking/ directory" * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) block: Use u64_stats_init() to initialize seqcounts locking/lockdep: Mark __lockdep_count_forward_deps() as static lockdep/proc: Fix lock-time avg computation locking/doc: Update references to kernel/mutex.c ipv6: Fix possible ipv6 seqlock deadlock cpuset: Fix potential deadlock w/ set_mems_allowed seqcount: Add lockdep functionality to seqcount/seqlock structures net: Explicitly initialize u64_stats_sync structures for lockdep locking: Move the percpu-rwsem code to kernel/locking/ locking: Move the lglocks code to kernel/locking/ locking: Move the rwsem code to kernel/locking/ locking: Move the rtmutex code to kernel/locking/ locking: Move the semaphore core to kernel/locking/ locking: Move the spinlock code to kernel/locking/ locking: Move the lockdep code to kernel/locking/ locking: Move the mutex code to kernel/locking/ hung_task debugging: Add tracepoint to report the hang x86/locking/kconfig: Update paravirt spinlock Kconfig description lockstat: Report avg wait and hold times lockdep, x86/alternatives: Drop ancient lockdep fixup message ...
2013-11-06net: Explicitly initialize u64_stats_sync structures for lockdepJohn Stultz1-0/+6
In order to enable lockdep on seqcount/seqlock structures, we must explicitly initialize any locks. The u64_stats_sync structure, uses a seqcount, and thus we need to introduce a u64_stats_init() function and use it to initialize the structure. This unfortunately adds a lot of fairly trivial initialization code to a number of drivers. But the benefit of ensuring correctness makes this worth while. Because these changes are required for lockdep to be enabled, and the changes are quite trivial, I've not yet split this patch out into 30-some separate patches, as I figured it would be better to get the various maintainers thoughts on how to best merge this change along with the seqcount lockdep enablement. Feedback would be appreciated! Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: James Morris <jmorris@namei.org> Cc: Jesse Gross <jesse@nicira.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mirko Lindner <mlindner@marvell.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Roger Luethi <rl@hellgate.ch> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Simon Horman <horms@verge.net.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Wensong Zhang <wensong@linux-vs.org> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-01openvswitch: Widen TCP flags handling.Jarno Rajahalme1-1/+1
Widen TCP flags handling from 7 bits (uint8_t) to 12 bits (uint16_t). The kernel interface remains at 8 bits, which makes no functional difference now, as none of the higher bits is currently of interest to the userspace. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-10-22openvswitch: collect mega flow mask statsAndy Zhou1-7/+31
Collect mega flow mask stats. ovs-dpctl show command can be used to display them for debugging and performance tuning. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-10-04openvswitch: Simplify mega-flow APIs.Pravin B Shelar1-20/+7
Hides mega-flow implementation in flow_table.c rather than datapath.c. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-10-04openvswitch: Move mega-flow list out of rehashing struct.Pravin B Shelar1-60/+17
ovs-flow rehash does not touch mega flow list. Following patch moves it dp struct datapath. Avoid one extra indirection for accessing mega-flow list head on every packet receive. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-10-03openvswitch: Restructure datapath.c and flow.cPravin B Shelar1-498/+30
Over the time datapath.c and flow.c has became pretty large files. Following patch restructures functionality of component into three different components: flow.c: contains flow extract. flow_netlink.c: netlink flow api. flow_table.c: flow table api. This patch restructures code without changing logic. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-09-17openvswitch: Move flow table rehashing to flow install.Pravin B Shelar1-39/+11
Rehashing in ovs-workqueue can cause ovs-mutex lock contentions in case of heavy flow setups where both needs ovs-mutex. So by moving rehashing to flow-setup we can eliminate contention. This also simplify ovs locking and reduces dependence on workqueue. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-08-26openvswitch: Add SCTP supportJoe Stringer1-0/+6
This patch adds support for rewriting SCTP src,dst ports similar to the functionality already available for TCP/UDP. Rewriting SCTP ports is expensive due to double-recalculation of the SCTP checksums; this is performed to ensure that packets traversing OVS with invalid checksums will continue to the destination with any checksum corruption intact. Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-08-23openvswitch: Mega flow implementationAndy Zhou1-40/+100
Add wildcarded flow support in kernel datapath. Wildcarded flow can improve OVS flow set up performance by avoid sending matching new flows to the user space program. The exact performance boost will largely dependent on wildcarded flow hit rate. In case all new flows hits wildcard flows, the flow set up rate is within 5% of that of linux bridge module. Pravin has made significant contributions to this patch. Including API clean ups and bug fixes. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-08-23openvswitch: Use RCU lock for dp dump operation.Pravin B Shelar1-6/+7
RCUfy dp-dump operation which is already read-only. This makes all ovs dump operations lockless. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-08-23openvswitch: Use RCU lock for flow dump operation.Pravin B Shelar1-8/+9
Flow dump operation is read-only operation. There is no need to take ovs-lock. Following patch use rcu-lock for dumping flows. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-08-14openvswitch: Fix bad merge resolution.Jesse Gross1-3/+0
git silently included an extra hunk in vport_cmd_set() during automatic merging. This code is unreachable so it does not actually introduce a problem but it is clearly incorrect. Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-06-19openvswitch: Optimize flow key match for non tunnel flows.Pravin B Shelar1-5/+2
Following patch adds start offset for sw_flow-key, so that we can skip tunneling information in key for non-tunnel flows. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19openvswitch: Add tunneling interface.Pravin B Shelar1-3/+75
Add ovs tunnel interface for set tunnel action for userspace. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19openvswitch: Copy individual actions.Pravin B Shelar1-51/+220
Rather than validating actions and then copying all actiaons in one block, following patch does same operation in single pass. This validate and copy action one by one. This is required for ovs tunneling patch. This patch does not change any functionality. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-14openvswitch: Simplify interface ovs_flow_metadata_from_nlattrs()Pravin B Shelar1-4/+1
This is not functional change, this is just code cleanup. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-06-14openvswitch: Immediately exit on error in ovs_vport_cmd_set().Jesse Gross1-5/+7
It is an error to try to change the type of a vport using the set command. However, while we check that this is an error, we still proceed to allocate memory which then gets freed immediately. This stops processing after noticing the error, which does not actually fix a bug but is more correct. Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-04-30openvswitch: Remove unneeded ovs_netdev_get_ifindex()Thomas Graf1-1/+2
The only user is get_dpifindex(), no need to redirect via the port operations. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-25openvswitch: Use parallel_ops genl.Pravin B Shelar1-4/+8
OVS locking was recently changed to have private OVS lock which simplified overall locking. Therefore there is no need to have another global genl lock to protect OVS data structures. Following patch uses of parallel_ops genl family for OVS. This also allows more granual OVS locking using ovs_mutex for protecting OVS data structures, which gives more concurrencey. E.g multiple genl operations OVS_PACKET_CMD_EXECUTE can run in parallel, etc. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-12/+18
Conflicts: drivers/net/ethernet/emulex/benet/be_main.c drivers/net/ethernet/intel/igb/igb_main.c drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c include/net/scm.h net/batman-adv/routing.c net/ipv4/tcp_input.c The e{uid,gid} --> {uid,gid} credentials fix conflicted with the cleanup in net-next to now pass cred structs around. The be2net driver had a bug fix in 'net' that overlapped with the VLAN interface changes by Patrick McHardy in net-next. An IGB conflict existed because in 'net' the build_skb() support was reverted, and in 'net-next' there was a comment style fix within that code. Several batman-adv conflicts were resolved by making sure that all calls to batadv_is_my_mac() are changed to have a new bat_priv first argument. Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO rewrite in 'net-next', mostly overlapping changes. Thanks to Stephen Rothwell and Antonio Quartulli for help with several of these merge resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19net: vlan: add protocol argument to packet tagging functionsPatrick McHardy1-1/+1
Add a protocol argument to the VLAN packet tagging functions. In case of HW tagging, we need that protocol available in the ndo_start_xmit functions, so it is stored in a new field in the skb. The new field fits into a hole (on 64 bit) and doesn't increase the sks's size. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-15openvswitch: Simplify datapath locking.Pravin B Shelar1-104/+170
Currently OVS uses combination of genl and rtnl lock to protect datapath state. This was done due to networking stack locking. But this has complicated locking and there are few lock ordering issues with new tunneling protocols. Following patch simplifies locking by introducing new ovs mutex and now this lock is used to protect entire ovs state. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-03-29openvswitch: Move common genl notify code into ovs_notify()Thomas Graf1-20/+16
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-03-29openvswitch: Refine Netlink message size calculation and kill FLOW_BUFSIZEThomas Graf1-24/+52
Kills the FLOW_BUFSIZE constant which needs to be calculated manually and replaces it with key_attr_size() based on nla_total_size(). Calculates the size of datapath messages instead of relying on NLMSG_DEFAULT_SIZE and moves the existing message size calculations into own functions for clarity. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-03-29openvswitch: Use nla_memcpy() to memcpy() data from attributesThomas Graf1-1/+1
Less error prone as it takes into account the length of both the destination buffer and the source attribute and documents when data is copied from an attribute. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-03-29openvswitch: Specify the minimal length of OVS_PACKET_ATTR_PACKET in the policyThomas Graf1-3/+2
Specifying the minimal length in the policy makes it reuseable and documents the interface. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-03-28net: add ETH_P_802_3_MINSimon Horman1-1/+1
Add a new constant ETH_P_802_3_MIN, the minimum ethernet type for an 802.3 frame. Frames with a lower value in the ethernet type field are Ethernet II. Also update all the users of this value that David Miller and I could find to use the new constant. Also correct a bug in util.c. The comparison with ETH_P_802_3_MIN should be >= not >. As suggested by Jesse Gross. Compile tested only. Cc: David Miller <davem@davemloft.net> Cc: Jesse Gross <jesse@nicira.com> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: John W. Linville <linville@tuxdriver.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Bart De Schuymer <bart.de.schuymer@pandora.be> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: linux-bluetooth@vger.kernel.org Cc: netfilter-devel@vger.kernel.org Cc: bridge@lists.linux-foundation.org Cc: linux-wireless@vger.kernel.org Cc: linux1394-devel@lists.sourceforge.net Cc: linux-media@vger.kernel.org Cc: netdev@vger.kernel.org Cc: dev@openvswitch.org Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-27openvswitch: Preallocate reply skb in ovs_vport_cmd_set().Jesse Gross1-12/+18
Allocation of the Netlink notification skb can potentially fail after changing vport configuration. In general, we try to avoid this by undoing any change we made but that is difficult for existing objects. This avoids the problem by preallocating the buffer (which is fixed size). Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-03-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-0/+3
Pull in the 'net' tree to get Daniel Borkmann's flow dissector infrastructure change. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17Merge branch 'master' of ↵David S. Miller1-6/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Conflicts: net/openvswitch/vport-internal_dev.c Jesse Gross says: ==================== A couple of minor enhancements for net-next/3.10. The largest is an extension to allow variable length metadata to be passed to userspace with packets. There is a merge conflict in net/openvswitch/vport-internal_dev.c: A existing commit modifies internal_dev_mac_addr() and a new commit deletes it. The new one is correct, so you can just remove that function. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-15Merge branch 'fixes' of ↵David S. Miller1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Jesse Gross says: ==================== A few different bug fixes, including several for issues with userspace communication that have gone unnoticed up until now. These are intended for net/3.9. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-27hlist: drop the node parameter from iteratorsSasha Levin1-6/+4
I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-22openvswitch: Call genlmsg_end in queue_userspace_packetRich Lane1-0/+1
Without genlmsg_end the upcall message ends (according to nlmsg_len) after the struct ovs_header. Signed-off-by: Rich Lane <rlane@bigswitch.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-22openvswitch: Fix ovs_vport_cmd_new return value on successRich Lane1-0/+1
If the pointer does not represent an error then the PTR_ERR macro may still return a nonzero value. Signed-off-by: Rich Lane <rlane@bigswitch.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-22openvswitch: Fix ovs_vport_cmd_del return value on successRich Lane1-0/+1
If the pointer does not represent an error then the PTR_ERR macro may still return a nonzero value. The fix is the same as in ovs_vport_cmd_set. Signed-off-by: Rich Lane <rlane@bigswitch.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-22openvswitch: Allow OVS_USERSPACE_ATTR_USERDATA to be variable length.Ben Pfaff1-5/+6
Until now, the optional OVS_USERSPACE_ATTR_USERDATA attribute had to be exactly 64 bits long, if it was present. However, 64 bits is not enough space to associate as much information with a flow as would be convenient for some userspace features now under development. This commit generalizes the attribute, allowing it to be any length. This generalization is backward-compatible: if userspace only uses 64-bit attributes, then it will not see any change in behavior. CC: Romain Lenglet <rlenglet@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-06net: adjust skb_gso_segment() for calling in rx pathCong Wang1-1/+1
skb_gso_segment() is almost always called in tx path, except for openvswitch. It calls this function when it receives the packet and tries to queue it to user-space. In this special case, the ->ip_summed check inside skb_gso_segment() is no longer true, as ->ip_summed value has different meanings on rx path. This patch adjusts skb_gso_segment() so that we can at least avoid such warnings on checksum. Cc: Jesse Gross <jesse@nicira.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-09openvswitch: Use FIELD_SIZEOF() in dp_init().YOSHIFUJI Hideaki / 吉藤英明1-2/+1
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-09openvswitch: Change ENOENT return value to ENODEV in lookup_vport().Jarno Rajahalme1-1/+1
This reduces the number of valid "no such device" error values that need special attention by the caller. Userspace code will need to keep on checking for both ENODEV and ENOENT as long as older kernel modules are around. Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-26openvswitch: add skb mark matching and set actionAnsis Atteka1-0/+3
This patch adds support for skb mark matching and set action. Signed-off-by: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-16net: openvswitch: use this_cpu_ptr per-cpu helperShan Wei1-2/+2
just use more faster this_cpu_ptr instead of per_cpu_ptr(p, smp_processor_id()); Signed-off-by: Shan Wei <davidshan@tencent.com> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-13openvswitch: add ipv6 'set' actionAnsis Atteka1-0/+20
This patch adds ipv6 set action functionality. It allows to change traffic class, flow label, hop-limit, ipv6 source and destination address fields. Signed-off-by: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-3/+3
Conflicts: net/netfilter/nfnetlink_log.c net/netfilter/xt_LOG.c Rather easy conflict resolution, the 'net' tree had bug fixes to make sure we checked if a socket is a time-wait one or not and elide the logging code if so. Whereas on the 'net-next' side we are calculating the UID and GID from the creds using different interfaces due to the user namespace changes from Eric Biederman. Signed-off-by: David S. Miller <davem@davemloft.net>