linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2014-09-07	net: phy: mdio-sun4i: don't select REGULATOR	Beniamino Galvani	1	-2/+0
	The mdio-sun4i driver automatically selects REGULATOR and REGULATOR_FIXED_VOLTAGE because it uses the regulator API. But a driver selecting a subsystem increases the chance of generating circular Kconfig dependencies, especially when other drivers depend on the selected symbol. Since the regulator API functions are replaced with no-ops when REGULATOR is disabled, the driver can be built successfully even without regulator support and so those 'select' dependencies can be safely dropped. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-07	rose: use %*ph specifier	Andy Shevchenko	1	-1/+2
	Instead of dereference each byte let's use %*ph specifier in the printk() calls. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-07	amd-xgbe-phy: Fix build break for missing declaration	Tom Lendacky	1	-0/+1
	A previous patch inadvertently deleted a declaration in the amd_xgbe_an_tx_training function causing the build to fail. Add the declaration for 'priv' back to the function. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-06	Merge branch 'master' of ↵	David S. Miller	5	-94/+128
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2014-09-06 This series contains updates to e1000 and igb. Krzysztof provides a patch to cleanup the coding style in e1000 to quiet checkpatch.pl warnings. Todd adds two boolean flags to igb to allow for changes in the advertised EEE speeds from ethtool. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-06	tcp: remove obsolete comment about TCP_SKB_CB(skb)->when in tcp_fragment()	Neal Cardwell	1	-3/+0
	The TCP_SKB_CB(skb)->when field no longer exists as of recent change 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when"). And in any case, tcp_fragment() is called on already-transmitted packets from the __tcp_retransmit_skb() call site, so copying timestamps of any kind in this spot is quite sensible. Signed-off-by: Neal Cardwell <ncardwell@google.com> Reported-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-06	igb: add flags to set eee advertisement mode	Todd Fujinaka	4	-18/+49
	Change e1000_set_eee and e1000_set_eee_i35(0\|4) to allow changes in the advertised EEE speeds from ethtool. Adds two boolean flags to e1000_set_eee_i35(0\|4) to pass in advertised speed data. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-06	e1000: e1000_ethertool.c coding style fixes	Krzysztof Majzerowicz-Jaszcz	1	-76/+79
	Fixed many errors/warnings and checks in e1000_ethtool.c reported by checkpatch.pl. Suggestions from Joe Perches and Alexander Duyck applied as well Signed-off-by: Krzysztof Majzerowicz-Jaszcz <cristos@vipserv.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-05	mlx4: only pull headers into skb head	Eric Dumazet	1	-5/+8
	Use the new fancy eth_get_headlen() to pull exactly the headers into skb->head. This speeds up GRE traffic (or more generally tunneled traffuc), as GRO can aggregate up to 17 MSS per GRO packet instead of 8. (Pulling too much data was forcing GRO to keep 2 frags per MSS) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	mISDN: remove DSP_NEVER_DEFINED and adjust code identation	Colin Ian King	1	-56/+53
	The DSP_NEVER_DEFINED #ifdef is confusing, it slips in an extra } which is not required because the previous code is indented incorrectly. Correct the identation and remove the extraneous DSP_NEVER_DEFINED Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	bonding: add slave netlink policy and put slave-related ops together	Jiri Pirko	1	-2/+7
	Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	Merge branch 'tcp'	David S. Miller	7	-36/+36
	Eric Dumazet says: ==================== tcp: deduplicate TCP_SKB_CB(skb)->when TCP_SKB_CB(skb)->when has different meaning in output and input paths. In output path, it contains a timestamp. In input path, it contains an ISN, chosen by tcp_timewait_state_process() Its usage in output path is obsolete after usec timestamping. Lets simplify and clean this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	tcp: remove TCP_SKB_CB(skb)->when	Eric Dumazet	5	-36/+31
	After commit 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution"), we no longer need to maintain timestamps in two different fields. TCP_SKB_CB(skb)->when can be removed, as same information sits in skb_mstamp.stamp_jiffies Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn	Eric Dumazet	5	-6/+11
	TCP_SKB_CB(skb)->when has different meaning in output and input paths. In output path, it contains a timestamp. In input path, it contains an ISN, chosen by tcp_timewait_state_process() Lets add a different name to ease code comprehension. Note that 'when' field will disappear in following patch, as skb_mstamp already contains timestamp, the anonymous union will promptly disappear as well. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	Merge branch 'eth_get_headlen'	David S. Miller	8	-239/+68
	Alexander Duyck says: ==================== net: Drop get_headlen functions in favor of generic function This series replaces the igb_get_headlen and ixgbe_get_headlen functions with a generic function named eth_get_headlen. I have done some performance testing on ixgbe with 258 byte frames since the calls are only used on frames larger than 256 bytes and have seen no significant difference in CPU utilization. v2: renamed __skb_get_poff to skb_get_poff renamed ___skb_get_poff to __skb_get_poff ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	ixgbe: use new eth_get_headlen interface	Alexander Duyck	1	-115/+1
	Update ixgbe to drop the ixgbe_get_headlen function in favor of eth_get_headlen. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	igb: use new eth_get_headlen interface	Alexander Duyck	1	-108/+1
	Update igb to drop the igb_get_headlen function in favor of eth_get_headlen. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	net: Add function for parsing the header length out of linear ethernet frames	Alexander Duyck	6	-16/+66
	This patch updates some of the flow_dissector api so that it can be used to parse the length of ethernet buffers stored in fragments. Most of the changes needed were to __skb_get_poff as it needed to be updated to support sending a linear buffer instead of a skb. I have split __skb_get_poff into two functions, the first is skb_get_poff and it retains the functionality of the original __skb_get_poff. The other function is __skb_get_poff which now works much like __skb_flow_dissect in relation to skb_flow_dissect in that it provides the same functionality but works with just a data buffer and hlen instead of needing an skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	Merge branch 'timestamping'	David S. Miller	7	-67/+80
	Alexander Duyck says: ==================== This change makes it so that the core path for the phy timestamping logic is shared between skb_tx_tstamp and skb_complete_tx_timestamp. In addition it provides a means of using the same skb clone type path in non phy timestamping drivers. The main motivation for this is to enable non-phy drivers to be able to manipulate tx timestamp skbs for such things as putting them in lists or setting aside buffer in the context block. v2: Incorporated suggested changes from Willem de Bruijn and Eric Dumazet dropped uneeded comment restored order of hwtstamp vs swtstamp added destructor for skb Dropped usage of skb_complete_tx_timestamp as a kfree_skb w/ destructor v3: Updated destructor handling and dealt with socket reference counting issues v4: Split out combining destructors into a separate patch ====================
2014-09-05	net: merge cases where sock_efree and sock_edemux are the same function	Alexander Duyck	3	-3/+7
	Since sock_efree and sock_demux are essentially the same code for non-TCP sockets and the case where CONFIG_INET is not defined we can combine the code or replace the call to sock_edemux in several spots. As a result we can avoid a bit of unnecessary code or code duplication. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	net-timestamp: Make the clone operation stand-alone from phy timestamping	Alexander Duyck	6	-21/+40
	The phy timestamping takes a different path than the regular timestamping does in that it will create a clone first so that the packets needing to be timestamped can be placed in a queue, or the context block could be used. In order to support these use cases I am pulling the core of the code out so it can be used in other drivers beyond just phy devices. In addition I have added a destructor named sock_efree which is meant to provide a simple way for dropping the reference to skb exceptions that aren't part of either the receive or send windows for the socket, and I have removed some duplication in spots where this destructor could be used in place of sock_edemux. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	net-timestamp: Merge shared code between phy and regular timestamping	Alexander Duyck	2	-52/+42
	This change merges the shared bits that exist between skb_tx_tstamp and skb_complete_tx_timestamp. By doing this we can avoid the two diverging as there were already changes pushed into skb_tx_tstamp that hadn't made it into the other function. In addition this resolves issues with the fact that skb_complete_tx_timestamp was included in linux/skbuff.h even though it was only compiled in if phy timestamping was enabled. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	ipv4: harden fnhe_hashfun()	Eric Dumazet	2	-5/+6
	Lets make this hash function a bit secure, as ICMP attacks are still in the wild. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	net-timestamp: fix allocation error in test	Willem de Bruijn	1	-1/+0
	A buffer is incorrectly zeroed to the length of the pointer. If cfg_payload_len < sizeof(void *) this can overwrites unrelated memory. The buffer contents are never read, so no need to zero. Fixes: 8fe2f761cae9 ("net-timestamp: expand documentation") Reported-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	hyperv: NULL dereference on error	Dan Carpenter	1	-4/+2
	We try to call free_netvsc_device(net_device) when "net_device" is NULL. It leads to an Oops. Fixes: f90251c8a6d0 ('hyperv: Increase the buffer length for netvsc_channel_cb()') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	Merge branch 'master' of ↵	David S. Miller	12	-127/+156
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2014-09-04 This series contains updates to i40e, i40evf, ixgbe and ixgbevf. Catherine adds dual speed module support to i40e. Updates i40e to allow the user to change link settings when the link is down. Serey renames i40e_ndo_set_vf_spoofck() to i40e_ndo_set_vf_spookchk() to be more consistent with what is defined in netdev and removes a unnecessary variable assignment. Jesse makes a malicious driver detection warning only print if extended driver string is enabled for i40e. Fixes a panic under traffic load when resetting or if/whenever there was a Tx-timeout because we were enabling the Tx queue to early. Anjali fixes an issue when PF reset fails, where we were trying to restart the admin queue which has not been setup at that point. This resolves an occasional kernel panic when PF reset fails for some reason. Ethan Zhao replaces the use of a local i40e_vfs_are_assigned() with the global kernel pci_vfs_assigned() for i40e. Alex cleans up the FDB handling for ixgbe. This change makes it so that the behavior for FDB handling is consistent between both the SR-IOV and non-SR-IOV cases. The main change is that we perform bounds checking on the number of SR-IOV addresses regardless of if SR-IOV is enabled or not as we can only support a certain number of addresses in the hardware. Emil extends the pending Tx work check to the VF interfaces, where the driver initiates a reset of the interface on link loss with pending Tx work in order to clear the rings. Introduces a delay for 82599 VFs of at least 500 usecs to make sure the VFLINKS value is correct, since this bit tends to flap when a DA or SFP+ cable is disconnected. Jacob adds code comments in ixgbe to make it more obvious that we are resetting features based on the fact that we do not have MSI-X enabled, and cannot use the previous settings. Also resolves a kernel NULL pointer dereference by limiting the combined total of MACVLAN and SR-IOV VFs, since the hardware has a limited number of pools available (64). Previously, no checks were in place to limit the number of accelerated MACVLAN devices based on the number of pools, which would be ok since there was already a limit for these well below the number of available pools. However, SR-IOV uses the very same pools, therefore we need to ensure that the total number of pools does not exceed the number of pools available in the hardware. v2: - clean up code comment in patch 5 by replacing "an" with "auto negotiation" based on feedback from Sergei Shtylyov - removed un-necessary parenthesis around function call in patch 8 based on feedback from Sergei Shtylyov ====================
2014-09-05	net: ethernet: cpsw: improve interrupt lookup logic in cpsw_probe()	Daniel Mack	1	-8/+14
	Simplify the interrupt resource lookup code in cpsw_probe() by the following: * Only look at the first member of the resource. As the driver only works for DT-enabled platforms anyway, a resource of type IORESOURCE_IRQ will only contain one single entry (res->start == res->end), so there is no need for the iteration. * Add a bounds check to avoid overflows if we are passed more than ARRAY_SIZE(priv->irqs_table) resources. * Assign 'ret' with the return value of devm_request_irq() so that cpsw_probe() returns the appropriate error code. * If devm_request_irq() fails, report the error code in the log message. Signed-off-by: Daniel Mack <zonque@gmail.com> Acked-by: Mugunthan V N <mugunthanvnm@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	ipv4: fix a race in update_or_create_fnhe()	Eric Dumazet	3	-7/+9
	nh_exceptions is effectively used under rcu, but lacks proper barriers. Between kzalloc() and setting of nh->nh_exceptions(), we need a proper memory barrier. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 4895c771c7f00 ("ipv4: Add FIB nexthop exceptions.") Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	l2tp: fix missing line continuation	Andy Zhou	1	-1/+1
	This syntax error was covered by L2TP_REFCNT_DEBUG not being set by default. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	Merge branch 'amd-xgbe-next'	David S. Miller	12	-95/+102
	Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver updates 2014-09-03 The following series of patches includes fixes/updates to the driver. - Query the device for the actual speed mode (KR/KX) rather than trying to track it - Update parallel detection logic to support KR mode - Fix new warnings from checkpatch in the amd-xgbe and amd-xgbe-phy driver This patch series is based on net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	amd-xgbe-phy: Checkpatch driver fixes	Lendacky, Thomas	1	-4/+0
	This patch contains fixes identified by checkpatch when run with the strict option. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	amd-xgbe: Checkpatch driver fixes	Lendacky, Thomas	11	-26/+2
	This patch contains fixes identified by checkpatch when run with the strict option. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	amd-xgbe-phy: Enhance parallel detection to support KR speed	Lendacky, Thomas	1	-2/+28
	Add support to allow parallel detection to work in KR speed. With both speed modes of KX and KR supported, KX must be checked first. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	amd-xgbe-phy: Check device for current speed mode (KR/KX)	Lendacky, Thomas	1	-63/+72
	Since device resets can change the current mode it's possible to think the device is in a different mode than it actually is. Rather than trying to determine every place that is needed to set/save the current mode, be safe and check the devices actual mode when needed rather than trying to track it. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	Merge branch 'r8152-next'	David S. Miller	1	-28/+31
	Hayes Wang says: ==================== r8152: random MAC address If the interface has invalid MAC address, it couldn't be used. In order to let it work normally, give a random one. v3: Remove ether_addr_copy(dev->perm_addr, dev->dev_addr); v2: Use "%pM" format specifier for printing a MAC address. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	r8152: use eth_hw_addr_random	hayeswang	1	-16/+19
	If the hw doesn't have a valid MAC address, give a random one and set it to the hw. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	r8152: change the location of rtl8152_set_mac_address	hayeswang	1	-17/+17
	Exchange the location of rtl8152_set_mac_address() and set_ethernet_addr(). Then, the set_ethernet_addr() could set the MAC address by calling rtl8152_set_mac_address() later. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	Merge branch 'rx_copybreak'	David S. Miller	6	-3/+200
	Govindarajulu Varadarajan says: ==================== enic: Add support for rx_copybreak The following series implements rx_copybreak. dma_map_single()/dma_unmap_single() is more expensive than alloc_skb & memcpy for smaller packets. By doing this we can reuse the dma buff which is already mapped. This is very useful when iommu is on. The default skb copybreak value is 256. When iommu is on, we can go much higher than 256. All the drivers that supports rx_copybreak provides module parameter to change this value. Since module parameter is the least preferred way for changing driver values, this series adds ethtool support for setting rx_copybreak. v4: Validate tunable length in ethtool_get_tunable, not in driver implemented function. Loose tunable_ops array for each tunable type. Define one function and let the driver use switch case for each type. Use double underscore for data type in UAPI headers. Use const qualifier where possible. v3: Add tunable namespace to ethtool. Use new ethtool cmd ETHTOOL_S/GTUNABLE to set/get rx_copybreak from userspace. v2: Add new ethtool_cmd for DMA buffer parameters, instead of adding new members to existing ethtool_ringparam. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	enic: Add tunable_ops support for rx_copybreak	Govindarajulu Varadarajan	1	-0/+39
	This patch adds support for setting/getting rx_copybreak using generic ethtool tunable. Defines enic_get_tunable() & enic_set_tunable() to get/set rx_copybreak. As of now, these two function supports only rx_copybreak. Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	ethtool: Add generic options for tunables	Govindarajulu Varadarajan	3	-0/+113
	This patch adds new ethtool cmd, ETHTOOL_GTUNABLE & ETHTOOL_STUNABLE for getting tunable values from driver. Add get_tunable and set_tunable to ethtool_ops. Driver implements these functions for getting/setting tunable value. Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	enic: implement rx_copybreak	Govindarajulu Varadarajan	2	-3/+48
	Calling dma_map_single()/dma_unmap_single() is quite expensive compared to copying a small packet. So let's copy short frames and keep the buffers mapped. Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	dev_ioctl: remove dev_load() CAP_SYS_MODULE message	Daniel Borkmann	1	-5/+2
	Marcel reported to see the following message when autoloading is being triggered when adding nlmon device: Loading kernel module for a network device with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-nlmon instead. This false-positive happens despite with having correct capabilities set, e.g. through issuing `ip link del dev nlmon` more than once on a valid device with name nlmon, but Marcel has also seen it on creation time when no nlmon module is previously compiled-in or loaded as module and the device name equals a link type name (e.g. nlmon, vxlan, team). Stephen says: The netdev module alias is a hold over from the past. For normal devices, people used to create a alias eth0 to and point it to the type of network device used, that was back in the bad old ISA days before real discovery. Also, the tunnels create module alias for the control device and ip used to use this to autoload the tunnel device. The message is bogus and should just be removed, I also see it in a couple of other cases where tap devices are renamed for other usese. As mentioned in 8909c9ad8ff0 ("net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules"), we nevertheless still might want to leave the old autoloading behaviour in place as it could break old scripts, so for now, lets just remove the log message as Stephen suggests. Reference: http://thread.gmane.org/gmane.linux.kernel/1105168 Reported-by: Marcel Holtmann <marcel@holtmann.org> Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05	net: bpf: make eBPF interpreter images read-only	Daniel Borkmann	11	-32/+144
	With eBPF getting more extended and exposure to user space is on it's way, hardening the memory range the interpreter uses to steer its command flow seems appropriate. This patch moves the to be interpreted bytecode to read-only pages. In case we execute a corrupted BPF interpreter image for some reason e.g. caused by an attacker which got past a verifier stage, it would not only provide arbitrary read/write memory access but arbitrary function calls as well. After setting up the BPF interpreter image, its contents do not change until destruction time, thus we can setup the image on immutable made pages in order to mitigate modifications to that code. The idea is derived from commit 314beb9bcabf ("x86: bpf_jit_comp: secure bpf jit against spraying attacks"). This is possible because bpf_prog is not part of sk_filter anymore. After setup bpf_prog cannot be altered during its life-time. This prevents any modifications to the entire bpf_prog structure (incl. function/JIT image pointer). Every eBPF program (including classic BPF that are migrated) have to call bpf_prog_select_runtime() to select either interpreter or a JIT image as a last setup step, and they all are being freed via bpf_prog_free(), including non-JIT. Therefore, we can easily integrate this into the eBPF life-time, plus since we directly allocate a bpf_prog, we have no performance penalty. Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual inspection of kernel_page_tables. Brad Spengler proposed the same idea via Twitter during development of this patch. Joint work with Hannes Frederic Sowa. Suggested-by: Brad Spengler <spender@grsecurity.net> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-04	net: systemport: update UMAC_CMD only when link is detected	Florian Fainelli	1	-3/+6
	When we bring the interface down, phy_stop() will schedule the PHY state machine to call our link adjustment callback. By the time we do so, we may have clock gated off the SYSTEMPORT hardware block, and this will cause bus errors to happen in bcm_sysport_adj_link(): Make sure that we only touch the UMAC_CMD register when there is an actual link. This is safe to do for two reasons: - updating the Ethernet MAC registers only make sense when a physical link is present - the PHY library state machine first set phydev->link = 0 before invoking phydev->adjust_link in the PHY_HALTED case This is a similar fix to the GENET one: c677ba8b3c47650358572091ed8a6af50bfca877 ("net: bcmgenet: update UMAC_CMD only when link is detected"). Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-04	ipv4: implement igmp_qrv sysctl to tune igmp robustness variable	Hannes Frederic Sowa	4	-16/+31
	As in IPv6 people might increase the igmp query robustness variable to make sure unsolicited state change reports aren't lost on the network. Add and document this new knob to igmp code. RFCs allow tuning this parameter back to first IGMP RFC, so we also use this setting for all counters, including source specific multicast. Also take over sysctl value when upping the interface and don't reuse the last one seen on the interface. Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-04	ipv6: add sysctl_mld_qrv to configure query robustness variable	Hannes Frederic Sowa	4	-10/+31
	This patch adds a new sysctl_mld_qrv knob to configure the mldv1/v2 query robustness variable. It specifies how many retransmit of unsolicited mld retransmit should happen. Admins might want to tune this on lossy links. Also reset mld state on interface down/up, so we pick up new sysctl settings during interface up event. IPv6 certification requests this knob to be available. I didn't make this knob netns specific, as it is mostly a setting in a physical environment and should be per host. Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-04	ixgbe: limit combined total of macvlan and SR-IOV VFs	Jacob Keller	2	-6/+16
	Hardware has a limited number of pools available (64). Previously, no checks were in place to limit the number of accelerated macvlan devices based on the number of pools. Normally this would be ok, because there was already a limit for these well below the number of available pools. However, SR-IOV uses the very same pools. Therefor, we need to ensure that the total number of pools (number of VFs plus the number of non-VF pools in use for accelerated macvlans) does not exceed the number of pools available in hardware. This patch resolves a kernel NULL pointer dereference caused by the following commands: $modprobe ixgbe max_vfs=63 $ethtool -K eth2 l2-fwd-offload on $ip link add link eth2 macvlan0 type macvlan $ip link set dev macvlan0 up [ 992.950080] BUG: unable to handle kernel NULL pointer dereference at 0000000000000056 [ 992.951109] IP: [<ffffffffa003b71e>] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe] [ 992.951684] PGD 22a80e067 PUD 232e9b067 PMD 0 [ 992.952389] Oops: 0000 [#1] SMP [ 992.953014] Modules linked in: nfsd lockd nfs_acl exportfs auth_rpcgss oid_registry sunrpc bridge stp llc vhost_net macvtap macvlan vhost tun kvm_intel kvm ioatdma ixgbe mdio igb dca [ 992.956042] CPU: 2 PID: 11928 Comm: ifconfig Not tainted 3.16.0-rc6-net-next-07-29-2014-FCoE+ #1 [ 992.956915] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014 [ 992.957791] task: ffff8804341c0000 ti: ffff8801d7dc8000 task.ti: ffff8801d7dc8000 [ 992.958660] RIP: 0010:[<ffffffffa003b71e>] [<ffffffffa003b71e>] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe] [ 992.959613] RSP: 0018:ffff8801d7dcbbb8 EFLAGS: 00010286 [ 992.960093] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001 [ 992.960575] RDX: ffff880232eb7000 RSI: 0000000000000000 RDI: ffff88022dc05800 [ 992.961059] RBP: ffff8801d7dcbbd8 R08: 0000000000000000 R09: 0000000000000000 [ 992.961541] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88022ec20980 [ 992.962023] R13: ffff880232eb7000 R14: 0000000000000001 R15: 0000000000000001 [ 992.962508] FS: 00007fab264887a0(0000) GS:ffff880237640000(0000) knlGS:0000000000000000 [ 992.963378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 992.963858] CR2: 0000000000000056 CR3: 000000022a939000 CR4: 00000000001427e0 [ 992.964340] Stack: [ 992.964806] ffff88022ec28840 ffff88022ec20980 ffff88022dc05800 ffff880232eb7000 [ 992.965976] ffff8801d7dcbc28 ffffffffa003bae8 ffff8801d7dcbbe8 0000000000000400 [ 992.967147] 000000000000000d ffff88022ec20980 ffff88022ec20000 ffff88022dc05800 [ 992.968319] Call Trace: [ 992.968795] [<ffffffffa003bae8>] ixgbe_fwd_ring_up+0x88/0x280 [ixgbe] [ 992.969284] [<ffffffffa0041d83>] ixgbe_fwd_add+0x173/0x220 [ixgbe] [ 992.969767] [<ffffffffa015056c>] macvlan_open+0x1bc/0x230 [macvlan] [ 992.970256] [<ffffffff816b8de7>] __dev_open+0xd7/0x150 [ 992.970735] [<ffffffff816b8bd7>] __dev_change_flags+0xa7/0x170 [ 992.971220] [<ffffffff816b8ccb>] dev_change_flags+0x2b/0x70 [ 992.971703] [<ffffffff817471b2>] devinet_ioctl+0x602/0x6d0 [ 992.972184] [<ffffffff81748168>] inet_ioctl+0x78/0x90 [ 992.972666] [<ffffffff816a143b>] sock_do_ioctl+0x2b/0x70 [ 992.973146] [<ffffffff816a14ed>] sock_ioctl+0x6d/0x260 [ 992.973627] [<ffffffff811ad3b4>] do_vfs_ioctl+0x84/0x540 [ 992.974109] [<ffffffff811a4c81>] ? final_putname+0x21/0x50 [ 992.974593] [<ffffffff818725d5>] ? sysret_check+0x22/0x5d [ 992.975073] [<ffffffff811ad901>] SyS_ioctl+0x91/0xa0 [ 992.975550] [<ffffffff818725a9>] system_call_fastpath+0x16/0x1b [ 992.976026] Code: ff 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 48 89 f3 4c 89 6d f8 4c 8b a7 08 02 00 00 <44> 0f b6 6e 56 44 03 af 14 02 00 00 4c 89 e7 e8 5e f2 ff ff be [ 992.982261] RIP [<ffffffffa003b71e>] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe] [ 992.983212] RSP <ffff8801d7dcbbb8> [ 992.983681] CR2: 0000000000000056 [ 992.984248] ---[ end trace 9f54802b5cc3638b ]--- Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-04	ixgbe: add comment noting recalculation of queues	Jacob Keller	1	-0/+8
	Since we previously called ixgbe_set_num_queues just prior to attempting to set our interrupt scheme, it may be non obvious why we have to call it again inside the function. Add a comment which helps make it more obvious that we are resetting features based on the fact that we do not have MSI-X enabled, and cannot use the previous settings. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-04	ixgbevf: introduce delay for checking VFLINKS on 82599	Emil Tantilov	1	-0/+15
	VFLINKS.LINKUP bit tends to flap when a DA or SFP+ cable is disconnected. It can take up to 500 usecs for the LINKUP bit to be correct. This patch resolves the issue by introducing a delay for 82599 VFs of at least 500 usecs to make sure the VFLINKS value is correct. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-04	ixgbe: reset interface on link loss with pending Tx work from the VF	Emil Tantilov	2	-12/+49
	ixgbe initiates a reset of the interface on link loss with pending Tx work in order to clear the rings. This patch extends the pending Tx work check to the VF interfaces with the same purpose. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-04	ixgbe: Cleanup FDB handling code	Alexander Duyck	1	-30/+4
	This change makes it so that the behavior for FDB handling is consistent between both the SR-IOV and non-SR-IOV cases. The main change here is that we perform bounds checking on the number of SR-IOV addresses regardless of if SR-IOV is enabled or not as we can only support a certain number of addresses in the hardware. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>