linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2015-03-01	net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR ↵	Arvid Brodin	3	-3/+14
	interface. To repeat: $ sudo ip link del hsr0 BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: [<ffffffff8187f495>] hsr_del_port+0x15/0xa0 etc... Bug description: As part of the hsr master device destruction, hsr_del_port() is called for each of the hsr ports. At each such call, the master device is updated regarding features and mtu. When the master device is freed before the slave interfaces, master will be NULL in hsr_del_port(), which led to a NULL pointer dereference. Additionally, dev_put() was called on the master device itself in hsr_del_port(), causing a refcnt error. A third bug in the same code path was that the rtnl lock was not taken before hsr_del_port() was called as part of hsr_dev_destroy(). The reporter (Nicolas Dichtel) also said: "hsr_netdev_notify() supposes that the port will always be available when the notification is for an hsr interface. It's wrong. For example, netdev_wait_allrefs() may resend NETDEV_UNREGISTER.". As a precaution against this, a check for port == NULL was added in hsr_dev_notify(). Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Fixes: 51f3c605318b056a ("net/hsr: Move slave init to hsr_slave.c.") Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	net: pasemi: Use setup_timer and mod_timer	Vaishali Thakkar	1	-5/+3
	Use timer API functions setup_timer and mod_timer instead of structure assignments as they are standard way to set the timer and to update the expire field of an active timer respectively. This is done using Coccinelle and semantic patch used for this is as follows: // <smpl> @@ expression x,y,z,a,b; @@ -init_timer (&x); +setup_timer (&x, y, z); +mod_timer (&a, b); -x.function = y; -x.data = z; -x.expires = b; -add_timer(&a); // </smpl> Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	net: stmmac: Use setup_timer and mod_timer	Vaishali Thakkar	1	-5/+5
	Use timer API functions setup_timer and mod_timer instead of structure assignments as they are standard way to set the timer and to update the expire field of an active timer respectively. This is done using Coccinelle and semantic patch used for this is as follows: // <smpl> @@ expression x,y,z,a,b; @@ -init_timer (&x); +setup_timer (&x, y, z); +mod_timer (&a, b); -x.function = y; -x.data = z; -x.expires = b; -add_timer(&a); // </smpl> Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	net: 8390: axnet_cs: Use setup_timer and mod_timer	Vaishali Thakkar	1	-5/+2
	Use timer API functions setup_timer and mod_timer instead of structure assignments as they are standard way to set the timer and to update the expire field of an active timer respectively. This is done using Coccinelle and semantic patch used for this is as follows: // <smpl> @@ expression x,y,z,a,b; @@ -init_timer (&x); +setup_timer (&x, y, z); +mod_timer (&a, b); -x.function = y; -x.data = z; -x.expires = b; -add_timer(&a); // </smpl> Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	net: 8390: pcnet_cs: Use setup_timer and mod_timer	Vaishali Thakkar	1	-5/+2
	Use timer API functions setup_timer and mod_timer instead of structure assignments as they are standard way to set the timer and to update the expire field of an active timer respectively. This is done using Coccinelle and semantic patch used for this is as follows: // <smpl> @@ expression x,y,z,a,b; @@ -init_timer (&x); +setup_timer (&x, y, z); +mod_timer (&a, b); -x.function = y; -x.data = z; -x.expires = b; -add_timer(&a); // </smpl> Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	net: smc91c92_cs: Use setup_timer and mod_timer	Vaishali Thakkar	1	-5/+2
	Use timer API functions setup_timer and mod_timer instead of structure assignments as they are standard way to set the timer and to update the expire field of an active timer respectively. This is done using Coccinelle and semantic patch used for this is as follows: // <smpl> @@ expression x,y,z,a,b; @@ -init_timer (&x); +setup_timer (&x, y, z); +mod_timer (&a, b); -x.function = y; -x.data = z; -x.expires = b; -add_timer(&a); // </smpl> Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	netxen_nic: Fix trivial typos in comments	Yannick Guerrini	1	-2/+2
	Change 'mutliple' to 'multiple' Change 'Firmare' to 'Firmware' Signed-off-by: Yannick Guerrini <yguerrini@tomshardware.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	qlcnic: Fix trivial typo in comment	Yannick Guerrini	1	-1/+1
	Change 'Firmare' to 'Firmware' Signed-off-by: Yannick Guerrini <yguerrini@tomshardware.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	net: ti: cpsw: add hibernation callbacks	Grygorii Strashko	1	-4/+3
	Setting a dev_pm_ops suspend/resume pair but not a set of hibernation functions means those pm functions will not be called upon hibernation. Fix this by using SIMPLE_DEV_PM_OPS, which appropriately assigns the suspend and hibernation handlers and move cpsw_suspend/resume calbacks under CONFIG_PM_SLEEP to avoid build warnings. Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	net: davinci_mdio: add hibernation callbacks	Grygorii Strashko	1	-2/+3
	Setting a dev_pm_ops suspend_late/resume_early pair but not a set of hibernation functions means those pm functions will not be called upon hibernation. Fix this by using SET_LATE_SYSTEM_SLEEP_PM_OPS, which appropriately assigns the suspend and hibernation handlers and move davinci_mdio_x callbacks under CONFIG_PM_SLEEP to avoid build warnings. Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	net: do not use rcu in rtnl_dump_ifinfo()	Eric Dumazet	1	-3/+1
	We did a failed attempt in the past to only use rcu in rtnl dump operations (commit e67f88dd12f6 "net: dont hold rtnl mutex during netlink dump callbacks") Now that dumps are holding RTNL anyway, there is no need to also use rcu locking, as it forbids any scheduling ability, like GFP_KERNEL allocations that controlling path should use instead of GFP_ATOMIC whenever possible. This should fix following splat Cong Wang reported : [ INFO: suspicious RCU usage. ] 3.19.0+ #805 Tainted: G W include/linux/rcupdate.h:538 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ip/771: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff8182b8f4>] netlink_dump+0x21/0x26c #1: (rcu_read_lock){......}, at: [<ffffffff817d785b>] rcu_read_lock+0x0/0x6e stack backtrace: CPU: 3 PID: 771 Comm: ip Tainted: G W 3.19.0+ #805 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000001 ffff8800d51e7718 ffffffff81a27457 0000000029e729e6 ffff8800d6108000 ffff8800d51e7748 ffffffff810b539b ffffffff820013dd 00000000000001c8 0000000000000000 ffff8800d7448088 ffff8800d51e7758 Call Trace: [<ffffffff81a27457>] dump_stack+0x4c/0x65 [<ffffffff810b539b>] lockdep_rcu_suspicious+0x107/0x110 [<ffffffff8109796f>] rcu_preempt_sleep_check+0x45/0x47 [<ffffffff8109e457>] ___might_sleep+0x1d/0x1cb [<ffffffff8109e67d>] __might_sleep+0x78/0x80 [<ffffffff814b9b1f>] idr_alloc+0x45/0xd1 [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff814b9f9d>] ? idr_for_each+0x53/0x101 [<ffffffff817c1383>] alloc_netid+0x61/0x69 [<ffffffff817c14c3>] __peernet2id+0x79/0x8d [<ffffffff817c1ab7>] peernet2id+0x13/0x1f [<ffffffff817d8673>] rtnl_fill_ifinfo+0xa8d/0xc20 [<ffffffff810b17d9>] ? __lock_is_held+0x39/0x52 [<ffffffff817d894f>] rtnl_dump_ifinfo+0x149/0x213 [<ffffffff8182b9c2>] netlink_dump+0xef/0x26c [<ffffffff8182bcba>] netlink_recvmsg+0x17b/0x2c5 [<ffffffff817b0adc>] __sock_recvmsg+0x4e/0x59 [<ffffffff817b1b40>] sock_recvmsg+0x3f/0x51 [<ffffffff817b1f9a>] ___sys_recvmsg+0xf6/0x1d9 [<ffffffff8115dc67>] ? handle_pte_fault+0x6e1/0xd3d [<ffffffff8100a3a0>] ? native_sched_clock+0x35/0x37 [<ffffffff8109f45b>] ? sched_clock_local+0x12/0x72 [<ffffffff8109f6ac>] ? sched_clock_cpu+0x9e/0xb7 [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff811abde8>] ? __fcheck_files+0x4c/0x58 [<ffffffff811ac556>] ? __fget_light+0x2d/0x52 [<ffffffff817b376f>] __sys_recvmsg+0x42/0x60 [<ffffffff817b379f>] SyS_recvmsg+0x12/0x1c Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 0c7aecd4bde4b7302 ("netns: add rtnl cmd to add and get peer netns ids") Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-01	sh_eth: Fix lost MAC address on kexec	Geert Uytterhoeven	1	-0/+3
	Commit 740c7f31c094703c ("sh_eth: Ensure DMA engines are stopped before freeing buffers") added a call to sh_eth_reset() to the sh_eth_set_ringparam() and sh_eth_close() paths. However, setting the software reset bit(s) in the EDMR register resets the MAC Address Registers to zero. Hence after kexec, the new kernel doesn't detect a valid MAC address and assigns a random MAC address, breaking DHCP. Set the MAC address again after the reset in sh_eth_dev_exit() to fix this. Tested on r8a7740/armadillo (GETHER) and r8a7791/koelsch (FAST_RCAR). Fixes: 740c7f31c094703c ("sh_eth: Ensure DMA engines are stopped before freeing buffers") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-28	net: bcmgenet: fix throughtput regression	Jaedon Shin	2	-27/+88
	This patch adds bcmgenet_tx_poll for the tx_rings. This can reduce the interrupt load and send xmit in network stack on time. This also separated for the completion of tx_ring16 from bcmgenet_poll. The bcmgenet_tx_reclaim of tx_ring[{0,1,2,3}] operative by an interrupt is to be not more than a certain number TxBDs. It is caused by too slowly reclaiming the transmitted skb. Therefore, performance degradation of xmit after 605ad7f ("tcp: refine TSO autosizing"). Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-28	macvtap: make sure neighbour code can push ethernet header	Eric Dumazet	1	-2/+5
	Brian reported crashes using IPv6 traffic with macvtap/veth combo. I tracked the crashes in neigh_hh_output() -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD); Neighbour code assumes headroom to push Ethernet header is at least 16 bytes. It appears macvtap has only 14 bytes available on arches where NET_IP_ALIGN is 0 (like x86) Effect is a corruption of 2 bytes right before skb->head, and possible crashes if accessing non existing memory. This fix should also increase IPv4 performance, as paranoid code in ip_finish_output2() wont have to call skb_realloc_headroom() Reported-by: Brian Rak <brak@vultr.com> Tested-by: Brian Rak <brak@vultr.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-28	Merge tag 'mac80211-for-davem-2015-02-27' of ↵	David S. Miller	7	-10/+18
	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A few patches have accumulated, among them the fix for Linus's four-way-handshake problem. The others are various small fixes for problems all over, nothing really stands out. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-28	net: Verify permission to link_net in newlink	Eric W. Biederman	1	-0/+3
	When applicable verify that the caller has permisson to the underlying network namespace for a newly created network device. Similary checks exist for the network namespace a network device will be created in. Fixes: 317f4810e45e ("rtnl: allow to create device with IFLA_LINK_NETNSID set") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-28	net: Verify permission to dest_net in newlink	Eric W. Biederman	1	-0/+4
	When applicable verify that the caller has permision to create a network device in another network namespace. This check is already present when moving a network device between network namespaces in setlink so all that is needed is to duplicate that check in newlink. This change almost backports cleanly, but there are context conflicts as the code that follows was added in v4.0-rc1 Fixes: b51642f6d77b net: Enable a userns root rtnl calls that are safe for unprivilged users Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-28	drivers: net: cpsw: Set SECURE for dual_emac ucast	George McCollister	1	-1/+1
	Prior to this patch, sending a packet with the source MAC address of one of the CPSW interfaces to one of the CPSW slave ports while it's configured in dual_emac mode would update the port_num field of the VLAN/Unicast Address Table Entry. This would cause it to discard all incoming traffic addressed to that MAC address, essentially rendering the port useless until the ALE table is cleared (by starting and stopping the interface or rebooting.) For example, if eth0 has a MAC address of 90:59:af:8f:43:e9 it will have an ALE table entry: 00 00 00 00 59 90 02 30 e9 43 8f af (VLAN Addr vlan_id=2 unicast type=0 port_num=0 addr=90:59:af:8f:43:e9) If you configure another device with the same MAC address and connect it to the first CPSW slave port and send some traffic the ALE table entry becomes: 04 00 00 00 59 90 02 30 e9 43 8f af (VLAN Addr vlan_id=2 unicast type=0 port_num=1 addr=90:59:af:8f:43:e9) >From this point forward all incoming traffic addressed to 90:59:af:8f:43:e9 will be dropped. Setting the SECURE bit for the VLAN/Unicast address table entry for each interface's MAC address corrects the problem. Signed-off-by: George McCollister <george.mccollister@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-28	niu: fix error handling in niu_class_to_ethflow()	Dan Carpenter	1	-4/+2
	There is a discrepancy here because the niu_class_to_ethflow() returns zero on failure and one on success but the caller expected zero on success and negative on failure. The problem means that we allow the user to pass classes and flow_types which we don't want. I've looked at it a bit and I don't see it as a very serious bug. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-28	net: smc91x: use run-time configuration on all ARM machines	Arnd Bergmann	10	-117/+57
	The smc91x driver traditionally gets configured at compile-time for whichever hardware it runs on. This no longer works on ARM as we continue to move to building all-in-one kernels. Most ARM configurations with this driver already use run-time configuration through DT or through platform_data, but a few have not been converted yet. I've checked all ARM boards that use this driver in their legacy board files, and converted the ones that were using compile-time configuration in smc91x.h to behave like the other ones and provide the interrupt polarity along with the MMIO configuration (width, stride) at platform device creation time. In particular, these combinations were previously selectable in Kconfig but in fact broken: - sa1100 assabet plus pleb - msm combined with any other armv6/v7 platform - pxa-idp combined with any non-DMA pxa variant - LogicPD PXA270 combined with any other pxa - nomadik combined with any other armv4/v5 platform, e.g. versatile. None of these seem critical enough to warrant a backport to stable, but it would be nice to clean this up for good. Signed-off-by: Arnd Bergmann <arnd@arndb.de> ---- I would like the patch to get merged through netdev, after Robert and/or Linus have verified it on at least some hardware. There are a few other non-ARM platforms using this driver, I could do the same patch for those if we want to take it further. arch/arm/mach-msm/board-halibut.c \| 8 ++++- arch/arm/mach-msm/board-qsd8x50.c \| 8 ++++- arch/arm/mach-pxa/idp.c \| 5 +++ arch/arm/mach-pxa/lpd270.c \| 8 ++++- arch/arm/mach-realview/core.c \| 7 ++++ arch/arm/mach-realview/realview_eb.c \| 2 +- arch/arm/mach-sa1100/neponset.c \| 6 ++++ arch/arm/mach-sa1100/pleb.c \| 7 ++++ drivers/net/ethernet/smsc/smc91x.c \| 9 +++-- drivers/net/ethernet/smsc/smc91x.h \| 114 ++---------------------------------------------------------- 10 files changed, 57 insertions(+), 117 deletions(-) Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	rhashtable: use cond_resched()	Eric Dumazet	1	-0/+4
	If a hash table has 128 slots and 16384 elems, expand to 256 slots takes more than one second. For larger sets, a soft lockup is detected. Holding cpu for that long, even in a work queue is a show stopper for non preemptable kernels. cond_resched() at strategic points to allow process scheduler to reschedule us. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	Merge branch 'master' of ↵	David S. Miller	9	-76/+280
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-02-26 This series contains fixes for i40e and i40evf only. Alexey Khoroshilov found a possible leak of 'cmd_buf' when copy_from_user() failed in i40e_dbg_command_write(), so resolved by calling kfree(). Shannon provides a fix to ensure the shift and bitwise precedences do not work backwards for us by adding parans. Fixed the driver by preventing the driver from allowing stray interrupts or causing system logs from un-handled interrupts by combining the ICR0 shutdown with the standard interrupt shutdown and add the interrupt clearing to the PCI shutdown path. Fixed an issue where a NVM write times out before a transaction can complete, so Shannon added logic to make another attempt by reacquiring the semaphore, then retry the write, if the one retry fails, we will then give up. Adds checks to pointers before their use to ensure we do not try to dereference NULL pointers when returning values from the AdminQ calls. Akeem adds a check to bail out if the device is already down when checking for Tx hang subtask. Anjali fixes TSO with more than 8 frags per segment issue. The hardware has some limitations which the driver needs to adhere to: 1) no more than 8 descriptors per packet on the wire 2) no header can span more than 3 descriptors If one of these events happens, the hardware will generate an internal error and freeze the Tx queue, so Anjali fixes this by linearizes the skb to avoid these situations. Fixed an issue where the per Traffic Class queue count was higher than queues enabled, which will fix a warning with multiple function mode where systems regularly have more cores than vectors. Fixed TCP/IPv6 over VXLAN Tx checksum offload, where we were checking the outer protocol flags and deciding the flow for the inner header. Jesse fixes a race condition in the transmit hang detection. Before we were having issues of false Tx hang detection, no the driver makes more direct with the checks for progress forward by directly checking the head write back address and tail register when determining progress. This avoids Tx hangs where the software gets behind, because we are directly checking hardware state when determining a hang state. Neerav fixes the transmit ring Qset handle when DCB reconfigures. The issue was when DCB is reconfigured to a single traffic class (TC) and the driver did not reset the Tx ring Qset handle to correct the mapping, which caused the Tx queue to disable timeouts. Also as part of DCB reconfiguration flow if the Tx queue disable times out, then issue a PF reset to do some level of recovery. Mitch stops flow director on shutdown because, in some cases, the hardware would continue to try to access the FDIR ring after entering D3Hot state, which would cause either PCIe errors or NMIs, depending upon the system configuration. * NOTE * I have verified that this series of patches for net will not cause any merge issues when you sync up your net tree with your net-next tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	amd-xgbe: Request IRQs only after driver is fully setup	Lendacky, Thomas	1	-82/+93
	It is possible that the hardware may not have been properly shutdown before this driver gets control, through use by firmware, for example. Until the driver is loaded, interrupts associated with the hardware could go pending. When the IRQs are requested napi support has not been initialized yet, but the ISR will get control and schedule napi processing resulting in a kernel panic because the poll routine has not been set. Adjust the code so that the driver is fully ready to handle and process interrupts as soon as the IRQs are requested. This involves requesting and freeing IRQs during start and stop processing and ordering the napi add and delete calls appropriately. Also adjust the powerup and powerdown routines to match the start and stop routines in regards to the ordering of tasks, including napi related calls. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	net: asix: add support for the Sitecom LN-028 USB adapter	Luca Ceresoli	2	-0/+5
	Just another AX88178-based 10/100/1000 USB-to-Ethernet dongle. This one shows up in lsusb as: "Sitecom Europe B.V. LN-028 Network USB 2.0 Adapter". Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> Cc: Francois Romieu <romieu@fr.zoreil.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-usb@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	Merge branch 'rhashtable'	David S. Miller	6	-61/+19
	Daniel Borkmann says: ==================== rhashtable updates As discussed, I'm sending out rhashtable fixups for -net. I have a couple of more patches I was working on last week pending, i.e. to get rid of ht->nelems and ht->shift atomic operations which speed-up pure insertions/deletions, e.g. on my laptop I have 2 threads, inserting 7M entries each, that will reduce insertion time from ~1,450 ms to 865 ms (performance should even be better after removing the grow/shrink indirections). I guess that however is rather something for net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	rhashtable: remove indirection for grow/shrink decision functions	Daniel Borkmann	6	-58/+18
	Currently, all real users of rhashtable default their grow and shrink decision functions to rht_grow_above_75() and rht_shrink_below_30(), so that there's currently no need to have this explicitly selectable. It can/should be generic and private inside rhashtable until a real use case pops up. Since we can make this private, we'll save us this additional indirection layer and can improve insertion/deletion time as well. Reference: http://patchwork.ozlabs.org/patch/443040/ Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	rhashtable: unconditionally grow when max_shift is not specified	Daniel Borkmann	2	-3/+1
	While commit c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker queue") rightfully moved part of the decision making of whether we should expand or shrink from the expand/shrink functions themselves into insert/delete functions in order to avoid unnecessary worker wake-ups, it however introduced a regression by doing so. Before that change, if no max_shift was specified (= 0) on rhashtable initialization, rhashtable_expand() would just grow unconditionally and lets the available memory be the limiting factor. After that change, if no max_shift was specified, there would be _no_ expansion step at all. Given that netlink and tipc have a max_shift specified, it was not visible there, but Josh Hunt reported that if nft that starts out with a default element hint of 3 if not otherwise provided, would slow i.e. inserts down trememdously as it cannot grow larger to relax table occupancy. Given that the test case verifies shrinks/expands manually, we also must remove pointer to the helper functions to explicitly avoid parallel resizing on insertions/deletions. test_bucket_stats() and test_rht_lookup() could also be wrapped around rhashtable mutex to explicitly synchronize a walk from resizing, but I think that defeats the actual test case which intended to have explicit test steps, i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object verification after each stage. Reported-by: Josh Hunt <johunt@akamai.com> Fixes: c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker queue") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Ying Xue <ying.xue@windriver.com> Cc: Josh Hunt <johunt@akamai.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	vhost: drop hard-coded num_buffers size	Michael S. Tsirkin	1	-1/+2
	The 2 that we use for copy_to_iter comes from sizeof(u16), it used to be that way before the iov iter update. Fix it up, making it obvious the size of stack access is right. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	vhost: cleanup iterator update logic	Michael S. Tsirkin	1	-10/+12
	Recent iterator-related changes in vhost made it harder to follow the logic fixing up the header. In fact, the fixup always happens at the same offset: sizeof(virtio_net_hdr): sometimes the fixup iterator is updated by copy_to_iter, sometimes-by iov_iter_advance. Rearrange code to make this obvious. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	rocker: silence shift wrapping warning	Dan Carpenter	1	-2/+2
	"val" is declared as a u64 so static checkers complain that this shift can wrap. I don't have the hardware but probably it's doesn't have over 31 ports. Still we may as well silence the warning even if it's not a real bug. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	rocker: add a check for NULL in rocker_probe_ports()	Dan Carpenter	1	-0/+2
	Make sure kmalloc() succeeds. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	cxgb4: Fix PCI-E Memory window interface for big-endian systems	Hariprasad Shenai	2	-11/+45
	When doing reads and writes to adapter memory via the PCI-E Memory Window interface, data gets swizzled on 4-byte boundaries on Big-Endian systems because we need to account for the register read/write interface which incorporates a swizzle onto the Little-Endian PCI-E Bus. Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	enic: do notify_check before returning credits	Sujith Sankar	1	-2/+2
	We should complete notify_check before returning the credits. Once we return the credits, adaptor may access the notify data. Signed-off-by: Sujith Sankar <ssujith@cisco.com> Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-26	mac80211: Send EAPOL frames at lowest rate	Jouni Malinen	1	-0/+1
	The current minstrel_ht rate control behavior is somewhat optimistic in trying to find optimum TX rate. While this is usually fine for normal Data frames, there are cases where a more conservative set of retry parameters would be beneficial to make the connection more robust. EAPOL frames are critical to the authentication and especially the EAPOL-Key message 4/4 (the last message in the 4-way handshake) is important to get through to the AP. If that message is lost, the only recovery mechanism in many cases is to reassociate with the AP and start from scratch. This can often be avoided by trying to send the frame with more conservative rate and/or with more link layer retries. In most cases, minstrel_ht is currently using the initial EAPOL-Key frames for probing higher rates and this results in only five link layer transmission attempts (one at high(ish) MCS and four at MCS0). While this works with most APs, it looks like there are some deployed APs that may have issues with the EAPOL frames using HT MCS immediately after association. Similarly, there may be issues in cases where the signal strength or radio environment is not good enough to be able to get frames through even at couple of MCS 0 tries. The best approach for this would likely to be to reduce the TX rate for the last rate (3rd rate parameter in the set) to a low basic rate (say, 6 Mbps on 5 GHz and 2 or 5.5 Mbps on 2.4 GHz), but doing that cleanly requires some more effort. For now, we can start with a simple one-liner that forces the minimum rate to be used for EAPOL frames similarly how the TX rate is selected for the IEEE 802.11 Management frames. This does result in a small extra latency added to the cases where the AP would be able to receive the higher rate, but taken into account how small number of EAPOL frames are used, this is likely to be insignificant. A future optimization in the minstrel_ht design can also allow this patch to be reverted to get back to the more optimized initial TX rate. It should also be noted that many drivers that do not use minstrel as the rate control algorithm are already doing similar workarounds by forcing the lowest TX rate to be used for EAPOL frames. Cc: stable@vger.kernel.org Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-02-26	i40e: check pointers before use	Shannon Nelson	1	-1/+1
	Make sure we don't try to dereference NULL pointers when returning values from the AdminQ calls. Change-ID: Ia6694f2f415d50acf0aba063c863568742799aff Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: catch NVM write semaphore timeout and retry	Shannon Nelson	1	-0/+35
	In some circumstances, a multi-write transaction takes longer than the default 3 minute timeout on the write semaphore. If the write failed with an EBUSY status, this is likely the problem, so here we try to reacquire the semaphore then retry the write. We only do one retry, then give up. Change-ID: I1c8be60688acc2f39573839579baf601207c4a36 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: stop flow director on shutdown	Mitch A Williams	1	-0/+1
	In some cases, the hardware would continue to try to access the FDIR ring after entering D3Hot state, which would cause either PCIe errors or NMIs, depending upon system configuration. Explicitly stop FDIR in our shutdown routine to eliminate this possibility. Change-ID: I1bd9fc7fd8f151fe24cad132ac9adddab923e3af Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: disconnect irqs on shutdown	Shannon Nelson	1	-6/+8
	Combine the ICR0 shutdown with the standard interrupt shutdown, and add the interrupt clearing to the PCI shutdown path. This prevents the driver from allowing stray interrupts or causing system logs from un-handled interrupts. Change-ID: I48f6ab95cad7f8ca77c1f26c92a51cc1034ced43 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40evf: TCP/IPv6 over Vxlan Tx checksum offload fix	Anjali Singhai	1	-12/+12
	We were checking the outer Protocol flags and deciding the flow for inner header. This patch fixes that. This fixes the Tx checksum offload for TCP/IPv6 over vxlan. Change-ID: I837aaea921d34f71b24c2bc32aaadea5001ddf78 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: Issue a PF reset if Tx queue disable timeout	Parikh, Neerav	1	-1/+7
	As part of DCB reconfiguration flow if the Tx queue disable times out then issue a PF reset to do some level of recovery. Change-ID: I7550021c55bff355351c0365e61e1f05fcaff46d Signed-off-by: Neerav Parikh <neerav.parikh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: Fix the Tx ring qset handle when DCB reconfigures	Parikh, Neerav	1	-2/+9
	When DCB is reconfigured to single TC the driver did not reset the Tx ring Qset handle to the correct mapping; which caused Tx queue disable timeouts. Change-ID: I4da5915ec92a83c281b478d653fae6ef1b72edfe Signed-off-by: Neerav Parikh <neerav.parikh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: Fix the case where per TC queue count was higher than queues enabled	Anjali Singhai	1	-1/+6
	When the driver or hardware gets less interrupt vectors than the actual number of CPU cores, limit the queue count for the priority queue traffic class (TC) queues. This will fix a warning with multiple function mode where systems regularly have more cores than vectors. Also add extra comment for readability. Change-ID: I4f02226263aa3995e1f5ee5503eac0cd6ee12fbd Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: fix race in hang check	Jesse Brandeburg	2	-48/+60
	The driver was having some issues with false Tx hang detection. This makes the driver a little more direct with the checks for progress forward by directly checking the head write back address and tail register when determining progress. This avoids Tx hangs where the software gets behind, because we are directly checking hardware state when determining hang state. Change-ID: I774f0e861c9e8ab5ccb213634100fe15440ae24a Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: Fix TSO with more than 8 frags per segment issue	Anjali Singhai	4	-0/+132
	The hardware has some limitations the driver needs to adhere to, that we found in extended testing. 1) no more than 8 descriptors per packet on the wire 2) no header can span more than 3 descriptors If one of these events occurs, the hardware will generate an internal error and freeze the Tx queue. This patch linearizes the skb to avoid these situations. Change-ID: I37dab7d3966e14895a9663ec4d0aaa8eb0d9e115 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: Don't check for Tx hang when PF down	Akeem G Abodunrin	1	-1/+2
	This patch adds check to bail out if device is already down when checking for Tx hang subtask. Change-ID: I3853fb7a6d11cb9a4c349b687cb25c15b19977a0 Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: fix shift precedence issue	Shannon Nelson	2	-3/+4
	Add parens to make sure the shift and bitwise precedences don't work backwards for us. Change-ID: I60c10ef4fad6bc654522b9d8a53da2e270a0f268 Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-26	i40e: Fix memory leak at failure path in i40e_dbg_command_write()	Alexey Khoroshilov	1	-1/+3
	The patch fixes a leak of 'cmd_buf' when copy_from_user() failed in i40e_dbg_command_write(). Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-02-25	MAINTAINERS: update my email address	Andy Gospodarek	1	-1/+1
	I have been signing off on patches with this address so I'll change it. Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-25	amd-xgbe-phy: PHY KX/KR mode differences	Tom Lendacky	2	-2/+84
	The PHY requires different settings for the Decision Feedback Analyzer (DFE) when running in KX mode vs. KR mode. Update the code to change these settings when changing modes in order to provide a more stable link. Additionally, adjust the 10GbE PQ skew default setting to a more sane value. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-24	r8169: Fix trivial typo in rtl_check_firmware	Yannick Guerrini	1	-1/+1
	Change 'firwmare' to 'firmware' Signed-off-by: Yannick Guerrini <yguerrini@tomshardware.fr> Signed-off-by: David S. Miller <davem@davemloft.net>