Age | Commit message | Author | Files | Lines
2015-12-18 | xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled | Konrad Rzeszutek Wilk | 1 | -1/+6
The guest sequence of: a) XEN_PCI_OP_enable_msi b) XEN_PCI_OP_enable_msi c) XEN_PCI_OP_disable_msi results in hitting a BUG_ON condition in the msi.c code. The MSI code uses a dev->msi_list to which it adds MSI entries. Under the above conditions a BUG_ON() can be hit. The device passed in the guest MUST have MSI capability. The a) adds the entry to the dev->msi_list and sets msi_enabled. The b) adds a second entry, but adding it to SysFS fails (duplicate entry), so it deletes all of the entries from msi_list and returns (with msi_enabled still set). c) pci_disable_msi passes the msi_enabled checks and hits: BUG_ON(list_empty(dev_to_msi_list(&dev->dev))); and blows up. The patch adds a simple check in the XEN_PCI_OP_enable_msi to guard against that. The check for msix_enabled is not strictly necessary. This is part of XSA-157. CC: stable@vger.kernel.org Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
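The guard described above can be illustrated with a small user-space sketch. Everything here is hypothetical: `struct toy_dev`, `toy_enable_msi()` and `toy_disable_msi()` only stand in for the two `struct pci_dev` flags and the pciback operations named in the message, and the error codes are an assumption rather than a quote from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two flags the commit message mentions. */
struct toy_dev {
    bool msi_enabled;
    bool msix_enabled;
};

/* Refuse a repeated enable instead of letting it reach the MSI core. */
static int toy_enable_msi(struct toy_dev *dev)
{
    assert(dev);
    if (dev->msi_enabled)
        return -EALREADY;       /* assumed error code */
    if (dev->msix_enabled)
        return -ENXIO;          /* assumed error code */
    dev->msi_enabled = true;    /* stands in for a successful enable */
    return 0;
}

/* Disable only succeeds when MSI really is enabled. */
static int toy_disable_msi(struct toy_dev *dev)
{
    assert(dev);
    if (!dev->msi_enabled)
        return -ENXIO;          /* assumed error code */
    dev->msi_enabled = false;
    return 0;
}
```

With this guard, step b) of the guest sequence fails early, so step c) sees a consistent state.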
2015-12-18 | xen/pciback: Save xen_pci_op commands before processing it | Konrad Rzeszutek Wilk | 2 | -1/+15
Double fetch vulnerabilities happen when a variable is fetched twice from shared memory but a security check is only performed the first time. The xen_pcibk_do_op function performs a switch statement on the op->cmd value which is stored in shared memory. Interestingly this can result in a double fetch vulnerability depending on the performed compiler optimization. This patch fixes it by saving the xen_pci_op command before processing it. We also use 'barrier' to make sure that the compiler does not perform any optimization. This is part of XSA155. CC: stable@vger.kernel.org Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
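A minimal sketch of the single-fetch idea follows. It is not the patch: the patch saves the whole command and uses barrier(), whereas this user-space illustration reads the field once through a volatile-qualified pointer. `struct toy_op`, the `TOY_OP_*` values and `toy_do_op()` are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical request layout living in memory the guest can rewrite. */
struct toy_op {
    uint32_t cmd;
    int32_t  err;
};

enum { TOY_OP_ENABLE = 1, TOY_OP_DISABLE = 2 };

/* Fetch cmd exactly once, then dispatch only on the private copy. */
static int toy_do_op(volatile struct toy_op *shared)
{
    assert(shared);
    const uint32_t cmd = shared->cmd;   /* the only read of shared->cmd */

    switch (cmd) {
    case TOY_OP_ENABLE:
        return 1;
    case TOY_OP_DISABLE:
        return 2;
    default:
        return -1;                      /* unknown command */
    }
}
```

Because the switch operates on `cmd` rather than on `shared->cmd`, a later change to the shared field cannot redirect the dispatch.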
2015-12-18 | xen-scsiback: safely copy requests | David Vrabel | 1 | -1/+1
The copy of the ring request was lacking a following barrier(), potentially allowing the compiler to optimize the copy away. Use RING_COPY_REQUEST() to ensure the request is copied to local memory. This is part of XSA155. CC: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2015-12-18 | xen-blkback: read from indirect descriptors only once | Roger Pau Monné | 1 | -5/+10
Since indirect descriptors are in memory shared with the frontend, the frontend could alter the first_sect and last_sect values after they have been validated but before they are recorded in the request. This may result in I/O requests that overflow the foreign page, possibly overwriting local pages when the I/O request is executed. When parsing indirect descriptors, only read first_sect and last_sect once. This is part of XSA155. CC: stable@vger.kernel.org Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
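The validate-then-record pattern can be sketched as below, assuming a hypothetical `struct toy_segment` and an assumed limit of 8 sectors per page (4 KiB page, 512-byte sectors). Both fields are read once into locals, and only the locals are validated and recorded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SECTORS_PER_PAGE 8u   /* assumption: 4 KiB page / 512-byte sectors */

/* Hypothetical descriptor entry living in memory shared with the frontend. */
struct toy_segment {
    uint8_t first_sect;
    uint8_t last_sect;
};

/* Read each shared field once; validate and store only the local copies. */
static bool toy_parse_segment(const volatile struct toy_segment *shared,
                              struct toy_segment *out)
{
    assert(shared && out);
    const uint8_t first = shared->first_sect;
    const uint8_t last = shared->last_sect;

    if (last >= TOY_SECTORS_PER_PAGE || last < first)
        return false;             /* would run past the page or is reversed */

    out->first_sect = first;
    out->last_sect = last;
    return true;
}
```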
2015-12-18 | xen-blkback: only read request operation from shared ring once | Roger Pau Monné | 1 | -4/+4
A compiler may load a switch statement value multiple times, which could be bad when the value is in memory shared with the frontend. When converting a non-native request to a native one, ensure that src->operation is only loaded once by using READ_ONCE(). This is part of XSA155. CC: stable@vger.kernel.org Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2015-12-18 | xen-netback: use RING_COPY_REQUEST() throughout | David Vrabel | 1 | -16/+14
Instead of open-coding memcpy()s and directly accessing Tx and Rx requests, use the new RING_COPY_REQUEST() that ensures the local copy is correct. This is more than is strictly necessary for guest Rx requests since only the id and gref fields are used and it is harmless if the frontend modifies these. This is part of XSA155. CC: stable@vger.kernel.org Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2015-12-18 | xen-netback: don't use last request to determine minimum Tx credit | David Vrabel | 1 | -3/+1
The last request transmitted by the guest gives no indication about the minimum amount of credit that the guest might need to send a packet, since the last packet might have been a small one. Instead allow for the worst case 128 KiB packet. This is part of XSA155. CC: stable@vger.kernel.org Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2015-12-18 | xen: Add RING_COPY_REQUEST() | David Vrabel | 1 | -0/+14
Using RING_GET_REQUEST() on a shared ring is easy to use incorrectly (i.e., by not considering that the other end may alter the data in the shared ring while it is being inspected). Safe usage of a request generally requires taking a local copy. Provide a RING_COPY_REQUEST() macro to use instead of RING_GET_REQUEST() and an open-coded memcpy(). This takes care of ensuring that the copy is done correctly regardless of any possible compiler optimizations. Use a volatile source to prevent the compiler from reordering or omitting the copy. This is part of XSA155. CC: stable@vger.kernel.org Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
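To illustrate the "volatile source" idea, here is a user-space sketch of copying one ring slot into local memory. It is not the real macro: `struct toy_req`, `struct toy_ring`, `TOY_RING_SIZE` and `toy_copy_request()` are hypothetical, and the copy is done field by field for simplicity.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_SIZE 4u   /* hypothetical ring size, a power of two */

struct toy_req {
    uint16_t id;
    uint32_t gref;
};

struct toy_ring {
    struct toy_req slot[TOY_RING_SIZE];   /* imagine this page is shared */
};

/*
 * Copy one slot through a volatile-qualified pointer. Each field is read
 * exactly once, and the compiler may not drop or repeat those reads.
 */
static void toy_copy_request(struct toy_ring *ring, unsigned int idx,
                             struct toy_req *dst)
{
    assert(ring && dst);
    const volatile struct toy_req *src =
        &ring->slot[idx & (TOY_RING_SIZE - 1)];

    dst->id = src->id;
    dst->gref = src->gref;
}
```

After the call, the consumer inspects only `*dst`, so later writes to the shared slot by the other end have no effect on the values being validated.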
2015-12-18 | powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion" | Alistair Popple | 1 | -1/+13
Commit 25642e1459ac ("powerpc/opal-irqchip: Fix double endian conversion") fixed an endian bug by calling opal_handle_events() in opal_event_unmask(). However this introduced a deadlock if we find an event is active during unmasking and call opal_handle_events() again. The bad call sequence is: opal_interrupt() -> opal_handle_events() -> generic_handle_irq() -> handle_level_irq() -> raw_spin_lock(&desc->lock) handle_irq_event(desc) unmask_irq(desc) -> opal_event_unmask() -> opal_handle_events() -> generic_handle_irq() -> handle_level_irq() -> raw_spin_lock(&desc->lock) (BOOM) When generating multiple opal events in quick succession this would lead to the following stall warnings: EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32 INFO: rcu_sched detected stalls on CPUs/tasks: 12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065 15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065 (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602) NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696] INFO: rcu_sched detected stalls on CPUs/tasks: 12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371 15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371 (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290) This patch corrects the problem by queuing the work if an event is active during unmasking, which is similar to the pre-endian fix behaviour. Fixes: 25642e1459ac ("powerpc/opal-irqchip: Fix double endian conversion") Signed-off-by: Alistair Popple <alistair@popple.id.au> Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-18 | Fix remove_and_add_spares removes drive added as spare in slot_store | Goldwyn Rodrigues | 1 | -3/+10
Commit 2910ff17d154baa5eb50e362a91104e831eb2bb6 introduced a regression which would remove a recently added spare via slot_store. Revert part of the patch which touches slot_store() and add the disk directly using pers->hot_add_disk() Fixes: 2910ff17d154 ("md: remove_and_add_spares() to activate specific rdev") Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-18 | md: fix bug due to nested suspend | Mikulas Patocka | 1 | -3/+4
The patch c7bfced9a6716ff66c9d61f934bb60af08d4688c committed to 4.4-rc causes crash in LVM test shell/lvchange-raid.sh. The kernel crashes with this BUG, the reason is that we attempt to suspend a device that is already suspended. See also https://bugzilla.redhat.com/show_bug.cgi?id=1283491 This patch fixes the bug by changing functions mddev_suspend and mddev_resume to always nest. The number of nested calls to mddev_nested_suspend is kept in the variable mddev->suspended. [neilb: made mddev_suspend() always nest instead of introduce mddev_nested_suspend] kernel BUG at drivers/md/md.c:317! CPU: 3 PID: 32754 Comm: lvm Not tainted 4.4.0-rc2 #1 task: 0000000047076040 ti: 0000000047014000 task.ti: 0000000047014000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001000000000000001111 Not tainted r00-03 000000000804000f 00000000102c5280 0000000010c7522c 000000007e3d1810 r04-07 0000000010c6f000 000000004ef37f20 000000007e3d1dd0 000000007e3d1810 r08-11 000000007c9f1600 0000000000000000 0000000000000001 ffffffffffffffff r12-15 0000000010c1d000 0000000000000041 00000000f98d63c8 00000000f98e49e4 r16-19 00000000f98e49e4 00000000c138fd06 00000000f98d63c8 0000000000000001 r20-23 0000000000000002 000000004ef37f00 00000000000000b0 00000000000001d1 r24-27 00000000424783a0 000000007e3d1dd0 000000007e3d1810 00000000102b2000 r28-31 0000000000000001 0000000047014840 0000000047014930 0000000000000001 sr00-03 0000000007040800 0000000000000000 0000000000000000 0000000007040800 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000102c538c 00000000102c5390 IIR: 03ffe01f ISR: 0000000000000000 IOR: 00000000102b2748 CPU: 3 CR30: 0000000047014000 CR31: 0000000000000000 ORIG_R28: 00000000000000b0 IAOQ[0]: mddev_suspend+0x10c/0x160 [md_mod] IAOQ[1]: mddev_suspend+0x110/0x160 [md_mod] RP(r2): raid1_add_disk+0xd4/0x2c0 [raid1] Backtrace: [<0000000010c7522c>] raid1_add_disk+0xd4/0x2c0 [raid1] [<0000000010c20078>] 
raid_resume+0x390/0x418 [dm_raid] [<00000000105833e8>] dm_table_resume_targets+0xc0/0x188 [dm_mod] [<000000001057f784>] dm_resume+0x144/0x1e0 [dm_mod] [<0000000010587dd4>] dev_suspend+0x1e4/0x568 [dm_mod] [<0000000010589278>] ctl_ioctl+0x1e8/0x428 [dm_mod] [<0000000010589518>] dm_compat_ctl_ioctl+0x18/0x68 [dm_mod] [<0000000040377b88>] compat_SyS_ioctl+0xd0/0x1558 Fixes: c7bfced9a671 ("md: suspend i/o during runtime blk_integrity_unregister") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com>
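The "always nest" behaviour can be sketched with a plain counter. This is a toy model only: `struct toy_mddev` and its `quiesced` flag are hypothetical stand-ins for the real mddev state, and no locking or I/O draining is modelled.

```c
#include <assert.h>

/* Hypothetical model: nesting depth plus a flag for "I/O is stopped". */
struct toy_mddev {
    int suspended;   /* number of nested suspend calls outstanding */
    int quiesced;    /* 1 while the device is actually quiesced */
};

/* Only the outermost suspend quiesces; nested calls just count. */
static void toy_suspend(struct toy_mddev *m)
{
    if (m->suspended++)
        return;
    m->quiesced = 1;
}

/* Only the resume that matches the outermost suspend restarts I/O. */
static void toy_resume(struct toy_mddev *m)
{
    assert(m->suspended > 0);
    if (--m->suspended)
        return;
    m->quiesced = 0;
}
```

Suspending an already suspended device is then harmless, which is the case that triggered the BUG in the trace above.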
2015-12-18 | MD: change journal disk role to disk 0 | Shaohua Li | 2 | -3/+7
Neil pointed out that setting the journal disk role to raid_disks will confuse reshape if we eventually support reshape. Switch the role to 0 (we should be fine as long as the value is >= 0) and skip sysfs file creation to avoid an error. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-18 | md/raid10: fix data corruption and crash during resync | Artur Paszkiewicz | 1 | -1/+3
The commit c31df25f20e3 ("md/raid10: make sync_request_write() call bio_copy_data()") replaced manual data copying with bio_copy_data() but it doesn't work as intended. The source bio (fbio) is already processed, so its bvec_iter has bi_size == 0 and bi_idx == bi_vcnt. Because of this, bio_copy_data() either does not copy anything, or worse, copies data from the ->bi_next bio if it is set. This causes wrong data to be written to drives during resync and sometimes lockups/crashes in bio_copy_data(): [ 517.338478] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [md126_raid10:3319] [ 517.347324] Modules linked in: raid10 xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul cryptd shpchp pcspkr ipmi_si ipmi_msghandler tpm_crb acpi_power_meter acpi_cpufreq ext4 mbcache jbd2 sr_mod cdrom sd_mod e1000e ax88179_178a usbnet mii ahci ata_generic crc32c_intel libahci ptp pata_acpi libata pps_core wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [ 517.440555] CPU: 0 PID: 3319 Comm: md126_raid10 Not tainted 4.3.0-rc6+ #1 [ 517.448384] Hardware name: Intel Corporation PURLEY/PURLEY, BIOS PLYDCRB1.86B.0055.D14.1509221924 09/22/2015 [ 517.459768] task: ffff880153773980 ti: ffff880150df8000 task.ti: ffff880150df8000 [ 517.468529] RIP: 0010:[<ffffffff812e1888>] [<ffffffff812e1888>] bio_copy_data+0xc8/0x3c0 [ 517.478164] RSP: 0018:ffff880150dfbc98 EFLAGS: 00000246 [ 517.484341] RAX: ffff880169356688 RBX: 0000000000001000 RCX: 0000000000000000 [ 517.492558] RDX: 0000000000000000 RSI: 
ffffea0001ac2980 RDI: ffffea0000d835c0 [ 517.500773] RBP: ffff880150dfbd08 R08: 0000000000000001 R09: ffff880153773980 [ 517.508987] R10: ffff880169356600 R11: 0000000000001000 R12: 0000000000010000 [ 517.517199] R13: 000000000000e000 R14: 0000000000000000 R15: 0000000000001000 [ 517.525412] FS: 0000000000000000(0000) GS:ffff880174a00000(0000) knlGS:0000000000000000 [ 517.534844] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 517.541507] CR2: 00007f8a044d5fed CR3: 0000000169504000 CR4: 00000000001406f0 [ 517.549722] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 517.557929] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 517.566144] Stack: [ 517.568626] ffff880174a16bc0 ffff880153773980 ffff880169356600 0000000000000000 [ 517.577659] 0000000000000001 0000000000000001 ffff880153773980 ffff88016a61a800 [ 517.586715] ffff880150dfbcf8 0000000000000001 ffff88016dd209e0 0000000000001000 [ 517.595773] Call Trace: [ 517.598747] [<ffffffffa043ef95>] raid10d+0xfc5/0x1690 [raid10] [ 517.605610] [<ffffffff816697ae>] ? __schedule+0x29e/0x8e2 [ 517.611987] [<ffffffff814ff206>] md_thread+0x106/0x140 [ 517.618072] [<ffffffff810c1d80>] ? wait_woken+0x80/0x80 [ 517.624252] [<ffffffff814ff100>] ? super_1_load+0x520/0x520 [ 517.630817] [<ffffffff8109ef89>] kthread+0xc9/0xe0 [ 517.636506] [<ffffffff8109eec0>] ? flush_kthread_worker+0x70/0x70 [ 517.643653] [<ffffffff8166d99f>] ret_from_fork+0x3f/0x70 [ 517.649929] [<ffffffff8109eec0>] ? flush_kthread_worker+0x70/0x70 Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Reviewed-by: Shaohua Li <shli@kernel.org> Cc: stable@vger.kernel.org (v4.2+) Fixes: c31df25f20e3 ("md/raid10: make sync_request_write() call bio_copy_data()") Signed-off-by: NeilBrown <neilb@suse.com>
2015-12-17 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | Linus Torvalds | 115 | -671/+1045
Pull networking fixes from David Miller:

 1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of people reported this... From Arnd Bergmann.
 2) Don't init mutex twice in i40e driver, from Jesse Brandeburg.
 3) Fix spurious EBUSY in rhashtable, from Herbert Xu.
 4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas.
 5) Fix race with work structure access in pppoe driver causing corruptions, from Guillaume Nault.
 6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb() actually succeeded or not, from Sergei Shtylyov.
 7) Don't lose flags when setting IFA_F_OPTIMISTIC in ipv6 code, from Bjørn Mork.
 8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc.
 9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo Leitner.
 10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven.
 11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling properly as well, from Jiri Benc.
 12) Handle request sockets properly in xfrm layer, from Eric Dumazet.
 13) Double stats update in ipv6 geneve transmit path, fix from Pravin B Shelar.
 14) sk->sk_policy[] needs RCU protection, and as a result xfrm_policy_destroy() needs to free policies using an RCU grace period, from Eric Dumazet.
 15) SCTP needs to clone ipv6 tx options in order to avoid use after free, from Eric Dumazet.
 16) Missing kbuild export of ila.h, from Stephen Hemminger.
 17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from Tobias Klauser.
 18) Validate protocol value range in ->create() methods, from Hannes Frederic Sowa.
 19) Fix early socket demux races that result in illegal dst reuse, from Eric Dumazet.
 20) Validate socket address length in pptp code, from WANG Cong.
 21) skb_reorder_vlan_header() uses incorrect offset and can corrupt packets, from Vlad Yasevich.
 22) Fix memory leaks in nl80211 registry code, from Ola Olsson.
 23) Timeout loop count handling fixes in mISDN, xgbe, qlge, sfc, and qlcnic. From Dan Carpenter.
 24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for example, AF_ALG will interpret it as an async call. From Tadeusz Struk.
 25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from Eric Dumazet.
 26) rhashtable does not enforce the minimum table size early enough, breaking how we calculate the per-cpu lock allocations. From Herbert Xu.
 27) Fix FCC port lockup in 82xx driver, from Martin Roth.
 28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa.
 29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and sock_setsockopt() wrt. timestamp handling. From WANG Cong.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits)
  net: check both type and procotol for tcp sockets
  drivers: net: xgene: fix Tx flow control
  tcp: restore fastopen with no data in SYN packet
  af_unix: Revert 'lock_interruptible' in stream receive code
  fou: clean up socket with kfree_rcu
  82xx: FCC: Fixing a bug causing to FCC port lock-up
  gianfar: Don't enable RX Filer if not supported
  net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration
  rhashtable: Fix walker list corruption
  rhashtable: Enforce minimum size on initial hash table
  inet: tcp: fix inetpeer_set_addr_v4()
  ipv6: automatically enable stable privacy mode if stable_secret set
  net: fix uninitialized variable issue
  bluetooth: Validate socket address length in sco_sock_bind().
  net_sched: make qdisc_tree_decrease_qlen() work for non mq
  ser_gigaset: remove unnecessary kfree() calls from release method
  ser_gigaset: fix deallocation of platform device structure
  ser_gigaset: turn nonsense checks into WARN_ON
  ser_gigaset: fix up NULL checks
  qlcnic: fix a timeout loop
  ...
2015-12-17 | net: check both type and procotol for tcp sockets | WANG Cong | 2 | -2/+4
Dmitry reported the following out-of-bound access: Call Trace: [<ffffffff816cec2e>] __asan_report_load4_noabort+0x3e/0x40 mm/kasan/report.c:294 [<ffffffff84affb14>] sock_setsockopt+0x1284/0x13d0 net/core/sock.c:880 [< inline >] SYSC_setsockopt net/socket.c:1746 [<ffffffff84aed7ee>] SyS_setsockopt+0x1fe/0x240 net/socket.c:1729 [<ffffffff85c18c76>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185 This is because we mistake a raw socket for a tcp socket. We should check both sk->sk_type and sk->sk_protocol to ensure it is a tcp socket. Willem points out that __skb_complete_tx_timestamp() needs to be fixed as well. Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-17 | drivers: net: xgene: fix Tx flow control | Iyappan Subramanian | 2 | -18/+24
Currently the Tx flow control is based on reading the hardware state, which is not accurate since it may not reflect the descriptors that have not yet reached memory. To accurately control the Tx flow, change it to be software based. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-17 | tcp: restore fastopen with no data in SYN packet | Eric Dumazet | 1 | -11/+12
Yuchung tracked a regression caused by commit 57be5bdad759 ("ip: convert tcp_sendmsg() to iov_iter primitives") for TCP Fast Open. Some Fast Open users do not actually add any data in the SYN packet. Fixes: 57be5bdad759 ("ip: convert tcp_sendmsg() to iov_iter primitives") Reported-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-17 | af_unix: Revert 'lock_interruptible' in stream receive code | Rainer Weikusat | 1 | -10/+3
With b3ca9b02b00704053a38bfe4c31dbbb9c13595d0, the AF_UNIX SOCK_STREAM receive code was changed from using mutex_lock(&u->readlock) to mutex_lock_interruptible(&u->readlock) to prevent signals from being delayed for an indefinite time if a thread sleeping on the mutex happened to be selected for handling the signal. But this was never a problem with the stream receive code (as opposed to its datagram counterpart) as that never went to sleep waiting for new messages with the mutex held and thus, wouldn't cause secondary readers to block on the mutex waiting for the sleeping primary reader. As the interruptible locking makes the code more complicated in exchange for no benefit, change it back to using mutex_lock. Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-17 | Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux | Linus Torvalds | 5 | -11/+8
Pull drm fixes from Dave Airlie: "Some i915 fixes, one omap fix, one core regression fix. Not even enough fixes for a twelve days of xmas song, which seems good" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm: Don't overwrite UNVERFIED mode status to OK drm/omap: fix fbdev pix format to support all platforms drm/i915: Do a better job at disabling primary plane in the noatomic case. drm/i915/skl: Double RC6 WRL always on drm/i915/skl: Disable coarse power gating up until F0 drm/i915: Remove incorrect warning in context cleanup
2015-12-17 | locking/osq: Fix ordering of node initialisation in osq_lock | Will Deacon | 1 | -3/+5
The Cavium guys reported a soft lockup on their arm64 machine, caused by commit c55a6ffa6285 ("locking/osq: Relax atomic semantics"): mutex_optimistic_spin+0x9c/0x1d0 __mutex_lock_slowpath+0x44/0x158 mutex_lock+0x54/0x58 kernfs_iop_permission+0x38/0x70 __inode_permission+0x88/0xd8 inode_permission+0x30/0x6c link_path_walk+0x68/0x4d4 path_openat+0xb4/0x2bc do_filp_open+0x74/0xd0 do_sys_open+0x14c/0x228 SyS_openat+0x3c/0x48 el0_svc_naked+0x24/0x28 This is because in osq_lock we initialise the node for the current CPU: node->locked = 0; node->next = NULL; node->cpu = curr; and then publish the current CPU in the lock tail: old = atomic_xchg_acquire(&lock->tail, curr); Once the update to lock->tail is visible to another CPU, the node is then live and can be both read and updated by concurrent lockers. Unfortunately, the ACQUIRE semantics of the xchg operation mean that there is no guarantee the contents of the node will be visible before lock tail is updated. This can lead to lock corruption when, for example, a concurrent locker races to set the next field. Fixes: c55a6ffa6285 ("locking/osq: Relax atomic semantics"): Reported-by: David Daney <ddaney@caviumnetworks.com> Reported-by: Andrew Pinski <andrew.pinski@caviumnetworks.com> Tested-by: Andrew Pinski <andrew.pinski@caviumnetworks.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1449856001-21177-1-git-send-email-will.deacon@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
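The ordering requirement can be expressed in portable C11 atomics. The message as shown here does not spell out the change that was made, so this is only a sketch of one way to get the needed guarantee: an exchange with release semantics (here `memory_order_acq_rel`) orders the node initialisation before the tail update, which an acquire-only exchange does not. `struct toy_node`, `TOY_UNLOCKED` and `toy_publish()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_UNLOCKED 0   /* hypothetical "queue is empty" tail value */

struct toy_node {
    struct toy_node *next;
    int locked;
    int cpu;
};

/*
 * Initialise the node, then publish it in the tail. The release half of
 * acq_rel makes the three stores visible to any CPU that observes the new
 * tail value with acquire ordering.
 */
static int toy_publish(atomic_int *tail, struct toy_node *node, int curr)
{
    assert(tail && node);
    node->locked = 0;
    node->next = NULL;
    node->cpu = curr;

    /* Returns the previous tail, i.e. the CPU queued before us (if any). */
    return atomic_exchange_explicit(tail, curr, memory_order_acq_rel);
}
```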
2015-12-17 | Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm | Linus Torvalds | 7 | -10/+11
Pull libnvdimm fixes from Dan Williams: - Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug. These are tagged for -stable. The scatterlist impact is potentially corrupted dma addresses on HIGHMEM enabled platforms. - A minor locking fix for the NFIT hot-add implementation that is new in 4.4-rc. This would only trigger in the case a hot-add raced driver removal. * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dma-debug: Fix dma_debug_entry offset calculation Revert "scatterlist: use sg_phys()" nfit: acpi_nfit_notify(): Do not leave device locked
2015-12-17 | Merge remote-tracking branch 'mkp-scsi/4.4/scsi-fixes' into fixes | James Bottomley | 1 | -10/+10
2015-12-17 | gpio: revert get() to non-errorprogating behaviour | Linus Walleij | 1 | -1/+7
commit e20538b82f1f ("gpio: Propagate errors from chip->get()") started to propagate errors from the .get() functions since we can get errors from the infrastructure of e.g. slowbus GPIO expanders. However it turns out a bunch of drivers relied on the core to clamp the value, so we need to revert to the old behaviour and go over all drivers and fix them to conform to the expectations of the core before we go back to propagating the error code. Cc: stable@vger.kernel.org # 4.3+ Cc: Bjorn Andersson <bjorn.andersson@sonymobile.com> Cc: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Fixes: e20538b82f1f ("gpio: Propagate errors from chip->get()") Reported-by: Michael Trimarchi <michael@amarulasolutions.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-12-17 | gpio: generic: clamp values from bgpio_get_set() | Linus Walleij | 1 | -2/+2
The bgpio_get_set() call should return a value clamped to [0,1], the current code will return a negative value if reading bit 31, which turns the value negative as this is a signed value and thus gets interpreted as an error by the gpiolib core. Found on the gpio-mxc but applies to any MMIO driver. Cc: stable@vger.kernel.org # 4.3+ Cc: kernel@pengutronix.de Cc: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Fixes: e20538b82f1f ("gpio: Propagate errors from chip->get()") Reported-by: Clemens Gruber <clemens.gruber@pqgruber.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
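The clamping issue is easy to reproduce in isolation. In this sketch, `toy_get_raw()` and `toy_get_clamped()` are hypothetical helpers, not the driver's functions: the first returns the masked register bit as-is, the second normalises it to 0 or 1 with `!!`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Unclamped read: returns the masked bit itself. For bit 31 the result
 * does not fit in a positive int, and on common compilers the conversion
 * yields a negative number that a caller could mistake for an error code.
 */
static int toy_get_raw(uint32_t reg, unsigned int gpio)
{
    assert(gpio < 32);
    return (int)(reg & (UINT32_C(1) << gpio));
}

/* Clamped read: always 0 or 1, whichever bit is tested. */
static int toy_get_clamped(uint32_t reg, unsigned int gpio)
{
    assert(gpio < 32);
    return !!(reg & (UINT32_C(1) << gpio));
}
```

The raw variant is shown only for contrast; callers that expect a value in [0,1] should see the clamped result.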
2015-12-17 | powerpc/powernv: pr_warn_once on unsupported OPAL_MSG type | Stewart Smith | 1 | -1/+1
When running on newer OPAL firmware that supports sending extra OPAL_MSG types, we would print a warning on *every* message received. This could be a problem for kernels that don't support OPAL_MSG_OCC on machines that are running real close to thermal limits and the OCC is throttling the chip. For a kernel that is paying attention to the message queue, we could get these notifications quite often. Conceivably, future message types could also come fairly often, and printing that we didn't understand them 10,000 times provides no further information than printing them once. Cc: stable@vger.kernel.org Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 | ARC: smp: Rename platform hook @init_cpu_smp -> @init_per_cpu | Vineet Gupta | 3 | -6/+6
Makes it similar to smp_ops which also has callback with same name Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 | ARC: rename smp operation init_irq_cpu() to init_per_cpu() | Noam Camus | 4 | -7/+7
This will better reflect its description i.e. "any needed setup..." and not just do an "IPI request". Signed-off-by: Noam Camus <noamc@ezchip.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 | ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing | Vineet Gupta | 1 | -4/+9
ARC dwarf unwinder only supports CIE version == 1. The boot time dwarf sanitizer (part of binary lookup table constructor) would simply bail if it saw CIE version == 3, rendering unwinder with a NULL lookup table. It seems libgcc linked with kernel does have such entries. With fallback linear search removed, and a NULL binary lookup table, unwinder fails to generate any stack trace. So allow graceful ignoring of unsupported CIE entries. This problem was initially seen in Alexey's setup (and not mine) as he was using a buildroot-built toolchain (libgcc) which doesn't get built with CFLAGS_FOR_TARGET="-gdwarf-2", which is my default. Fixes STAR 9000985048: "kernel unwinder broken with stock tools" Fixes: 2e22502c080f ARC: dw2 unwind: Remove falllback linear search thru FDE entries Reported-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: <stable@vger.kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 | ARC: dw2 unwind: Reinstante unwinding out of modules | Vineet Gupta | 3 | -19/+26
The fix which removed linear searching of dwarf (because binary lookup data always exists) missed out on the fact that modules don't get the binary lookup tables info. This caused unwinding out of modules to stop working. So add binary lookup header setup (equivalent of eh_frame_hdr setup) to modules as well. While at it, confine the header setup to within unwinder code, reducing one API exposed out of unwinder code. Fixes: 2e22502c080f ARC: dw2 unwind: Remove falllback linear search thru FDE entries Cc: <stable@vger.kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 | ARC: [plat-sim] unbork non default CONFIG_LINUX_LINK_BASE | Vineet Gupta | 3 | -2/+6
HIGHMEM support bumped the default memory size for nsim platform to 1G. Thus total memory ended at the very edge of start of peripherals address space. With linux link base shifted, memory started bleeding into peripheral space which caused early boot bad_page spew ! Fixes: 29e332261d2 ("ARC: mm: HIGHMEM: populate high memory from DT") Reported-by: Anton Kolesov <akolesov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-16 | fou: clean up socket with kfree_rcu | Hannes Frederic Sowa | 1 | -1/+2
fou->udp_offloads is managed by RCU. As it is actually included inside the fou sockets, we cannot let the memory go out of scope before a grace period. We either can synchronize_rcu or switch over to kfree_rcu to manage the sockets. kfree_rcu seems appropriate as it is used by vxlan and geneve. Fixes: 23461551c00628c ("fou: Support for foo-over-udp RX path") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16 | Merge tag 'mac80211-for-davem-2015-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 | David S. Miller | 9 | -74/+92
Johannes Berg says: ==================== Another set of fixes: * memory leak fixes (from Ola) * operating mode notification spec compliance fix (from Eyal) * copy rfkill names in case pointer becomes invalid (myself) * two hardware restart fixes (myself) * get rid of "limiting TX power" log spam (myself) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16 | 82xx: FCC: Fixing a bug causing to FCC port lock-up | Martin Roth | 1 | -1/+1
The patch fixes an FCC port lock-up, which occurs as a result of a bug during underrun/collision handling. Within the tx_startup() function in mac-fcc.c, the address of the last BD is not calculated correctly. As a result, the next transmitted BD may be set to an area outside the transmit BD ring. This causes the port to lock up, and the lock-up is not recoverable. Signed-off-by: Martin Roth <martin.roth@motorolasolutions.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16 | gianfar: Don't enable RX Filer if not supported | Hamish Martin | 2 | -3/+6
After commit 15bf176db1fb ("gianfar: Don't enable the Filer w/o the Parser"), 'TSEC' model controllers (for example as seen on MPC8541E) always have 8 bytes stripped from the front of received frames. Only 'eTSEC' gianfar controllers have the RX Filer capability (amongst other enhancements). Previously this was treated as always enabled for both 'TSEC' and 'eTSEC' controllers. In commit 15bf176db1fb ("gianfar: Don't enable the Filer w/o the Parser") a subtle change was made to the setting of 'uses_rxfcb' to effectively always set it (since 'rx_filer_enable' was always true). This had the side-effect of always stripping 8 bytes from the front of received frames on 'TSEC' type controllers. We now only enable the RX Filer capability on controller types that support it, thereby avoiding the issue for 'TSEC' type controllers. Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz> Reviewed-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16dma-debug: Fix dma_debug_entry offset calculationDaniel Mentz1-2/+2
dma-debug uses struct dma_debug_entry to keep track of dma coherent memory allocation requests. The virtual address is converted into a pfn and an offset. Previously, the offset was calculated using an incorrect bit mask. As a result, we saw incorrect error messages from dma-debug like the following: "DMA-API: exceeded 7 overlapping mappings of cacheline 0x03e00000" Cacheline 0x03e00000 does not exist on our platform. Cc: <stable@vger.kernel.org> Fixes: 0abdd7a81b7e ("dma-debug: introduce debug_dma_assert_idle()") Signed-off-by: Daniel Mentz <danielmentz@google.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
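The pfn/offset split described above can be sketched as follows. This is a minimal illustration, not the dma-debug code itself: the 4 KiB page size and every `sketch_` name are assumptions made for the example, and the "buggy" variant shows only one plausible way an incorrect bit mask could go wrong.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed 4 KiB pages; the kernel takes the real values from its page headers. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
/* Like the kernel's PAGE_MASK, this keeps the page-number bits. */
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Page frame number: the address with the in-page bits shifted out. */
static uintptr_t sketch_pfn(uintptr_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Offset within the page: keep only the bits below the page boundary. */
static size_t sketch_offset(uintptr_t addr)
{
	return (size_t)(addr & ~SKETCH_PAGE_MASK);
}

/*
 * Hypothetical buggy variant: masking with the page mask itself yields the
 * page-aligned base address rather than the offset within the page.
 */
static size_t sketch_offset_buggy(uintptr_t addr)
{
	return (size_t)(addr & SKETCH_PAGE_MASK);
}
```

With an address such as 0x03e00123, the correct helper yields an offset of 0x123, while the buggy variant yields 0x03e00000, a value far larger than any valid in-page offset.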
2015-12-16Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds7-68/+138
Pull ARM fixes from Russell King: "Further ARM fixes: - Anson Huang noticed that we were corrupting a register we shouldn't be during suspend on some CPUs. - Shengjiu Wang spotted a bug in the 'swp' instruction emulation. - Will Deacon fixed a bug in the ASID allocator. - Laura Abbott fixed the kernel permission protection to apply to all threads running in the system. - I've fixed two bugs with the domain access control register handling, one to do with printing an appropriate value at oops time, and the other to further fix the uaccess_with_memcpy code" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8475/1: SWP emulation: Restore original *data when failed ARM: 8471/1: need to save/restore arm register(r11) when it is corrupted ARM: fix uaccess_with_memcpy() with SW_DOMAIN_PAN ARM: report proper DACR value in oops dumps ARM: 8464/1: Update all mm structures with section adjustments ARM: 8465/1: mm: keep reserved ASIDs in sync with mm after multiple rollovers
2015-12-16net: fix warnings in 'make htmldocs' by moving macro definition out of field ↵Hannes Frederic Sowa1-1/+1
declaration Docbook does not like the definition of macros inside a field declaration and adds a warning. Move the definition out. Fixes: 79462ad02e86180 ("net: add validation for the socket syscall protocol argument") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16rhashtable: Fix walker list corruptionHerbert Xu1-9/+7
The commit ba7c95ea3870fe7b847466d39a049ab6f156aa2c ("rhashtable: Fix sleeping inside RCU critical section in walk_stop") introduced a new spinlock for the walker list. However, it did not convert all existing users of the list over to the new spin lock. Some continued to use the old mutex for this purpose. This obviously led to corruption of the list. The fix is to use the spin lock everywhere where we touch the list. This also allows us to do rcu_read_lock before we take the lock in rhashtable_walk_start. With the old mutex this would've deadlocked but it's safe with the new spin lock. Fixes: ba7c95ea3870 ("rhashtable: Fix sleeping inside RCU...") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16rhashtable: Enforce minimum size on initial hash tableHerbert Xu1-3/+3
William Hua <william.hua@canonical.com> wrote: "I wasn't aware there was an enforced minimum size. I simply set the nelem_hint in the rhashtable_params struct to 1, expecting it to grow as needed. This caused a segfault afterwards when trying to insert an element." OK we're doing the size computation before we enforce the limit on min_size. ---8<--- We need to do the initial hash table size computation after we have obtained the correct min_size/max_size parameters. Otherwise we may end up with a hash table whose size is outside the allowed envelope. Fixes: a998f712f77e ("rhashtable: Round up/down min/max_size to...") Reported-by: William Hua <william.hua@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
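The ordering issue can be illustrated with a small sketch. It is not the rhashtable implementation: the floor of 4, the 4/3 headroom factor, and all `sketch_` names are assumptions chosen for the example. The point it shows is only that the minimum size is settled first and the initial size is derived from the hint afterwards, so a hint of 1 cannot produce a table below the floor.

```c
#include <stddef.h>

/* Assumed floor for illustration; the real limit lives in the rhashtable code. */
#define SKETCH_HASH_MIN_SIZE 4u

/* Smallest power of two that is >= n (returns 1 for n == 0). */
static size_t sketch_roundup_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Settle min_size first, then compute the initial size from the element
 * hint and clamp it, so the result never falls below the enforced floor.
 */
static size_t sketch_initial_size(size_t nelem_hint, size_t min_size)
{
	size_t size;

	min_size = sketch_roundup_pow2(min_size);
	if (min_size < SKETCH_HASH_MIN_SIZE)
		min_size = SKETCH_HASH_MIN_SIZE;

	/* Hypothetical headroom factor of 4/3 over the hinted element count. */
	size = sketch_roundup_pow2(nelem_hint * 4 / 3);
	return size < min_size ? min_size : size;
}
```

Computing `size` before the floor is applied to `min_size` would let a hint of 1 through as a one-bucket table, which is the kind of out-of-envelope size the commit message describes.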
2015-12-16Merge remote-tracking branches 'spi/fix/dspi' and 'spi/fix/spidev' into ↵Mark Brown2-7/+7
spi-linus
2015-12-16Merge remote-tracking branch 'spi/fix/core' into spi-linusMark Brown1-1/+1
2015-12-16spi: fix parent-device reference leakJohan Hovold1-1/+1
Fix parent-device reference leak due to SPI-core taking an unnecessary reference to the parent when allocating the master structure, a reference that was never released. Note that driver core takes its own reference to the parent when the master device is registered. Fixes: 49dce689ad4e ("spi doesn't need class_device") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
2015-12-16spi: spidev: Hold spi_lock over all defererences of spi in release()Mark Brown1-1/+1
We use the spi_lock spinlock to protect against races between the device being removed and file operations on the spidev. This means that in the removal path all references to the device need to be done under lock as in removal we are dropping references to the device. Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-12-16Partial revert of "powerpc: Individual System V IPC system calls"Michael Ellerman2-24/+12
This partially reverts commit a34236155afb1cc41945e58388ac988431bcb0b8. While reviewing the glibc patch to exploit the individual IPC calls, Arnd & Andreas noticed that we were still requiring userspace to pass IPC_64 in order to get the new style IPC API. With a bit of cleanup in the kernel we can drop that requirement, and instead only provide the new style API, which will simplify things for userspace. Rather than try and sneak that patch into 4.4, instead we will drop the individual IPC calls for powerpc, and merge them again in 4.5 once the cleanup patch has gone in. Because we've already added sys_mlock2() as syscall #378, we don't do a full revert of the IPC calls. Instead we drop the __NR #defines, and send those now undefined syscall numbers to sys_ni_syscall(). This leaves a gap in the syscall numbers, but we'll reuse them when we merge the individual IPC calls. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Arnd Bergmann <arnd@arndb.de>
2015-12-16inet: tcp: fix inetpeer_set_addr_v4()Eric Dumazet1-0/+1
David Ahern added a vif field in the a4 part of inetpeer_addr struct. This broke IPv4 TCP fast open client side and more generally tcp metrics cache, because inetpeer_addr_cmp() is now comparing two u32 instead of one. inetpeer_set_addr_v4() needs to properly init vif field, otherwise the comparison result depends on uninitialized data. Fixes: 192132b9a034 ("net: Add support for VRFs to inetpeer cache") Reported-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
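A simplified sketch of why the uninitialized field matters is given below. The struct and functions are hypothetical stand-ins (all `sketch_` names are invented for this example), not the kernel's inetpeer definitions; they only mirror the idea that the comparison covers two 32-bit words, so the setter has to initialize both.

```c
#include <stdint.h>

/* Hypothetical stand-in for the IPv4 part of the address key. */
struct sketch_addr_v4 {
	uint32_t addr;
	uint32_t vif;
};

/* Compares both words; returns <0, 0 or >0 like a typical cmp helper. */
static int sketch_addr_cmp(const struct sketch_addr_v4 *a,
			   const struct sketch_addr_v4 *b)
{
	if (a->addr != b->addr)
		return a->addr < b->addr ? -1 : 1;
	if (a->vif != b->vif)
		return a->vif < b->vif ? -1 : 1;
	return 0;
}

/* Setter that initializes every field the comparison looks at. */
static void sketch_set_addr_v4(struct sketch_addr_v4 *iaddr, uint32_t ip)
{
	iaddr->addr = ip;
	iaddr->vif = 0;	/* without this, lookups compare leftover memory */
}
```

If the setter skipped `vif`, two keys built for the same IPv4 address could compare as different, and a cache lookup would miss entries it should have found.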
2015-12-15ipv6: automatically enable stable privacy mode if stable_secret setHannes Frederic Sowa1-0/+6
Bjørn reported that while we switch all interfaces to privacy stable mode when setting the secret, we don't set this mode for new interfaces. This does not make sense, so change this behaviour. Fixes: 622c81d57b392cc ("ipv6: generation of stable privacy addresses for link-local and autoconf") Reported-by: Bjørn Mork <bjorn@mork.no> Cc: Bjørn Mork <bjorn@mork.no> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15Revert "scatterlist: use sg_phys()"Dan Williams5-7/+8
commit db0fa0cb0157 "scatterlist: use sg_phys()" replaced statements of the form 'phys_addr_t phys = page_to_phys(sg_page(s));' with 'phys_addr_t phys = sg_phys(s) & PAGE_MASK;'. However, this breaks platforms where sizeof(phys_addr_t) > sizeof(unsigned long). Revert for 4.3 and 4.4 to make room for a combined helper in 4.5. Cc: <stable@vger.kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: db0fa0cb0157 ("scatterlist: use sg_phys()") Suggested-by: Joerg Roedel <joro@8bytes.org> Reported-by: Vitaly Lavrov <vel21ripn@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
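The breakage comes from the width of the mask: PAGE_MASK has type unsigned long, so on a platform with a 32-bit unsigned long and a 64-bit phys_addr_t the mask is zero-extended and the AND clears the upper 32 bits of the address. The sketch below reproduces that effect with fixed-width types; the `sketch_` typedefs, macros, and functions are assumptions standing in for the kernel's definitions.

```c
#include <stdint.h>

typedef uint64_t sketch_phys_addr_t;	/* wide physical address */
typedef uint32_t sketch_ulong_t;	/* stands in for a 32-bit unsigned long */

#define SKETCH_PAGE_SIZE ((sketch_ulong_t)4096)
/* 32-bit mask 0xfffff000, the width PAGE_MASK has on such a platform. */
#define SKETCH_PAGE_MASK ((sketch_ulong_t)~(SKETCH_PAGE_SIZE - 1))

/*
 * Narrow mask: it is zero-extended to 0x00000000fffff000 before the AND,
 * so bits 32 and above of the address are lost.
 */
static sketch_phys_addr_t sketch_mask_narrow(sketch_phys_addr_t phys)
{
	return phys & SKETCH_PAGE_MASK;
}

/* Mask built at the address width: only the in-page bits are cleared. */
static sketch_phys_addr_t sketch_mask_wide(sketch_phys_addr_t phys)
{
	return phys & ~((sketch_phys_addr_t)SKETCH_PAGE_SIZE - 1);
}
```

For an address of 0x123456789, the narrow mask returns 0x23456000 while the wide mask returns 0x123456000, which shows how memory above 4 GiB would be misaddressed.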
2015-12-15net: fix uninitialized variable issuetadeusz.struk@intel.com1-0/+1
msg_iocb needs to be initialized on the recv/recvfrom path. Otherwise afalg will wrongly interpret it as an async call. Cc: stable@vger.kernel.org Reported-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15bluetooth: Validate socket address length in sco_sock_bind().David S. Miller1-0/+3
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15Input: elan_i2c - set input device's vendor and product IDsCharlie Mooney1-0/+3
Previously the "vendor" and "product" IDs for the elan_i2c driver simply reported 0000. This patch modifies the elan_i2c driver to include the Elan vendor ID and the touchpad's product ID under input/input*/{vendor,product}. Specifically, this is to allow us to apply a generic Elan gestures config that will apply to all Elan touchpads on ChromeOS. These configs match to input devices in various ways, but one major way is by matching on vendor ID. Adding this patch allows the default Elan touchpad config to be applied to Elan touchpads in this kernel by matching on devices that have vendor ID 04f3. Note that the product ID is also available via the custom sysfs entry "product_id". Signed-off-by: Charlie Mooney <charliemooney@chromium.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>