summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2019-11-16Merge branch 'akpm' (patches from Andrew)Linus Torvalds12-86/+133
Merge misc fixes from Andrew Morton: "11 fixes" MM fixes and one xz decompressor fix. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/debug.c: PageAnon() is true for PageKsm() pages mm/debug.c: __dump_page() prints an extra line mm/page_io.c: do not free shared swap slots mm/memory_hotplug: fix try_offline_node() mm,thp: recheck each page before collapsing file THP mm: slub: really fix slab walking for init_on_free mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup() mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm() lib/xz: fix XZ_DYNALLOC to avoid useless memory reallocations mm: fix trying to reclaim unevictable lru page when calling madvise_pageout mm: mempolicy: fix the wrong return value and potential pages leak of mbind
2019-11-15Merge branch 'for-linus' of ↵Linus Torvalds3-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull more input fixes from Dmitry Torokhov: "A couple of fixes in driver teardown paths and another ID for Synaptics RMI mode" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation Input: synaptics-rmi4 - destroy F54 poller workqueue when removing Input: ff-memless - kill timer in destroy()
2019-11-15mm/debug.c: PageAnon() is true for PageKsm() pagesRalph Campbell1-3/+3
PageAnon() and PageKsm() use the low two bits of the page->mapping pointer to indicate the page type. PageAnon() only checks the LSB while PageKsm() checks the least significant 2 bits are equal to 3. Therefore, PageAnon() is true for KSM pages. __dump_page() incorrectly will never print "ksm" because it checks PageAnon() first. Fix this by checking PageKsm() first. Link: http://lkml.kernel.org/r/20191113000651.20677-1-rcampbell@nvidia.com Fixes: 1c6fb1d89e73 ("mm: print more information about mapping in __dump_page") Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15mm/debug.c: __dump_page() prints an extra lineRalph Campbell1-12/+15
When dumping struct page information, __dump_page() prints the page type with a trailing blank followed by the page flags on a separate line: anon flags: 0x100000000090034(uptodate|lru|active|head|swapbacked) It looks like the intent was to use pr_cont() for printing "flags:" but pr_cont() usage is discouraged so fix this by extending the format to include the flags into a single line: anon flags: 0x100000000090034(uptodate|lru|active|head|swapbacked) If the page is file backed, the name might be long so use two lines: shmem_aops name:"dev/zero" flags: 0x10000000008000c(uptodate|dirty|swapbacked) Eliminate pr_conf() usage as well for appending compound_mapcount. Link: http://lkml.kernel.org/r/20191112012608.16926-1-rcampbell@nvidia.com Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15mm/page_io.c: do not free shared swap slotsVinayak Menon1-3/+3
The following race is observed due to which a processes faulting on a swap entry, finds the page neither in swapcache nor swap. This causes zram to give a zero filled page that gets mapped to the process, resulting in a user space crash later. Consider parent and child processes Pa and Pb sharing the same swap slot with swap_count 2. Swap is on zram with SWP_SYNCHRONOUS_IO set. Virtual address 'VA' of Pa and Pb points to the shared swap entry. Pa Pb fault on VA fault on VA do_swap_page do_swap_page lookup_swap_cache fails lookup_swap_cache fails Pb scheduled out swapin_readahead (deletes zram entry) swap_free (makes swap_count 1) Pb scheduled in swap_readpage (swap_count == 1) Takes SWP_SYNCHRONOUS_IO path zram enrty absent zram gives a zero filled page Fix this by making sure that swap slot is freed only when swap count drops down to one. Link: http://lkml.kernel.org/r/1571743294-14285-1-git-send-email-vinmenon@codeaurora.org Fixes: aa8d22a11da9 ("mm: swap: SWP_SYNCHRONOUS_IO: skip swapcache only if swapped page has no other reference") Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> Suggested-by: Minchan Kim <minchan@google.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15mm/memory_hotplug: fix try_offline_node()David Hildenbrand3-16/+64
try_offline_node() is pretty much broken right now: - The node span is updated when onlining memory, not when adding it. We ignore memory that was mever onlined. Bad. - We touch possible garbage memmaps. The pfn_to_nid(pfn) can easily trigger a kernel panic. Bad for memory that is offline but also bad for subsection hotadd with ZONE_DEVICE, whereby the memmap of the first PFN of a section might contain garbage. - Sections belonging to mixed nodes are not properly considered. As memory blocks might belong to multiple nodes, we would have to walk all pageblocks (or at least subsections) within present sections. However, we don't have a way to identify whether a memmap that is not online was initialized (relevant for ZONE_DEVICE). This makes things more complicated. Luckily, we can piggy pack on the node span and the nid stored in memory blocks. Currently, the node span is grown when calling move_pfn_range_to_zone() - e.g., when onlining memory, and shrunk when removing memory, before calling try_offline_node(). Sysfs links are created via link_mem_sections(), e.g., during boot or when adding memory. If the node still spans memory or if any memory block belongs to the nid, we don't set the node offline. As memory blocks that span multiple nodes cannot get offlined, the nid stored in memory blocks is reliable enough (for such online memory blocks, the node still spans the memory). Introduce for_each_memory_block() to efficiently walk all memory blocks. Note: We will soon stop shrinking the ZONE_DEVICE zone and the node span when removing ZONE_DEVICE memory to fix similar issues (access of garbage memmaps) - until we have a reliable way to identify whether these memmaps were properly initialized. This implies later, that once a node had ZONE_DEVICE memory, we won't be able to set a node offline - which should be acceptable. Since commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") memory that is added is not assoziated with a zone/node (memmap not initialized). The introducing commit 60a5a19e7419 ("memory-hotplug: remove sysfs file of node") already missed that we could have multiple nodes for a section and that the zone/node span is updated when onlining pages, not when adding them. I tested this by hotplugging two DIMMs to a memory-less and cpu-less NUMA node. The node is properly onlined when adding the DIMMs. When removing the DIMMs, the node is properly offlined. Masayoshi Mizuma reported: : Without this patch, memory hotplug fails as panic: : : BUG: kernel NULL pointer dereference, address: 0000000000000000 : ... : Call Trace: : remove_memory_block_devices+0x81/0xc0 : try_remove_memory+0xb4/0x130 : __remove_memory+0xa/0x20 : acpi_memory_device_remove+0x84/0x100 : acpi_bus_trim+0x57/0x90 : acpi_bus_trim+0x2e/0x90 : acpi_device_hotplug+0x2b2/0x4d0 : acpi_hotplug_work_fn+0x1a/0x30 : process_one_work+0x171/0x380 : worker_thread+0x49/0x3f0 : kthread+0xf8/0x130 : ret_from_fork+0x35/0x40 [david@redhat.com: v3] Link: http://lkml.kernel.org/r/20191102120221.7553-1-david@redhat.com Link: http://lkml.kernel.org/r/20191028105458.28320-1-david@redhat.com Fixes: 60a5a19e7419 ("memory-hotplug: remove sysfs file of node") Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visiable after d0dc12e86b319 Signed-off-by: David Hildenbrand <david@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Keith Busch <keith.busch@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Nayna Jain <nayna@linux.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15mm,thp: recheck each page before collapsing file THPSong Liu1-12/+16
In collapse_file(), for !is_shmem case, current check cannot guarantee the locked page is up-to-date. Specifically, xas_unlock_irq() should not be called before lock_page() and get_page(); and it is necessary to recheck PageUptodate() after locking the page. With this bug and CONFIG_READ_ONLY_THP_FOR_FS=y, madvise(HUGE)'ed .text may contain corrupted data. This is because khugepaged mistakenly collapses some not up-to-date sub pages into a huge page, and assumes the huge page is up-to-date. This will NOT corrupt data in the disk, because the page is read-only and never written back. Fix this by properly checking PageUptodate() after locking the page. This check replaces "VM_BUG_ON_PAGE(!PageUptodate(page), page);". Also, move PageDirty() check after locking the page. Current khugepaged should not try to collapse dirty file THP, because it is limited to read-only .text. The only case we hit a dirty page here is when the page hasn't been written since write. Bail out and retry when this happens. syzbot reported bug on previous version of this patch. Link: http://lkml.kernel.org/r/20191106060930.2571389-2-songliubraving@fb.com Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: Song Liu <songliubraving@fb.com> Reported-by: syzbot+efb9e48b9fbdc49bb34a@syzkaller.appspotmail.com Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15mm: slub: really fix slab walking for init_on_freeLaura Abbott1-30/+9
Commit 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free") fixed one problem with the slab walking but missed a key detail: When walking the list, the head and tail pointers need to be updated since we end up reversing the list as a result. Without doing this, bulk free is broken. One way this is exposed is a NULL pointer with slub_debug=F: ============================================================================= BUG skbuff_head_cache (Tainted: G T): Object already free ----------------------------------------------------------------------------- INFO: Slab 0x000000000d2d2f8f objects=16 used=3 fp=0x0000000064309071 flags=0x3fff00000000201 BUG: kernel NULL pointer dereference, address: 0000000000000000 Oops: 0000 [#1] PREEMPT SMP PTI RIP: 0010:print_trailer+0x70/0x1d5 Call Trace: <IRQ> free_debug_processing.cold.37+0xc9/0x149 __slab_free+0x22a/0x3d0 kmem_cache_free_bulk+0x415/0x420 __kfree_skb_flush+0x30/0x40 net_rx_action+0x2dd/0x480 __do_softirq+0xf0/0x246 irq_exit+0x93/0xb0 do_IRQ+0xa0/0x110 common_interrupt+0xf/0xf </IRQ> Given we're now almost identical to the existing debugging code which correctly walks the list, combine with that. Link: https://lkml.kernel.org/r/20191104170303.GA50361@gandi.net Link: http://lkml.kernel.org/r/20191106222208.26815-1-labbott@redhat.com Fixes: 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free") Signed-off-by: Laura Abbott <labbott@redhat.com> Reported-by: Thibaut Sautereau <thibaut.sautereau@clip-os.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Alexander Potapenko <glider@google.com> Acked-by: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <clipos@ssi.gouv.fr> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()Roman Gushchin1-1/+1
An exiting task might belong to an offline cgroup. In this case an attempt to grab a cgroup reference from the task can end up with an infinite loop in hugetlb_cgroup_charge_cgroup(), because neither the cgroup will become online, neither the task will be migrated to a live cgroup. Fix this by switching over to css_tryget(). As css_tryget_online() can't guarantee that the cgroup won't go offline, in most cases the check doesn't make sense. In this particular case users of hugetlb_cgroup_charge_cgroup() are not affected by this change. A similar problem is described by commit 18fa84a2db0e ("cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()"). Link: http://lkml.kernel.org/r/20191106225131.3543616-2-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()Roman Gushchin1-1/+1
We've encountered a rcu stall in get_mem_cgroup_from_mm(): rcu: INFO: rcu_sched self-detected stall on CPU rcu: 33-....: (21000 ticks this GP) idle=6c6/1/0x4000000000000002 softirq=35441/35441 fqs=5017 (t=21031 jiffies g=324821 q=95837) NMI backtrace for cpu 33 <...> RIP: 0010:get_mem_cgroup_from_mm+0x2f/0x90 <...> __memcg_kmem_charge+0x55/0x140 __alloc_pages_nodemask+0x267/0x320 pipe_write+0x1ad/0x400 new_sync_write+0x127/0x1c0 __kernel_write+0x4f/0xf0 dump_emit+0x91/0xc0 writenote+0xa0/0xc0 elf_core_dump+0x11af/0x1430 do_coredump+0xc65/0xee0 get_signal+0x132/0x7c0 do_signal+0x36/0x640 exit_to_usermode_loop+0x61/0xd0 do_syscall_64+0xd4/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The problem is caused by an exiting task which is associated with an offline memcg. We're iterating over and over in the do {} while (!css_tryget_online()) loop, but obviously the memcg won't become online and the exiting task won't be migrated to a live memcg. Let's fix it by switching from css_tryget_online() to css_tryget(). As css_tryget_online() cannot guarantee that the memcg won't go offline, the check is usually useless, except some rare cases when for example it determines if something should be presented to a user. A similar problem is described by commit 18fa84a2db0e ("cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()"). Johannes: : The bug aside, it doesn't matter whether the cgroup is online for the : callers. It used to matter when offlining needed to evacuate all charges : from the memcg, and so needed to prevent new ones from showing up, but we : don't care now. Link: http://lkml.kernel.org/r/20191106225131.3543616-1-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Shakeel Butt <shakeeb@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutn <mkoutny@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15lib/xz: fix XZ_DYNALLOC to avoid useless memory reallocationsLasse Collin1-0/+1
s->dict.allocated was initialized to 0 but never set after a successful allocation, thus the code always thought that the dictionary buffer has to be reallocated. Link: http://lkml.kernel.org/r/20191104185107.3b6330df@tukaani.org Signed-off-by: Lasse Collin <lasse.collin@tukaani.org> Reported-by: Yu Sun <yusun2@cisco.com> Acked-by: Daniel Walker <danielwa@cisco.com> Cc: "Yixia Si (yisi)" <yisi@cisco.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15mm: fix trying to reclaim unevictable lru page when calling madvise_pageoutzhong jiang1-4/+12
Recently, I hit the following issue when running upstream. kernel BUG at mm/vmscan.c:1521! invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 0 PID: 23385 Comm: syz-executor.6 Not tainted 5.4.0-rc4+ #1 RIP: 0010:shrink_page_list+0x12b6/0x3530 mm/vmscan.c:1521 Call Trace: reclaim_pages+0x499/0x800 mm/vmscan.c:2188 madvise_cold_or_pageout_pte_range+0x58a/0x710 mm/madvise.c:453 walk_pmd_range mm/pagewalk.c:53 [inline] walk_pud_range mm/pagewalk.c:112 [inline] walk_p4d_range mm/pagewalk.c:139 [inline] walk_pgd_range mm/pagewalk.c:166 [inline] __walk_page_range+0x45a/0xc20 mm/pagewalk.c:261 walk_page_range+0x179/0x310 mm/pagewalk.c:349 madvise_pageout_page_range mm/madvise.c:506 [inline] madvise_pageout+0x1f0/0x330 mm/madvise.c:542 madvise_vma mm/madvise.c:931 [inline] __do_sys_madvise+0x7d2/0x1600 mm/madvise.c:1113 do_syscall_64+0x9f/0x4c0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe madvise_pageout() accesses the specified range of the vma and isolates them, then runs shrink_page_list() to reclaim its memory. But it also isolates the unevictable pages to reclaim. Hence, we can catch the cases in shrink_page_list(). The root cause is that we scan the page tables instead of specific LRU list. and so we need to filter out the unevictable lru pages from our end. Link: http://lkml.kernel.org/r/1572616245-18946-1-git-send-email-zhongjiang@huawei.com Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT") Signed-off-by: zhong jiang <zhongjiang@huawei.com> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15mm: mempolicy: fix the wrong return value and potential pages leak of mbindYang Shi1-5/+9
Commit d883544515aa ("mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified") fixed the return value of mbind() for a couple of corner cases. But, it altered the errno for some other cases, for example, mbind() should return -EFAULT when part or all of the memory range specified by nodemask and maxnode points outside your accessible address space, or there was an unmapped hole in the specified memory range specified by addr and len. Fix this by preserving the errno returned by queue_pages_range(). And, the pagelist may be not empty even though queue_pages_range() returns error, put the pages back to LRU since mbind_range() is not called to really apply the policy so those pages should not be migrated, this is also the old behavior before the problematic commit. Link: http://lkml.kernel.org/r/1572454731-3925-1-git-send-email-yang.shi@linux.alibaba.com Fixes: d883544515aa ("mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified") Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Reported-by: Li Xinhai <lixinhai.lxh@gmail.com> Reviewed-by: Li Xinhai <lixinhai.lxh@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> [4.19 and 5.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15Input: synaptics - enable RMI mode for X1 Extreme 2nd GenerationLyude Paul1-0/+1
Just got one of these for debugging some unrelated issues, and noticed that Lenovo seems to have gone back to using RMI4 over smbus with Synaptics touchpads on some of their new systems, particularly this one. So, let's enable RMI mode for the X1 Extreme 2nd Generation. Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://lore.kernel.org/r/20191115221814.31903-1-lyude@redhat.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2019-11-15Merge tag 'for-linus-20191115' of git://git.kernel.dk/linux-blockLinus Torvalds5-17/+59
Pull block fixes from Jens Axboe: "A few fixes that should make it into this release. This contains: - io_uring: - The timeout command assumes sequence == 0 means that we want one completion, but this kind of overloading is unfortunate as it prevents users from doing a pure time based wait. Since this operation was introduced in this cycle, let's correct it now, while we can. (me) - One-liner to fix an issue with dependent links and fixed buffer reads. The actual IO completed fine, but the link got severed since we stored the wrong expected value. (me) - Add TIMEOUT to list of opcodes that don't need a file. (Pavel) - rsxx missing workqueue destry calls. Old bug. (Chuhong) - Fix blk-iocost active list check (Jiufei) - Fix impossible-to-hit overflow merge condition, that still hit some folks very rarely (Junichi) - Fix bfq hang issue from 5.3. This didn't get marked for stable, but will go into stable post this merge (Paolo)" * tag 'for-linus-20191115' of git://git.kernel.dk/linux-block: rsxx: add missed destroy_workqueue calls in remove iocost: check active_list of all the ancestors in iocg_activate() block, bfq: deschedule empty bfq_queues not referred by any process io_uring: ensure registered buffer import returns the IO length io_uring: Fix getting file for timeout block: check bi_size overflow before merge io_uring: make timeout sequence == 0 mean no sequence
2019-11-15Input: synaptics-rmi4 - destroy F54 poller workqueue when removingChuhong Yuan1-0/+1
The driver forgets to destroy workqueue in remove() similarly to what is done when probe() fails. Add a call to destroy_workqueue() to fix it. Since unregistration will wait for the work to finish, we do not need to cancel/flush the work instance in remove(). Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191114023405.31477-1-hslester96@gmail.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2019-11-15Input: ff-memless - kill timer in destroy()Oliver Neukum1-0/+9
No timer must be left running when the device goes away. Signed-off-by: Oliver Neukum <oneukum@suse.com> Reported-and-tested-by: syzbot+b6c55daa701fc389e286@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1573726121.17351.3.camel@suse.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2019-11-15Merge tag 'ceph-for-5.4-rc8' of git://github.com/ceph/ceph-clientLinus Torvalds2-8/+23
Pull ceph fixes from Ilya Dryomov: "Two fixes for the buffered reads and O_DIRECT writes serialization patch that went into -rc1 and a fixup for a bogus warning on older gcc versions" * tag 'ceph-for-5.4-rc8' of git://github.com/ceph/ceph-client: rbd: silence bogus uninitialized warning in rbd_object_map_update_finish() ceph: increment/decrement dio counter on async requests ceph: take the inode lock before acquiring cap refs
2019-11-15afs: Fix race in commit bulk status fetchDavid Howells1-1/+6
When a lookup is done, the afs filesystem will perform a bulk status-fetch operation on the requested vnode (file) plus the next 49 other vnodes from the directory list (in AFS, directory contents are downloaded as blobs and parsed locally). When the results are received, it will speculatively populate the inode cache from the extra data. However, if the lookup races with another lookup on the same directory, but for a different file - one that's in the 49 extra fetches, then if the bulk status-fetch operation finishes first, it will try and update the inode from the other lookup. If this other inode is still in the throes of being created, however, this will cause an assertion failure in afs_apply_status(): BUG_ON(test_bit(AFS_VNODE_UNSET, &vnode->flags)); on or about fs/afs/inode.c:175 because it expects data to be there already that it can compare to. Fix this by skipping the update if the inode is being created as the creator will presumably set up the inode with the same information. Fixes: 39db9815da48 ("afs: Fix application of the results of a inline bulk status fetch") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15Merge tag 'arm64-fixes' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Will Deacon: "One trivial fix for -rc8/final that ensures that the script used to detect RELR relocation support in the toolchain works correctly when $CC contains quotes. Although it fails safely (by failing to detect the support when it exists), it would be nice to have this fixed in 5.4 given that it was only introduced in the last merge window. Summary: - Handle CC variables containing quotes in tools-support-relr.sh script" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: scripts/tools-support-relr.sh: un-quote variables
2019-11-15Merge tag 'mips_fixes_5.4_4' of ↵Linus Torvalds4-27/+6
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Paul Burton: "A fix and simplification for SGI IP27 exception handlers, and a small MAINTAINERS update for Broadcom MIPS systems" * tag 'mips_fixes_5.4_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MAINTAINERS: Remove Kevin as maintainer of BMIPS generic platforms MIPS: SGI-IP27: fix exception handler replication
2019-11-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds4-6/+27
Pull more KVM fixes from Paolo Bonzini: - fixes for CONFIG_KVM_COMPAT=n - two updates to the IFU erratum - selftests build fix - brown paper bag fix * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Add a comment describing the /dev/kvm no_compat handling KVM: x86/mmu: Take slots_lock when using kvm_mmu_zap_all_fast() KVM: Forbid /dev/kvm being opened by a compat task when CONFIG_KVM_COMPAT=n KVM: X86: Reset the three MSR list number variables to 0 in kvm_init_msr_list() selftests: kvm: fix build with glibc >= 2.30 kvm: x86: disable shattered huge page recovery for PREEMPT_RT.
2019-11-15Merge tag 'mmc-v5.4-rc7' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fix from Ulf Hansson: "Don't overwrite quirk flags in sdhci-of-at91 host driver" * tag 'mmc-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-of-at91: fix quirk2 overwrite
2019-11-15Merge tag 'sound-5.4-rc8' of ↵Linus Torvalds7-9/+23
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A few small last-minute fixes for USB-audio and HD-audio as well as for PCM core: - A race fix for PCM core between stopping and closing a stream - USB-audio regressions in the recent descriptor validation code and relevant changes - A read of uninitialized value in USB-audio spotted by fuzzer - A fix for USB-audio race at stopping a stream - Intel HD-audio platform fixes" * tag 'sound-5.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usb-audio: Fix incorrect size check for processing/extension units ALSA: usb-audio: Fix incorrect NULL check in create_yamaha_midi_quirk() ALSA: pcm: Fix stream lock usage in snd_pcm_period_elapsed() ALSA: usb-audio: not submit urb for stopped endpoint ALSA: hda: hdmi - fix pin setup on Tigerlake ALSA: hda: Add Cometlake-S PCI ID ALSA: usb-audio: Fix missing error check at mixer resolution test
2019-11-15Merge tag 'drm-fixes-2019-11-15' of git://anongit.freedesktop.org/drm/drmLinus Torvalds6-37/+23
Pull drm fixes from Dave Airlie: "Here is this weeks non-intel hw vuln fixes pull. Three drivers, all small fixes. i915: - MOCS table fixes for EHL and TGL - Update Display's rawclock on resume - GVT's dmabuf reference drop fix amdgpu: - Fix a potential crash in firmware parsing sun4i: - One fix to the dotclock dividers range for sun4i" * tag 'drm-fixes-2019-11-15' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu: fix null pointer deref in firmware header printing drm/i915/tgl: MOCS table update Revert "drm/i915/ehl: Update MOCS table for EHL" drm/sun4i: tcon: Set min division of TCON0_DCLK to 1. drm/i915: update rawclk also on resume drm/i915/gvt: fix dropping obj reference twice
2019-11-15Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds7-61/+91
Pull misc vfs fixes from Al Viro: "Assorted fixes all over the place; some of that is -stable fodder, some regressions from the last window" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stable ecryptfs: fix unlink and rmdir in face of underlying fs modifications audit_get_nd(): don't unlock parent too early exportfs_decode_fh(): negative pinned may become positive without the parent locked cgroup: don't put ERR_PTR() into fc->root autofs: fix a leak in autofs_expire_indirect() aio: Fix io_pgetevents() struct __compat_aio_sigset layout fs/namespace.c: fix use-after-free of mount in mnt_warn_timestamp_expiry()
2019-11-15KVM: Add a comment describing the /dev/kvm no_compat handlingMarc Zyngier1-0/+7
Add a comment explaining the rational behind having both no_compat open and ioctl callbacks to fend off compat tasks. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-15Merge tag 'drm-fixes-5.4-2019-11-14' of ↵Dave Airlie1-22/+16
git://people.freedesktop.org/~agd5f/linux into drm-fixes drm-fixes-5.4-2019-11-14: amdgpu: - Fix a potential crash in firmware parsing Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191114221354.3914-1-alexander.deucher@amd.com
2019-11-15Merge tag 'drm-misc-fixes-2019-11-13' of ↵Dave Airlie1-1/+1
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes - One fix to the dotclock dividers range for sun4i Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20191113142645.GA967172@gilmour.lan
2019-11-15Merge tag 'drm-intel-fixes-2019-11-13' of ↵Dave Airlie4-14/+6
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - MOCS table fixes for EHL and TGL - Update Display's rawclock on resume - GVT's dmabuf reference drop fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191114055302.GA3564@intel.com
2019-11-14drm/amdgpu: fix null pointer deref in firmware header printingXiaojie Yuan1-22/+16
v2: declare as (struct common_firmware_header *) type because struct xxx_firmware_header inherits from it When CE's ucode_id(8) is used to get sdma_hdr, we will be accessing an unallocated amdgpu_firmware_info instance. This issue appears on rhel7.7 with gcc 4.8.5. Newer compilers might have optimized out such 'defined but not referenced' variable. [ 1120.798564] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a [ 1120.806703] IP: [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu] [ 1120.813693] PGD 80000002603ff067 PUD 271b8d067 PMD 0 [ 1120.818931] Oops: 0000 [#1] SMP [ 1120.822245] Modules linked in: amdgpu(OE+) amdkcl(OE) amd_iommu_v2 amdttm(OE) amd_sched(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun bridge stp llc devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dm_mirror dm_region_hash dm_log dm_mod intel_pmc_core intel_powerclamp coretemp intel_rapl joydev kvm_intel eeepc_wmi asus_wmi kvm sparse_keymap iTCO_wdt irqbypass rfkill crc32_pclmul snd_hda_codec_realtek mxm_wmi ghash_clmulni_intel intel_wmi_thunderbolt iTCO_vendor_support snd_hda_codec_generic snd_hda_codec_hdmi aesni_intel lrw gf128mul glue_helper ablk_helper sg cryptd pcspkr snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd pinctrl_sunrisepoint pinctrl_intel soundcore acpi_pad mei_me wmi mei i2c_i801 pcc_cpufreq ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit iosf_mbi drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm ptp libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw pps_core drm_panel_orientation_quirks video i2c_hid [ 1120.954136] CPU: 4 PID: 2426 Comm: modprobe Tainted: G OE ------------ 3.10.0-1062.el7.x86_64 #1 [ 1120.964390] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 1302 11/09/2015 [ 1120.973321] task: ffff991ef1e3c1c0 ti: ffff991ee625c000 task.ti: ffff991ee625c000 [ 1120.981020] RIP: 0010:[<ffffffffc0e3c9b3>] [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu] [ 1120.990483] RSP: 0018:ffff991ee625f950 EFLAGS: 00010202 [ 1120.995935] RAX: 0000000000000002 RBX: ffff991edf6b2d38 RCX: ffff991edf6a0000 [ 1121.003391] RDX: 0000000000000000 RSI: ffff991f01d13898 RDI: ffffffffc110afb3 [ 1121.010706] RBP: ffff991ee625f9b0 R08: 0000000000000000 R09: 0000000000000000 [ 1121.018029] R10: 00000000000004c4 R11: ffff991ee625f64e R12: ffff991edf6b3220 [ 1121.025353] R13: ffff991edf6a0000 R14: 0000000000000008 R15: ffff991edf6b2d30 [ 1121.032666] FS: 00007f97b0c0b740(0000) GS:ffff991f01d00000(0000) knlGS:0000000000000000 [ 1121.041000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1121.046880] CR2: 000000000000000a CR3: 000000025e604000 CR4: 00000000003607e0 [ 1121.054239] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1121.061631] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1121.068938] Call Trace: [ 1121.071494] [<ffffffffc0e3dba8>] psp_hw_init+0x218/0x270 [amdgpu] [ 1121.077886] [<ffffffffc0da3188>] amdgpu_device_fw_loading+0xe8/0x160 [amdgpu] [ 1121.085296] [<ffffffffc0e3b34c>] ? vega10_ih_irq_init+0x4bc/0x730 [amdgpu] [ 1121.092534] [<ffffffffc0da5c75>] amdgpu_device_init+0x1495/0x1c90 [amdgpu] [ 1121.099675] [<ffffffffc0da9cab>] amdgpu_driver_load_kms+0x8b/0x2f0 [amdgpu] [ 1121.106888] [<ffffffffc01b25cf>] drm_dev_register+0x12f/0x1d0 [drm] [ 1121.113419] [<ffffffffa4dcdfd8>] ? pci_enable_device_flags+0xe8/0x140 [ 1121.120183] [<ffffffffc0da260a>] amdgpu_pci_probe+0xca/0x170 [amdgpu] [ 1121.126919] [<ffffffffa4dcf97a>] local_pci_probe+0x4a/0xb0 [ 1121.132622] [<ffffffffa4dd10c9>] pci_device_probe+0x109/0x160 [ 1121.138607] [<ffffffffa4eb4205>] driver_probe_device+0xc5/0x3e0 [ 1121.144766] [<ffffffffa4eb4603>] __driver_attach+0x93/0xa0 [ 1121.150507] [<ffffffffa4eb4570>] ? __device_attach+0x50/0x50 [ 1121.156422] [<ffffffffa4eb1da5>] bus_for_each_dev+0x75/0xc0 [ 1121.162213] [<ffffffffa4eb3b7e>] driver_attach+0x1e/0x20 [ 1121.167771] [<ffffffffa4eb3620>] bus_add_driver+0x200/0x2d0 [ 1121.173590] [<ffffffffa4eb4c94>] driver_register+0x64/0xf0 [ 1121.179345] [<ffffffffa4dd0905>] __pci_register_driver+0xa5/0xc0 [ 1121.185593] [<ffffffffc099f000>] ? 0xffffffffc099efff [ 1121.190914] [<ffffffffc099f0a4>] amdgpu_init+0xa4/0xb0 [amdgpu] [ 1121.197101] [<ffffffffa4a0210a>] do_one_initcall+0xba/0x240 [ 1121.202901] [<ffffffffa4b1c90a>] load_module+0x271a/0x2bb0 [ 1121.208598] [<ffffffffa4dad740>] ? ddebug_proc_write+0x100/0x100 [ 1121.214894] [<ffffffffa4b1ce8f>] SyS_init_module+0xef/0x140 [ 1121.220698] [<ffffffffa518bede>] system_call_fastpath+0x25/0x2a [ 1121.226870] Code: b4 01 60 a2 00 00 31 c0 e8 83 60 33 e4 41 8b 47 08 48 8b 4d d0 48 c7 c7 b3 af 10 c1 48 69 c0 68 07 00 00 48 8b 84 01 60 a2 00 00 <48> 8b 70 08 31 c0 48 89 75 c8 e8 56 60 33 e4 48 8b 4d d0 48 c7 [ 1121.247422] RIP [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu] [ 1121.254432] RSP <ffff991ee625f950> [ 1121.258017] CR2: 000000000000000a [ 1121.261427] ---[ end trace e98b35387ede75bd ]--- Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Fixes: c5fb912653dae3f878 ("drm/amdgpu: add firmware header printing for psp fw loading (v2)") Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-14rsxx: add missed destroy_workqueue calls in removeChuhong Yuan1-0/+2
The driver misses calling destroy_workqueue in remove like what is done when probe fails. Add the missed calls to fix it. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-14iocost: check active_list of all the ancestors in iocg_activate()Jiufei Xue1-2/+6
There is a bug that checking the same active_list over and over again in iocg_activate(). The intention of the code was checking whether all the ancestors and self have already been activated. So fix it. Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost") Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-14rbd: silence bogus uninitialized warning in rbd_object_map_update_finish()Ilya Dryomov1-1/+1
Some versions of gcc (so far 6.3 and 7.4) throw a warning: drivers/block/rbd.c: In function 'rbd_object_map_callback': drivers/block/rbd.c:2124:21: warning: 'current_state' may be used uninitialized in this function [-Wmaybe-uninitialized] (current_state == OBJECT_EXISTS && state == OBJECT_EXISTS_CLEAN)) drivers/block/rbd.c:2092:23: note: 'current_state' was declared here u8 state, new_state, current_state; ^~~~~~~~~~~~~ It's bogus because all current_state accesses are guarded by has_current_state. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
2019-11-14ceph: increment/decrement dio counter on async requestsJeff Layton1-0/+4
Ceph can in some cases issue an async DIO request, in which case we can end up calling ceph_end_io_direct before the I/O is actually complete. That may allow buffered operations to proceed while DIO requests are still in flight. Fix this by incrementing the i_dio_count when issuing an async DIO request, and decrement it when tearing down the aio_req. Fixes: 321fe13c9398 ("ceph: add buffered/direct exclusionary locking for reads and writes") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-11-14ceph: take the inode lock before acquiring cap refsJeff Layton1-7/+18
Most of the time, we (or the vfs layer) takes the inode_lock and then acquires caps, but ceph_read_iter does the opposite, and that can lead to a deadlock. When there are multiple clients treading over the same data, we can end up in a situation where a reader takes caps and then tries to acquire the inode_lock. Another task holds the inode_lock and issues a request to the MDS which needs to revoke the caps, but that can't happen until the inode_lock is unwedged. Fix this by having ceph_read_iter take the inode_lock earlier, before attempting to acquire caps. Fixes: 321fe13c9398 ("ceph: add buffered/direct exclusionary locking for reads and writes") Link: https://tracker.ceph.com/issues/36348 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-11-14ALSA: usb-audio: Fix incorrect size check for processing/extension unitsTakashi Iwai1-3/+3
The recently introduced unit descriptor validation had some bug for processing and extension units, it counts a bControlSize byte twice so it expected a bigger size than it should have been. This seems resulting in a probe error on a few devices. Fix the calculation for proper checks of PU and EU. Fixes: 57f8770620e9 ("ALSA: usb-audio: More validations of descriptor units") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191114165613.7422-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-11-14Merge tag 'kbuild-fixes-v5.4-3' of ↵Linus Torvalds2-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - fix build error when compiling SPARC VDSO with CONFIG_COMPAT=y - pass correct --arch option to Sparse * tag 'kbuild-fixes-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: tell sparse about the $ARCH sparc: vdso: fix build error of vdso32
2019-11-14Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds7-39/+46
Pull RDMA fixes from Jason Gunthorpe: "Bug fixes for old bugs in the hns and hfi1 drivers: - Calculate various values in hns properly to avoid over/underflows in some cases - Fix an oops, PCI negotiation on Gen4 systems, and bugs related to retries" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/hns: Correct the value of srq_desc_size RDMA/hns: Correct the value of HNS_ROCE_HEM_CHUNK_LEN IB/hfi1: TID RDMA WRITE should not return IB_WC_RNR_RETRY_EXC_ERR IB/hfi1: Calculate flow weight based on QP MTU for TID RDMA IB/hfi1: Ensure r_tid_ack is valid before building TID RDMA ACK packet IB/hfi1: Ensure full Gen3 speed in a Gen4 system
2019-11-14KVM: x86/mmu: Take slots_lock when using kvm_mmu_zap_all_fast()Sean Christopherson1-3/+2
Acquire the per-VM slots_lock when zapping all shadow pages as part of toggling nx_huge_pages. The fast zap algorithm relies on exclusivity (via slots_lock) to identify obsolete vs. valid shadow pages, because it uses a single bit for its generation number. Holding slots_lock also obviates the need to acquire a read lock on the VM's srcu. Failing to take slots_lock when toggling nx_huge_pages allows multiple instances of kvm_mmu_zap_all_fast() to run concurrently, as the other user, KVM_SET_USER_MEMORY_REGION, does not take the global kvm_lock. (kvm_mmu_zap_all_fast() does take kvm->mmu_lock, but it can be temporarily dropped by kvm_zap_obsolete_pages(), so it is not enough to enforce exclusivity). Concurrent fast zap instances causes obsolete shadow pages to be incorrectly identified as valid due to the single bit generation number wrapping, which results in stale shadow pages being left in KVM's MMU and leads to all sorts of undesirable behavior. The bug is easily confirmed by running with CONFIG_PROVE_LOCKING and toggling nx_huge_pages via its module param. Note, until commit 4ae5acbc4936 ("KVM: x86/mmu: Take slots_lock when using kvm_mmu_zap_all_fast()", 2019-11-13) the fast zap algorithm used an ulong-sized generation instead of relying on exclusivity for correctness, but all callers except the recently added set_nx_huge_pages() needed to hold slots_lock anyways. Therefore, this patch does not have to be backported to stable kernels. Given that toggling nx_huge_pages is by no means a fast path, force it to conform to the current approach instead of reintroducing the previous generation count. Fixes: b8e8c8303ff28 ("kvm: mmu: ITLB_MULTIHIT mitigation", but NOT FOR STABLE) Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-15kbuild: tell sparse about the $ARCHLuc Van Oostenryck1-0/+3
Sparse uses the same executable for all archs and uses flags like -m64, -mbig-endian or -D__arm__ for arch-specific parameters. But Sparse also uses value from the host machine used to build Sparse as default value for the target machine. This works, of course, well for native build but can create problems when cross-compiling, like defining both '__i386__' and '__arm__' when cross-compiling for arm on a x86-64 machine. Fix this by explicitely telling sparse the target architecture. Reported-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-11-15sparc: vdso: fix build error of vdso32Masahiro Yamada1-2/+2
Since commit 54b8ae66ae1a ("kbuild: change *FLAGS_<basetarget>.o to take the path relative to $(obj)"), sparc allmodconfig fails to build as follows: CC arch/sparc/vdso/vdso32/vclock_gettime.o unrecognized e_machine 18 arch/sparc/vdso/vdso32/vclock_gettime.o arch/sparc/vdso/vdso32/vclock_gettime.o: failed The cause of the breakage is that -pg flag not being dropped. The vdso32 files are located in the vdso32/ subdirectory, but I missed to update the Makefile. I removed the meaningless CFLAGS_REMOVE_vdso-note.o since it is only effective for C file. vdso-note.o is compiled from assembly file: arch/sparc/vdso/vdso-note.S arch/sparc/vdso/vdso32/vdso-note.S Fixes: 54b8ae66ae1a ("kbuild: change *FLAGS_<basetarget>.o to take the path relative to $(obj)") Reported-by: Anatoly Pugachev <matorola@gmail.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Anatoly Pugachev <matorola@gmail.com> Acked-by: David S. Miller <davem@davemloft.net>
2019-11-14block, bfq: deschedule empty bfq_queues not referred by any processPaolo Valente1-6/+26
Since commit 3726112ec731 ("block, bfq: re-schedule empty queues if they deserve I/O plugging"), to prevent the service guarantees of a bfq_queue from being violated, the bfq_queue may be left busy, i.e., scheduled for service, even if empty (see comments in __bfq_bfqq_expire() for details). But, if no process will send requests to the bfq_queue any longer, then there is no point in keeping the bfq_queue scheduled for service. In addition, keeping the bfq_queue scheduled for service, but with no process reference any longer, may cause the bfq_queue to be freed when descheduled from service. But this is assumed to never happen, and causes a UAF if it happens. This, in turn, caused crashes [1, 2]. This commit fixes this issue by descheduling an empty bfq_queue when it remains with not process reference. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1767539 [2] https://bugzilla.kernel.org/show_bug.cgi?id=205447 Fixes: 3726112ec731 ("block, bfq: re-schedule empty queues if they deserve I/O plugging") Reported-by: Chris Evich <cevich@redhat.com> Reported-by: Patrick Dung <patdung100@gmail.com> Reported-by: Thorsten Schubert <tschubert@bafh.org> Tested-by: Thorsten Schubert <tschubert@bafh.org> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-14mmc: sdhci-of-at91: fix quirk2 overwriteEugen Hristev1-1/+1
The quirks2 are parsed and set (e.g. from DT) before the quirk for broken HS200 is set in the driver. The driver needs to enable just this flag, not rewrite the whole quirk set. Fixes: 7871aa60ae00 ("mmc: sdhci-of-at91: add quirk for broken HS200") Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-11-14ALSA: usb-audio: Fix incorrect NULL check in create_yamaha_midi_quirk()Takashi Iwai1-2/+2
The commit 60849562a5db ("ALSA: usb-audio: Fix possible NULL dereference at create_yamaha_midi_quirk()") added NULL checks in create_yamaha_midi_quirk(), but there was an overlook. The code allows one of either injd or outjd is NULL, but the second if check made returning -ENODEV if any of them is NULL. Fix it in a proper form. Fixes: 60849562a5db ("ALSA: usb-audio: Fix possible NULL dereference at create_yamaha_midi_quirk()") Reported-by: Pavel Machek <pavel@denx.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191113111259.24123-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-11-13io_uring: ensure registered buffer import returns the IO lengthJens Axboe1-1/+1
A test case was reported where two linked reads with registered buffers failed the second link always. This is because we set the expected value of a request in req->result, and if we don't get this result, then we fail the dependent links. For some reason the registered buffer import returned -ERROR/0, while the normal import returns -ERROR/length. This broke linked commands with registered buffers. Fix this by making io_import_fixed() correctly return the mapped length. Cc: stable@vger.kernel.org # v5.3 Reported-by: 李通洲 <carter.li@eoitek.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-13io_uring: Fix getting file for timeoutPavel Begunkov1-0/+1
For timeout requests io_uring tries to grab a file with specified fd, which is usually stdin/fd=0. Update io_op_needs_file() Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-13drm/i915/tgl: MOCS table updateMatt Roper1-1/+1
The bspec was just updated with a minor correction to entry 61 (it shouldn't have had the SCF bit set). v2: - Add a MOCS_ENTRY_UNUSED() and use it to declare the explicitly-reserved MOCS entries. (Lucas) - Move the warning suppression from the Makefile to a #pragma that only affects the TGL table. (Lucas) v3: - Entries 16 and 17 are identical to ICL now, so no need to explicitly adjust them (or mess with compiler warning overrides). Bspec: 45101 Fixes: 2ddf992179c4 ("drm/i915/tgl: Define MOCS entries for Tigerlake") Cc: Tomasz Lis <tomasz.lis@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Francisco Jerez <francisco.jerez.plata@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191112224757.25116-2-matthew.d.roper@intel.com Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> (cherry picked from commit bfb0e8e63d865559cc97af235aea583b7dcc235f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-13Revert "drm/i915/ehl: Update MOCS table for EHL"Matt Roper1-8/+0
This reverts commit f4071997f1de016780ec6b79c63d90cd5886ee83. These extra EHL entries won't behave as expected without a bit more work on the kernel side so let's drop them until that kernel work has had a chance to land. Userspace trying to use these new entries won't get the advantage of the new functionality these entries are meant to provide, but at least it won't misbehave. When we do add these back in the future, we'll probably want to explicitly use separate tables for ICL and EHL so that userspace software that mistakenly uses these entries (which are undefined on ICL) sees the same behavior it sees with all the other undefined entries. Cc: Francisco Jerez <francisco.jerez.plata@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: <stable@vger.kernel.org> # v5.3+ Fixes: f4071997f1de ("drm/i915/ehl: Update MOCS table for EHL") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191112224757.25116-1-matthew.d.roper@intel.com Reviewed-by: Francisco Jerez <currojerez@riseup.net> (cherry picked from commit 046091758b50a5fff79726a31c1391614a3d84c8) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-13Merge branch 'for-linus' of ↵Linus Torvalds4-19/+33
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "Fixes to the Synaptics RMI4 driver and fix for use after free in error path handling of the Cypress TTSP driver" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: cyttsp4_core - fix use after free bug Input: synaptics-rmi4 - clear IRQ enables for F54 Input: synaptics-rmi4 - remove unused result_bits mask Input: synaptics-rmi4 - do not consume more data than we have (F11, F12) Input: synaptics-rmi4 - disable the relative position IRQ in the F12 driver Input: synaptics-rmi4 - fix video buffer size