From ddeaab32a89f04b7e2a2df8771583a719c4ac6b7 Mon Sep 17 00:00:00 2001 From: Mike Kravetz Date: Tue, 8 Jan 2019 15:23:36 -0800 Subject: hugetlbfs: revert "use i_mmap_rwsem for more pmd sharing synchronization" This reverts b43a9990055958e70347c56f90ea2ae32c67334c The reverted commit caused issues with migration and poisoning of anon huge pages. The LTP move_pages12 test will cause an "unable to handle kernel NULL pointer" BUG would occur with stack similar to: RIP: 0010:down_write+0x1b/0x40 Call Trace: migrate_pages+0x81f/0xb90 __ia32_compat_sys_migrate_pages+0x190/0x190 do_move_pages_to_node.isra.53.part.54+0x2a/0x50 kernel_move_pages+0x566/0x7b0 __x64_sys_move_pages+0x24/0x30 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The purpose of the reverted patch was to fix some long existing races with huge pmd sharing. It used i_mmap_rwsem for this purpose with the idea that this could also be used to address truncate/page fault races with another patch. Further analysis has determined that i_mmap_rwsem can not be used to address all these hugetlbfs synchronization issues. Therefore, revert this patch while working an another approach to the underlying issues. Link: http://lkml.kernel.org/r/20190103235452.29335-2-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz Reported-by: Jan Stancek Cc: Michal Hocko Cc: Hugh Dickins Cc: Naoya Horiguchi Cc: "Aneesh Kumar K . V" Cc: Andrea Arcangeli Cc: "Kirill A . Shutemov" Cc: Davidlohr Bueso Cc: Prakash Sangappa Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/rmap.c | 4 ---- 1 file changed, 4 deletions(-) (limited to 'mm/rmap.c') diff --git a/mm/rmap.c b/mm/rmap.c index 21a26cf51114..68a1a5b869a5 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -25,7 +25,6 @@ * page->flags PG_locked (lock_page) * hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share) * mapping->i_mmap_rwsem - * hugetlb_fault_mutex (hugetlbfs specific page fault mutex) * anon_vma->rwsem * mm->page_table_lock or pte_lock * zone_lru_lock (in mark_page_accessed, isolate_lru_page) @@ -1379,9 +1378,6 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, /* * If sharing is possible, start and end will be adjusted * accordingly. - * - * If called for a huge page, caller must hold i_mmap_rwsem - * in write mode as it is possible to call huge_pmd_unshare. */ adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end); -- cgit v1.2.3 From ba422731316dde1e22dcc84b83c7349dc0ce1c3c Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Wed, 9 Jan 2019 16:51:17 -0800 Subject: mm/mmu_notifier: mm/rmap.c: Fix a mmu_notifier range bug in try_to_unmap_one MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The conversion to use a structure for mmu_notifier_invalidate_range_*() unintentionally changed the usage in try_to_unmap_one() to init the 'struct mmu_notifier_range' with vma->vm_start instead of @address, i.e. it invalidates the wrong address range. Revert to the correct address range. Manifests as KVM use-after-free WARNINGs and subsequent "BUG: Bad page state in process X" errors when reclaiming from a KVM guest due to KVM removing the wrong pages from its own mappings. Reported-by: leozinho29_eu@hotmail.com Reported-by: Mike Galbraith Reported-and-tested-by: Adam Borowski Reviewed-by: Jérôme Glisse Reviewed-by: Pankaj gupta Cc: Christian König Cc: Jan Kara Cc: Matthew Wilcox Cc: Ross Zwisler Cc: Dan Williams Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Michal Hocko Cc: Felix Kuehling Cc: Ralph Campbell Cc: John Hubbard Cc: Andrew Morton Fixes: ac46d4f3c432 ("mm/mmu_notifier: use structure for invalidate_range_start/end calls v2") Signed-off-by: Sean Christopherson Signed-off-by: Linus Torvalds --- mm/rmap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'mm/rmap.c') diff --git a/mm/rmap.c b/mm/rmap.c index 68a1a5b869a5..0454ecc29537 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1371,8 +1371,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, * Note that the page can not be free in this function as call of * try_to_unmap() must hold a reference on the page. */ - mmu_notifier_range_init(&range, vma->vm_mm, vma->vm_start, - min(vma->vm_end, vma->vm_start + + mmu_notifier_range_init(&range, vma->vm_mm, address, + min(vma->vm_end, address + (PAGE_SIZE << compound_order(page)))); if (PageHuge(page)) { /* -- cgit v1.2.3