summaryrefslogtreecommitdiffstats
path: root/mm/memory.c
diff options
context:
space:
mode:
authorJohannes Weiner <hannes@cmpxchg.org>2020-06-03 16:02:17 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2020-06-03 20:09:48 -0700
commit4c6355b25e8bb83c3cb455f532b7542089699d98 (patch)
tree796e786492f8d00015c92b3390792bb3b1553f1b /mm/memory.c
parent2d1c498072de69e2857b849ee197ba2aa7de53a3 (diff)
downloadlinux-4c6355b25e8bb83c3cb455f532b7542089699d98.tar.bz2
mm: memcontrol: charge swapin pages on instantiation
Right now, users that are otherwise memory controlled can easily escape their containment and allocate significant amounts of memory that they're not being charged for. That's because swap readahead pages are not being charged until somebody actually faults them into their page table. This can be exploited with MADV_WILLNEED, which triggers arbitrary readahead allocations without charging the pages. There are additional problems with the delayed charging of swap pages: 1. To implement refault/workingset detection for anonymous pages, we need to have a target LRU available at swapin time, but the LRU is not determinable until the page has been charged. 2. To implement per-cgroup LRU locking, we need page->mem_cgroup to be stable when the page is isolated from the LRU; otherwise, the locks change under us. But swapcache gets charged after it's already on the LRU, and even if we cannot isolate it ourselves (since charging is not exactly optional). The previous patch ensured we always maintain cgroup ownership records for swap pages. This patch moves the swapcache charging point from the fault handler to swapin time to fix all of the above problems. v2: simplify swapin error checking (Joonsoo) [hughd@google.com: fix livelock in __read_swap_cache_async()] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2005212246080.8458@eggly.anvils Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com> Cc: Hugh Dickins <hughd@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Alex Shi <alex.shi@linux.alibaba.com> Link: http://lkml.kernel.org/r/20200508183105.225460-17-hannes@cmpxchg.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/memory.c')
-rw-r--r--mm/memory.c15
1 files changed, 11 insertions, 4 deletions
diff --git a/mm/memory.c b/mm/memory.c
index 27e225bef5d0..9c886e4207a2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3125,9 +3125,20 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
vmf->address);
if (page) {
+ int err;
+
__SetPageLocked(page);
__SetPageSwapBacked(page);
set_page_private(page, entry.val);
+
+ /* Tell memcg to use swap ownership records */
+ SetPageSwapCache(page);
+ err = mem_cgroup_charge(page, vma->vm_mm,
+ GFP_KERNEL, false);
+ ClearPageSwapCache(page);
+ if (err)
+ goto out_page;
+
lru_cache_add_anon(page);
swap_readpage(page, true);
}
@@ -3189,10 +3200,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
goto out_page;
}
- if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL, true)) {
- ret = VM_FAULT_OOM;
- goto out_page;
- }
cgroup_throttle_swaprate(page, GFP_KERNEL);
/*