From 09cbfeaf1a5a67bfb3201e0c83c810cecb2efa5a Mon Sep 17 00:00:00 2001 From: "Kirill A. Shutemov" Date: Fri, 1 Apr 2016 15:29:47 +0300 Subject: mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> ; - >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> ; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov Acked-by: Michal Hocko Signed-off-by: Linus Torvalds --- mm/page-writeback.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) (limited to 'mm/page-writeback.c') diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 11ff8f758631..999792d35ccc 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2176,8 +2176,8 @@ int write_cache_pages(struct address_space *mapping, cycled = 0; end = -1; } else { - index = wbc->range_start >> PAGE_CACHE_SHIFT; - end = wbc->range_end >> PAGE_CACHE_SHIFT; + index = wbc->range_start >> PAGE_SHIFT; + end = wbc->range_end >> PAGE_SHIFT; if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX) range_whole = 1; cycled = 1; /* ignore range_cyclic tests */ @@ -2382,14 +2382,14 @@ int write_one_page(struct page *page, int wait) wait_on_page_writeback(page); if (clear_page_dirty_for_io(page)) { - page_cache_get(page); + get_page(page); ret = mapping->a_ops->writepage(page, &wbc); if (ret == 0 && wait) { wait_on_page_writeback(page); if (PageError(page)) ret = -EIO; } - page_cache_release(page); + put_page(page); } else { unlock_page(page); } @@ -2431,7 +2431,7 @@ void account_page_dirtied(struct page *page, struct address_space *mapping) __inc_zone_page_state(page, NR_DIRTIED); __inc_wb_stat(wb, WB_RECLAIMABLE); __inc_wb_stat(wb, WB_DIRTIED); - task_io_account_write(PAGE_CACHE_SIZE); + task_io_account_write(PAGE_SIZE); current->nr_dirtied++; this_cpu_inc(bdp_ratelimits); } @@ -2450,7 +2450,7 @@ void account_page_cleaned(struct page *page, struct address_space *mapping, mem_cgroup_dec_page_stat(page, MEM_CGROUP_STAT_DIRTY); dec_zone_page_state(page, NR_FILE_DIRTY); dec_wb_stat(wb, WB_RECLAIMABLE); - task_io_account_cancelled_write(PAGE_CACHE_SIZE); + task_io_account_cancelled_write(PAGE_SIZE); } } -- cgit v1.2.3 From 74d369443325063a5f0260e63971decb950fd8fa Mon Sep 17 00:00:00 2001 From: Howard Cochran Date: Thu, 10 Mar 2016 01:12:39 -0500 Subject: writeback: Fix performance regression in wb_over_bg_thresh() Commit 947e9762a8dd ("writeback: update wb_over_bg_thresh() to use wb_domain aware operations") unintentionally changed this function's meaning from "are there more dirty pages than the background writeback threshold" to "are there more dirty pages than the writeback threshold". The background writeback threshold is typically half of the writeback threshold, so this had the effect of raising the number of dirty pages required to cause a writeback worker to perform background writeout. This can cause a very severe performance regression when a BDI uses BDI_CAP_STRICTLIMIT because balance_dirty_pages() and the writeback worker can now disagree on whether writeback should be initiated. For example, in a system having 1GB of RAM, a single spinning disk, and a "pass-through" FUSE filesystem mounted over the disk, application code mmapped a 128MB file on the disk and was randomly dirtying pages in that mapping. Because FUSE uses strictlimit and has a default max_ratio of only 1%, in balance_dirty_pages, thresh is ~200, bg_thresh is ~100, and the dirty_freerun_ceiling is the average of those, ~150. So, it pauses the dirtying processes when we have 151 dirty pages and wakes up a background writeback worker. But the worker tests the wrong threshold (200 instead of 100), so it does not initiate writeback and just returns. Thus, balance_dirty_pages keeps looping, sleeping and then waking up the worker who will do nothing. It remains stuck in this state until the few dirty pages that we have finally expire and we write them back for that reason. Then the whole process repeats, resulting in near-zero throughput through the FUSE BDI. The fix is to call the parameterized variant of wb_calc_thresh, so that the worker will do writeback if the bg_thresh is exceeded which was the behavior before the referenced commit. Fixes: 947e9762a8dd ("writeback: update wb_over_bg_thresh() to use wb_domain aware operations") Signed-off-by: Howard Cochran Acked-by: Tejun Heo Signed-off-by: Miklos Szeredi Cc: # v4.2+ Tested-by Sedat Dilek Signed-off-by: Jens Axboe --- mm/page-writeback.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'mm/page-writeback.c') diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 11ff8f758631..eeaa431ee4ec 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -1910,7 +1910,8 @@ bool wb_over_bg_thresh(struct bdi_writeback *wb) if (gdtc->dirty > gdtc->bg_thresh) return true; - if (wb_stat(wb, WB_RECLAIMABLE) > __wb_calc_thresh(gdtc)) + if (wb_stat(wb, WB_RECLAIMABLE) > + wb_calc_thresh(gdtc->wb, gdtc->bg_thresh)) return true; if (mdtc) { @@ -1924,7 +1925,8 @@ bool wb_over_bg_thresh(struct bdi_writeback *wb) if (mdtc->dirty > mdtc->bg_thresh) return true; - if (wb_stat(wb, WB_RECLAIMABLE) > __wb_calc_thresh(mdtc)) + if (wb_stat(wb, WB_RECLAIMABLE) > + wb_calc_thresh(mdtc->wb, mdtc->bg_thresh)) return true; } -- cgit v1.2.3