diff options
author | Huang Ying <ying.huang@intel.com> | 2018-04-05 16:23:20 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-04-05 21:36:25 -0700 |
commit | e92bb4dd9673945179b1fc738c9817dd91bfb629 (patch) | |
tree | 618a868ab40bcd5e2f5c2830341b47f6cffa6afe /mm/vmscan.c | |
parent | 5ad3509364a86461188184e2ae7ca517dca6f389 (diff) | |
download | linux-e92bb4dd9673945179b1fc738c9817dd91bfb629.tar.bz2 |
mm: fix races between address_space dereference and free in page_evicatable
When page_mapping() is called and the mapping is dereferenced in
page_evicatable() through shrink_active_list(), it is possible for the
inode to be truncated and the embedded address space to be freed at the
same time. This may lead to the following race.
CPU1 CPU2
truncate(inode) shrink_active_list()
... page_evictable(page)
truncate_inode_page(mapping, page);
delete_from_page_cache(page)
spin_lock_irqsave(&mapping->tree_lock, flags);
__delete_from_page_cache(page, NULL)
page_cache_tree_delete(..)
... mapping = page_mapping(page);
page->mapping = NULL;
...
spin_unlock_irqrestore(&mapping->tree_lock, flags);
page_cache_free_page(mapping, page)
put_page(page)
if (put_page_testzero(page)) -> false
- inode now has no pages and can be freed including embedded address_space
mapping_unevictable(mapping)
test_bit(AS_UNEVICTABLE, &mapping->flags);
- we've dereferenced mapping which is potentially already free.
Similar race exists between swap cache freeing and page_evicatable()
too.
The address_space in inode and swap cache will be freed after a RCU
grace period. So the races are fixed via enclosing the page_mapping()
and address_space usage in rcu_read_lock/unlock(). Some comments are
added in code to make it clear what is protected by the RCU read lock.
Link: http://lkml.kernel.org/r/20180212081227.1940-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/vmscan.c')
-rw-r--r-- | mm/vmscan.c | 8 |
1 files changed, 7 insertions, 1 deletions
diff --git a/mm/vmscan.c b/mm/vmscan.c index cd5dc3faaa57..ca566640c448 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3877,7 +3877,13 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order) */ int page_evictable(struct page *page) { - return !mapping_unevictable(page_mapping(page)) && !PageMlocked(page); + int ret; + + /* Prevent address_space of inode and swap cache from being freed */ + rcu_read_lock(); + ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page); + rcu_read_unlock(); + return ret; } #ifdef CONFIG_SHMEM |