diff options
author | Gerald Schaefer <gerald.schaefer@de.ibm.com> | 2016-04-15 16:38:40 +0200 |
---|---|---|
committer | Martin Schwidefsky <schwidefsky@de.ibm.com> | 2016-04-21 09:50:09 +0200 |
commit | 723cacbd9dc79582e562c123a0bacf8bfc69e72a (patch) | |
tree | cae93823c3e8a37ef012b85b38d48e9e1bc19762 /arch/s390/mm/mmap.c | |
parent | dba599091c191d209b1499511a524ad9657c0e5a (diff) | |
download | linux-723cacbd9dc79582e562c123a0bacf8bfc69e72a.tar.bz2 |
s390/mm: fix asce_bits handling with dynamic pagetable levels
There is a race with multi-threaded applications between context switch and
pagetable upgrade. In switch_mm() a new user_asce is built from mm->pgd and
mm->context.asce_bits, w/o holding any locks. A concurrent mmap with a
pagetable upgrade on another thread in crst_table_upgrade() could already
have set new asce_bits, but not yet the new mm->pgd. This would result in a
corrupt user_asce in switch_mm(), and eventually in a kernel panic from a
translation exception.
Fix this by storing the complete asce instead of just the asce_bits, which
can then be read atomically from switch_mm(), so that it either sees the
old value or the new value, but no mixture. Both cases are OK. Having the
old value would result in a page fault on access to the higher level memory,
but the fault handler would see the new mm->pgd, if it was a valid access
after the mmap on the other thread has completed. So as worst-case scenario
we would have a page fault loop for the racing thread until the next time
slice.
Also remove dead code and simplify the upgrade/downgrade path, there are no
upgrades from 2 levels, and only downgrades from 3 levels for compat tasks.
There are also no concurrent upgrades, because the mmap_sem is held with
down_write() in do_mmap, so the flush and table checks during upgrade can
be removed.
Reported-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Diffstat (limited to 'arch/s390/mm/mmap.c')
-rw-r--r-- | arch/s390/mm/mmap.c | 6 |
1 files changed, 3 insertions, 3 deletions
diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c index 45c4daa49930..89cf09e5f168 100644 --- a/arch/s390/mm/mmap.c +++ b/arch/s390/mm/mmap.c @@ -174,7 +174,7 @@ int s390_mmap_check(unsigned long addr, unsigned long len, unsigned long flags) if (!(flags & MAP_FIXED)) addr = 0; if ((addr + len) >= TASK_SIZE) - return crst_table_upgrade(current->mm, TASK_MAX_SIZE); + return crst_table_upgrade(current->mm); return 0; } @@ -191,7 +191,7 @@ s390_get_unmapped_area(struct file *filp, unsigned long addr, return area; if (area == -ENOMEM && !is_compat_task() && TASK_SIZE < TASK_MAX_SIZE) { /* Upgrade the page table to 4 levels and retry. */ - rc = crst_table_upgrade(mm, TASK_MAX_SIZE); + rc = crst_table_upgrade(mm); if (rc) return (unsigned long) rc; area = arch_get_unmapped_area(filp, addr, len, pgoff, flags); @@ -213,7 +213,7 @@ s390_get_unmapped_area_topdown(struct file *filp, const unsigned long addr, return area; if (area == -ENOMEM && !is_compat_task() && TASK_SIZE < TASK_MAX_SIZE) { /* Upgrade the page table to 4 levels and retry. */ - rc = crst_table_upgrade(mm, TASK_MAX_SIZE); + rc = crst_table_upgrade(mm); if (rc) return (unsigned long) rc; area = arch_get_unmapped_area_topdown(filp, addr, len, |