don't dump the threads that had been already exiting when zapped.

Coredump logics needs to report not only the registers of the dumping thread, but (since 2.5.43) those of other threads getting killed. Doing that might require extra state saved on the stack in asm glue at kernel entry; signal delivery logics does that (we need to be able to save sigcontext there, at the very least) and so does seccomp. That covers all callers of do_coredump(). Secondary threads get hit with SIGKILL and caught as soon as they reach exit_mm(), which normally happens in signal delivery, so those are also fine most of the time. Unfortunately, it is possible to end up with secondary zapped when it has already entered exit(2) (or, worse yet, is oopsing). In those cases we reach exit_mm() when mm->core_state is already set, but the stack contents is not what we would have in signal delivery. At least on two architectures (alpha and m68k) it leads to infoleaks - we end up with a chunk of kernel stack written into coredump, with the contents consisting of normal C stack frames of the call chain leading to exit_mm() instead of the expected copy of userland registers. In case of alpha we leak 312 bytes of stack. Other architectures (including the regset-using ones) might have similar problems - the normal user of regsets is ptrace and the state of tracee at the time of such calls is special in the same way signal delivery is. Note that had the zapper gotten to the exiting thread slightly later, it wouldn't have been included into coredump anyway - we skip the threads that have already cleared their ->mm. So let's pretend that zapper always loses the race. IOW, have exit_mm() only insert into the dumper list if we'd gotten there from handling a fatal signal[*] As the result, the callers of do_exit() that have *not* gone through get_signal() are not seen by coredump logics as secondary threads. Which excludes voluntary exit()/oopsen/traps/etc. The dumper thread itself is unaffected by that, so seccomp is fine. [*] originally I intended to add a new flag in tsk->flags, but ebiederman pointed out that PF_SIGNALED is already doing just what we need. Cc: stable@vger.kernel.org Fixes: d89f3847def4 ("[PATCH] thread-aware coredumps, 2.5.43-C3") History-tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
author: Al Viro <viro@zeniv.linux.org.uk> 2020-10-28 16:39:49 -0400
committer: Al Viro <viro@zeniv.linux.org.uk> 2020-10-28 16:39:49 -0400
commit: 77f6ab8b7768cf5e6bdd0e72499270a0671506ee (patch)
tree: 44881a354e6c72b89a804096f2fe8d67314ae125
parent: 3650b228f83adda7e5ee532e2b90429c03f7b9ec (diff)
download: linux-77f6ab8b7768cf5e6bdd0e72499270a0671506ee.tar.bz2
1 files changed, 4 insertions, 1 deletions
diff --git a/kernel/exit.c b/kernel/exit.c
index 87a2d515de0d..1f236ed375f8 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -454,7 +454,10 @@ static void exit_mm(void)
 		mmap_read_unlock(mm);
 
 		self.task = current;
-		self.next = xchg(&core_state->dumper.next, &self);
+		if (self.task->flags & PF_SIGNALED)
+			self.next = xchg(&core_state->dumper.next, &self);
+		else
+			self.task = NULL;
 		/*
 		 * Implies mb(), the result of xchg() must be visible
 		 * to core_state->dumper.
author	Al Viro <viro@zeniv.linux.org.uk>	2020-10-28 16:39:49 -0400
committer	Al Viro <viro@zeniv.linux.org.uk>	2020-10-28 16:39:49 -0400
commit	77f6ab8b7768cf5e6bdd0e72499270a0671506ee (patch)
tree	44881a354e6c72b89a804096f2fe8d67314ae125
parent	3650b228f83adda7e5ee532e2b90429c03f7b9ec (diff)
download	linux-77f6ab8b7768cf5e6bdd0e72499270a0671506ee.tar.bz2