summaryrefslogtreecommitdiffstats
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2008-04-30mm: bdi: add separate writeback accounting capabilityMiklos Szeredi1-1/+1
Add a new BDI capability flag: BDI_CAP_NO_ACCT_WB. If this flag is set, then don't update the per-bdi writeback stats from test_set_page_writeback() and test_clear_page_writeback(). Misc cleanups: - convert bdi_cap_writeback_dirty() and friends to static inline functions - create a flag that includes all three dirty/writeback related flags, since almst all users will want to have them toghether Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30pidns: make pid->level and pid_ns->level unsignedPavel Emelyanov1-1/+1
These values represent the nesting level of a namespace and pids living in it, and it's always non-negative. Turning this from int to unsigned int saves some space in pid.c (11 bytes on x86 and 64 on ia64) by letting the compiler optimize the pid_nr_ns a bit. E.g. on ia64 this removes the sign extension calls, which compiler adds to optimize access to pid->nubers[ns->level]. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30make marker_debug staticAdrian Bunk1-1/+1
With the needlessly global marker_debug being static gcc can optimize the unused code away. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30pids: sys_getpgid: fix unsafe *pid usage, s/tasklist/rcu/Oleg Nesterov1-15/+21
1. sys_getpgid() needs rcu_read_lock() to derive the pgrp _nr, even if the task is current, otherwise we can race with another thread which does sys_setpgid(). 2. Use rcu_read_lock() instead of tasklist_lock when pid != 0, make sure that we don't use the NULL pid if the task exits right after successful find_task_by_vpid(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30pids: sys_getsid: fix unsafe *pid usage, fix possible 0 instead of -ESRCHOleg Nesterov1-13/+20
1. sys_getsid() needs rcu_read_lock() to derive the session _nr, even if the task is current, otherwise we can race with another thread which does sys_setsid(). 2. The task can exit between find_task_by_vpid() and task_session_vnr(), in that unlikely case sys_getsid() returns 0 instead of -ESRCH. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30pids: __set_special_pids: use change_pid() helperOleg Nesterov1-4/+2
Use change_pid() instead of detach_pid() + attach_pid() in __set_special_pids(). This way task_session() is not NULL in between. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30pids: sys_setpgid: use change_pid() helperOleg Nesterov1-2/+1
Use change_pid() instead of detach_pid() + attach_pid() in sys_setpgid(). This way task_pgrp() is not NULL in between. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30pids: introduce change_pid() helperOleg Nesterov1-5/+16
Based on Eric W. Biederman's idea. Without tasklist_lock held task_session()/task_pgrp() can return NULL if the caller races with setprgp()/setsid() which does detach_pid() + attach_pid(). This can happen even if task == current. Intoduce the new helper, change_pid(), which should be used instead. This way the caller always sees the special pid != NULL, either old or new. Also change the prototype of attach_pid(), it always returns 0 and nobody check the returned value. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30pids: de_thread: don't clear session/pgrp pids for the old leaderOleg Nesterov1-1/+0
Based on Eric W. Biederman's idea. Unless task == current, without tasklist_lock held task_session()/task_pgrp() can return NULL if the caller races with de_thread() which switches the group leader. Change transfer_pid() to not clear old->pids[type].pid for the old leader. This means that its .pid can point to "nowhere", but this is already true for sub-threads, and the old leader is not group_leader() any longer. IOW, with or without this change we can't trust task's special pids unless it is the group leader. With this change the following code rcu_read_lock(); task = find_task_by_xxx(); do_something(task_pgrp(task), task_session(task)); rcu_read_unlock(); can't race with exec and hit the NULL pid. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30Deprecate find_task_by_pid()Pavel Emelyanov2-7/+1
There are some places that are known to operate on tasks' global pids only: * the rest_init() call (called on boot) * the kgdb's getthread * the create_kthread() (since the kthread is run in init ns) So use the find_task_by_pid_ns(..., &init_pid_ns) there and schedule the find_task_by_pid for removal. [sukadev@us.ibm.com: Fix warning in kernel/pid.c] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30Use find_task_by_vpid in taskstatsPavel Emelyanov1-3/+3
The pid to lookup a task by is passed inside taskstats code via genetlink message. Since netlink packets are now processed in the context of the sending task, this is correct to lookup the task with find_task_by_vpid() here. Besides, I fix the call to fill_pid() from taskstats_exit(), since the tsk->pid is not required in fill_pid() in this case, and the pid field on task_struct is going to be deprecated as well. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Jonathan Lim <jlim@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30free_pidmap: turn it into free_pidmap(struct upid *)Oleg Nesterov1-6/+7
The callers of free_pidmap() pass 2 members of "struct upid", we can just pass "struct upid *" instead. Shaves off 10 bytes from pid.o. Also, simplify the alloc_pid's "out_free:" error path a little bit. This way it looks more clear which subset of pid->numbers[] we are freeing. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc :Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30tty: The big operations reworkAlan Cox1-2/+2
- Operations are now a shared const function block as with most other Linux objects - Introduce wrappers for some optional functions to get consistent behaviour - Wrap put_char which used to be patched by the tty layer - Document which functions are needed/optional - Make put_char report success/fail - Cache the driver->ops pointer in the tty as tty->ops - Remove various surplus lock calls we no longer need - Remove proc_write method as noted by Alexey Dobriyan - Introduce some missing sanity checks where certain driver/ldisc combinations would oops as they didn't check needed methods were present [akpm@linux-foundation.org: fix fs/compat_ioctl.c build] [akpm@linux-foundation.org: fix isicom] [akpm@linux-foundation.org: fix arch/ia64/hp/sim/simserial.c build] [akpm@linux-foundation.org: fix kgdb] Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Cc: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30ptrace: permit ptracing of /sbin/initOleg Nesterov1-8/+0
Afaics, currently there are no kernel problems with ptracing init, it can't lose SIGNAL_UNKILLABLE flag and be killed/stopped by accident. The ability to strace/debug init can be very useful if you try to figure out why it does not work as expected. However, admin should know what he does, "gdb /sbin/init 1" stops init, it can't reap orphaned zombies or take care of /etc/inittab until continued. It is even possible to crash init (and thus the whole system) if you wish, ptracer has full control. See also the long discussion: http://marc.info/?t=120628018600001 Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Roland McGrath <roland@redhat.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30ptrace: ptrace_attach: use send_sig_info() instead force_sig_specific()Oleg Nesterov1-2/+1
Nobody can block/ignore SIGSTOP, no need to use force_sig_specific() in ptrace_attach. Use the "regular" send_sig_info(). With this patch stracing of /sbin/init doesn't clear its SIGNAL_UNKILLABLE, but not that this makes ptracing of init safe. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30ptrace: __ptrace_unlink: use the ptrace_reparented() helperOleg Nesterov1-1/+1
Currently __ptrace_unlink() checks list_empty(->ptrace_list) to figure out whether the child was reparented. Change the code to use ptrace_reparented() to make this check more explicit and consistent. No functional changes. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30ptrace: introduce ptrace_reparented() helperOleg Nesterov1-5/+4
Add another trivial helper for the sake of grep. It also auto-documents the fact that ->parent != real_parent implies ->ptrace. No functional changes. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30document de_thread() with exit_notify() connectionOleg Nesterov1-0/+1
Add a couple of small comments, it is not easy to see what this code does. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30reparent_thread: use same_thread_group()Oleg Nesterov1-1/+1
Trivial, use same_thread_group() in reparent_thread(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30ptrace: introduce task_detached() helperOleg Nesterov1-20/+25
exit.c has numerous "->exit_signal == -1" comparisons, this check is subtle and deserves a helper. Imho makes the code more parseable for humans. At least it's surely more greppable. Also, a couple of whitespace cleanups. No functional changes. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: add set_restore_sigmaskRoland McGrath2-3/+2
This adds the set_restore_sigmask() inline in <linux/thread_info.h> and replaces every set_thread_flag(TIF_RESTORE_SIGMASK) with a call to it. No change, but abstracts the details of the flag protocol from all the calls. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: allow the kernel to actually kill /sbin/initOleg Nesterov1-1/+4
Currently the buggy /sbin/init hangs if SIGSEGV/etc happens. The kernel sends the signal, init dequeues it and ignores, returns from the exception, repeats the faulting instruction, and so on forever. Imho, such a behaviour is not good. I think that the explicit loud death of the buggy /sbin/init is better than the silent hang. Change force_sig_info() to clear SIGNAL_UNKILLABLE when the task should be really killed. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: fix /sbin/init protection from unwanted signalsOleg Nesterov1-3/+6
The global init has a lot of long standing problems with the unhandled fatal signals. - The "is_global_init(current)" check in get_signal_to_deliver() protects only the main thread. Sub-thread can dequee the fatal signal and shutdown the whole thread group except the main thread. If it dequeues SIGSTOP /sbin/init will be stopped, this is not right too. Note that we can't use is_global_init(->group_leader), this breaks exec and this can't solve other problems we have. - Even if afterwards ignored, the fatal signals sets SIGNAL_GROUP_EXIT on delivery. This breaks exec, has other bad implications, and this is just wrong. Introduce the new SIGNAL_UNKILLABLE flag to fix these problems. It also helps to solve some other problems addressed by the subsequent patches. Currently we use this flag for the global init only, but it could also be used by kthreads and (perhaps) by the sub-namespace inits. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: check_kill_permission: remove tasklist_lockOleg Nesterov1-2/+0
Now that task_session() can't return a false NULL, check_kill_permission() doesn't need tasklist_lock. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: check_kill_permission: check session under tasklist_lockOleg Nesterov1-5/+19
This wasn't documented, but as Atsushi Tsuji pointed out check_kill_permission() needs tasklist_lock for task_session_nr(). I missed this fact when removed tasklist from the callers. Change check_kill_permission() to take tasklist_lock for the SIGCONT case. Re-order security checks so that we take tasklist_lock only if/when it is actually needed. This is a minimal fix for now, tasklist will be removed later. Also change the code to use task_session() instead of task_session_nr(). Also, remove the SIGCONT check from cap_task_kill(), it is bogus (and the whole function is bogus. Serge, Eric, why it is still alive?). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Atsushi Tsuji <a-tsuji@bk.jp.nec.com> Cc: Roland McGrath <roland@redhat.com> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: send_signal: be paranoid about signalfd_notify()Oleg Nesterov1-7/+1
send_signal() shouldn't call signalfd_notify() if it then fails with -EAGAIN. Harmless, just a paranoid cleanup. Also remove the comment. It is obsolete, signalfd_notify() was simplified and does a simple wakeup. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: document CLD_CONTINUED notification mechanicsOleg Nesterov1-1/+10
A couple of small comments about how CLD_CONTINUED notification works. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: fold sig_ignored() into handle_stop_signal()Oleg Nesterov1-15/+17
Rename handle_stop_signal() to prepare_signal(), make it return a boolean, and move the callsites of sig_ignored() into it. No functional changes for now. But it would be nice to factor out the "should we drop this signal" checks as much as possible, before we try to fix the bugs with the sub-namespace init's signals (actually the global /sbin/init has some problems with signals too). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: cleanup the usage of print_fatal_signal()Oleg Nesterov1-2/+3
Move the callsite of print_fatal_signal() down, under "if (sig_kernel_coredump(signr))", so we don't need to check signr != SIGKILL. We are only interested in the sig_kernel_coredump() signals anyway, and due to the previous changes we almost never can see other fatal signals here except SIGKILL. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: handle_stop_signal: don't worry about SIGKILLOleg Nesterov1-6/+0
handle_stop_signal() clears SIGNAL_STOP_DEQUEUED when sig == SIGKILL. Remove this nasty special case. It was needed to prevent the race with group stop and exit caused by thread-specific SIGKILL. Now that we use complete_signal() for private signals too this is not needed, complete_signal() will notice SIGKILL and abort the soon-to-begin group stop. Except: the target thread is dead (has PF_EXITING). But in that case we should not just clear SIGNAL_STOP_DEQUEUED and nothing more. We should either kill the whole thread group, or silently ignore the signal. I suspect we are not right wrt zombie leaders, but this is another issue which and should be fixed separately. Note that this check can't abort the group stop if it was already started/finished, this check only adds a subtle side effect if we race with the thread which has already dequeued sig_kernel_stop() signal and temporary released ->siglock. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: join send_sigqueue() with send_group_sigqueue()Oleg Nesterov2-18/+3
We export send_sigqueue() and send_group_sigqueue() for the only user, posix_timer_event(). This is a bit silly, because both are just trivial helpers on top of do_send_sigqueue() and because the we pass the unused .si_signo parameter. Kill them both, rename do_send_sigqueue() to send_sigqueue(), and export it. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: unify send_sigqueue/send_group_sigqueue completelyOleg Nesterov1-37/+21
Suggested by Pavel Emelyanov. send_sigqueue/send_group_sigqueue are only differ in how they lock ->siglock. Unify them. send_group_sigqueue() uses spin_lock() because it knows the task can't exit, but in that case lock_task_sighand() can't fail and doesn't hurt. Note that the "sig" argument is ignored, it is always equal to ->si_signo. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: fold complete_signal() into send_signal/do_send_sigqueuePavel Emelyanov1-35/+11
Factor out complete_signal() callsites. This change completely unifies the helpers sending the specific/group signals. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: use __group_complete_signal() for the specific signals tooOleg Nesterov1-9/+6
Based on Pavel Emelyanov's suggestion. Rename __group_complete_signal() to complete_signal() and use it to process the specific signals too. To do this we simply add the "int group" argument. This allows us to greatly simply the signal-sending code and adds a useful behaviour change. We can avoid the unneeded wakeups for the private signals because wants_signal() is more clever than sigismember(blocked), but more importantly we now take into account the fatal specific signals too. The latter allows us to kill some subtle checks in handle_stop_signal() and makes the specific/group signal's behaviour more consistent. For example, currently sigtimedwait(FATAL_SIGNAL) behaves differently depending on was the signal sent by kill() or tkill() if the signal was not blocked. And. This allows us to tweak/fix the behaviour when the specific signal is sent to the dying/dead ->group_leader. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: change send_signal/do_send_sigqueue to take "boolean group" parameterOleg Nesterov1-9/+16
send_signal() is used either with ->pending or with ->signal->shared_pending. Change it to take "int group" instead, this argument will be re-used later. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: move the definition of __group_complete_signal() upOleg Nesterov1-96/+96
Move the unchanged definition of __group_complete_signal() so that send_signal can see it. To simplify the reading of the next patches. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: microoptimize the usage of ->curr_targetOleg Nesterov2-5/+1
Suggested by Roland McGrath. Initialize signal->curr_target in copy_signal(). This way ->curr_target is never == NULL, we can kill the check in __group_complete_signal's hot path. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: send_sig_info: don't take tasklist_lockOleg Nesterov1-10/+1
The comment in send_sig_info() is wrong, tasklist_lock can't help. The caller must ensure the task can't go away, otherwise ->sighand can be NULL even before we take the lock. p->sighand could be changed by exec(), but I can't imagine how it is possible to prevent exit(), but not exec(). Since the things seem to work, I assume all callers are correct. However, drm_vbl_send_signals() looks broken. block_all_signals() which is solely used by drm is definitely broken. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: do_tkill: don't use tasklist_lockOleg Nesterov1-5/+9
Convert do_tkill() to use rcu_read_lock() + lock_task_sighand() to avoid taking tasklist lock. Note that we don't return an error if lock_task_sighand() fails, we pretend the task dies after receiving the signal. Otherwise, we should fight with the nasty races with mt-exec without having any advantage. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: move handle_stop_signal() into send_signal()Oleg Nesterov1-8/+3
Move handle_stop_signal() into send_signal(). This factors out a couple of callsites and allows us to do further unifications. Also, with this change specific_send_sig_info() does handle_stop_signal(). Not that this is really important, we never send STOP/CONT via send_sig() and friends, but still this looks more consistent. The only (afaics) special case is get_signal_to_deliver(). If the traced task dequeues SIGCONT, it can re-send it to itself after ptrace_stop() if the signal was blocked by debugger. In that case handle_stop_signal() is unnecessary, but hopefully not a problem. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: send_group_sigqueue: don't take tasklist_lockOleg Nesterov1-2/+1
handle_stop_signal() was changed, now send_group_sigqueue() doesn't need tasklist_lock. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: __group_complete_signal: cache the value of p->signalOleg Nesterov1-8/+9
Cosmetic, cache p->signal to make the code a bit more readable. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: send_sigqueue: don't forget about handle_stop_signal()Oleg Nesterov1-2/+3
send_group_sigqueue() calls handle_stop_signal(), send_sigqueue() doesn't. This is not consistent and in fact I'd say this is (minor) bug. Move handle_stop_signal() from send_group_sigqueue() to do_send_sigqueue(), the latter is called by send_sigqueue() too. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: send_sigqueue: don't take rcu lockOleg Nesterov1-4/+0
lock_task_sighand() was changed, send_sigqueue() doesn't need rcu_read_lock() any longer. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30get_signal_to_deliver: use the cached ->signal/sighand valuesOleg Nesterov1-15/+15
Cache the values of current->signal/sighand. Shrinks .text a bit and makes the code more readable. Also, remove "sigset_t *mask", it is pointless because in fact we save the constant offset. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30handle_stop_signal: use the cached p->signal valueOleg Nesterov1-15/+13
Cache the value of p->signal, and change the code to use while_each_thread() helper. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30handle_stop_signal: unify partial/full stop handlingOleg Nesterov1-26/+19
Now that handle_stop_signal() doesn't drop ->siglock, we can't see both ->group_stop_count && SIGNAL_STOP_STOPPED. Merge two "if" branches. As Roland pointed out, we never actually needed 2 do_notify_parent_cldstop() calls. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30kill_pid_info: don't take now unneeded tasklist_lockOleg Nesterov1-6/+1
Previously handle_stop_signal(SIGCONT) could drop ->siglock. That is why kill_pid_info(SIGCONT) takes tasklist_lock to make sure the target task can't go away after unlock. Not needed now. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: re-assign CLD_CONTINUED notification from the sender to recieverOleg Nesterov1-10/+19
Based on discussion with Jiri and Roland. In short: currently handle_stop_signal(SIGCONT, p) sends the notification to p->parent, with this patch p itself notifies its parent when it becomes running. handle_stop_signal(SIGCONT) has to drop ->siglock temporary in order to notify the parent with do_notify_parent_cldstop(). This leads to multiple problems: - as Jiri Kosina pointed out, the stopped task can resume without actually seeing SIGCONT which may have a handler. - we race with another sig_kernel_stop() signal which may come in that window. - we race with sig_fatal() signals which may set SIGNAL_GROUP_EXIT in that window. - we can't avoid taking tasklist_lock() while sending SIGCONT. With this patch handle_stop_signal() just sets the new SIGNAL_CLD_CONTINUED flag in p->signal->flags and returns. The notification is sent by the first task which returns from finish_stop() (there should be at least one) or any other signalled thread from get_signal_to_deliver(). This is a user-visible change. Say, currently kill(SIGCONT, stopped_child) can't return without seeing SIGCHLD, with this patch SIGCHLD can be delayed unpredictably. Another difference is that if the child is ptraced by another process, CLD_CONTINUED may be delivered to ->real_parent after ptrace_detach() while currently it always goes to the tracer which doesn't actually need this notification. Hopefully not a problem. The patch asks for the futher obvious cleanups, I'll send them separately. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: cleanup security_task_kill() usage/implementationOleg Nesterov1-13/+14
Every implementation of ->task_kill() does nothing when the signal comes from the kernel. This is correct, but means that check_kill_permission() should call security_task_kill() only for SI_FROMUSER() case, and we can remove the same check from ->task_kill() implementations. (sadly, check_kill_permission() is the last user of signal->session/__session but we can't s/task_session_nr/task_session/ here). NOTE: Eric W. Biederman pointed out cap_task_kill() should die, and I think he is very right. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Roland McGrath <roland@redhat.com> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: David Quigley <dpquigl@tycho.nsa.gov> Cc: Eric Paris <eparis@redhat.com> Cc: Harald Welte <laforge@gnumonks.org> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>