linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2011-06-22	ptrace: kill trivial tracehooks	Tejun Heo	7	-171/+11
	At this point, tracehooks aren't useful to mainline kernel and mostly just add an extra layer of obfuscation. Although they have comments, without actual in-kernel users, it is difficult to tell what are their assumptions and they're actually trying to achieve. To mainline kernel, they just aren't worth keeping around. This patch kills the following trivial tracehooks. * Ones testing whether task is ptraced. Replace with ->ptrace test. tracehook_expect_breakpoints() tracehook_consider_ignored_signal() tracehook_consider_fatal_signal() * ptrace_event() wrappers. Call directly. tracehook_report_exec() tracehook_report_exit() tracehook_report_vfork_done() * ptrace_release_task() wrapper. Call directly. tracehook_finish_release_task() * noop tracehook_prepare_release_task() tracehook_report_death() This doesn't introduce any behavior change. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22	ptrace: move SIGTRAP on exec(2) logic to ptrace_event()	Tejun Heo	2	-11/+9
	Move SIGTRAP on exec(2) logic from tracehook_report_exec() to ptrace_event(). This is part of changes to make ptrace_event() smarter and handle ptrace event related details in one place. This doesn't introduce any behavior change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22	ptrace: introduce ptrace_event_enabled() and simplify ptrace_event() and ↵	Tejun Heo	2	-27/+45
	tracehook_prepare_clone() This patch implements ptrace_event_enabled() which tests whether a given PTRACE_EVENT_* is enabled and use it to simplify ptrace_event() and tracehook_prepare_clone(). PT_EVENT_FLAG() macro is added which calculates PT_TRACE_* flag from PTRACE_EVENT_. This is used to define PT_TRACE_ flags and by ptrace_event_enabled() to find the matching flag. This is used to make ptrace_event() and tracehook_prepare_clone() simpler. * ptrace_event() callers were responsible for providing mask to test whether the event was enabled. This patch implements ptrace_event_enabled() and make ptrace_event() drop @mask and determine whether the event is enabled from @event. Note that @event is constant and this conversion doesn't add runtime overhead. All conversions except tracehook_report_clone_complete() are trivial. tracehook_report_clone_complete() used to use 0 for @mask (always enabled) but now tests whether the specified event is enabled. This doesn't cause any behavior difference as it's guaranteed that the event specified by @trace is enabled. * tracehook_prepare_clone() now only determines which event is applicable and use ptrace_event_enabled() for enable test. This doesn't introduce any behavior change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22	ptrace: kill task_ptrace()	Tejun Heo	5	-32/+20
	task_ptrace(task) simply dereferences task->ptrace and isn't even used consistently only adding confusion. Kill it and directly access ->ptrace instead. This doesn't introduce any behavior change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-16	ptrace: implement PTRACE_LISTEN	Tejun Heo	5	-8/+53
	The previous patch implemented async notification for ptrace but it only worked while trace is running. This patch introduces PTRACE_LISTEN which is suggested by Oleg Nestrov. It's allowed iff tracee is in STOP trap and puts tracee into quasi-running state - tracee never really runs but wait(2) and ptrace(2) consider it to be running. While ptracer is listening, tracee is allowed to re-enter STOP to notify an async event. Listening state is cleared on the first notification. Ptracer can also clear it by issuing INTERRUPT - tracee will re-trap into STOP with listening state cleared. This allows ptracer to monitor group stop state without running tracee - use INTERRUPT to put tracee into STOP trap, issue LISTEN and then wait(2) to wait for the next group stop event. When it happens, PTRACE_GETSIGINFO provides information to determine the current state. Test program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_INTERRUPT 0x4207 #define PTRACE_LISTEN 0x4208 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts1s = { .tv_sec = 1 }; int main(int argc, char *argv) { pid_t tracee, tracer; int i; tracee = fork(); if (!tracee) while (1) pause(); tracer = fork(); if (!tracer) { siginfo_t si; ptrace(PTRACE_SEIZE, tracee, NULL, (void )(unsigned long)PTRACE_SEIZE_DEVEL); ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL); repeat: waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si); if (!si.si_code) { printf("tracer: SIG %d\n", si.si_signo); ptrace(PTRACE_CONT, tracee, NULL, (void *)(unsigned long)si.si_signo); goto repeat; } printf("tracer: stopped=%d signo=%d\n", si.si_signo != SIGTRAP, si.si_signo); if (si.si_signo != SIGTRAP) ptrace(PTRACE_LISTEN, tracee, NULL, NULL); else ptrace(PTRACE_CONT, tracee, NULL, NULL); goto repeat; } for (i = 0; i < 3; i++) { nanosleep(&ts1s, NULL); printf("mother: SIGSTOP\n"); kill(tracee, SIGSTOP); nanosleep(&ts1s, NULL); printf("mother: SIGCONT\n"); kill(tracee, SIGCONT); } nanosleep(&ts1s, NULL); kill(tracer, SIGKILL); kill(tracee, SIGKILL); return 0; } This is identical to the program to test TRAP_NOTIFY except that tracee is PTRACE_LISTEN'd instead of PTRACE_CONT'd when group stopped. This allows ptracer to monitor when group stop ends without running tracee. # ./test-listen tracer: stopped=0 signo=5 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 -v2: Moved JOBCTL_LISTENING check in wait_task_stopped() into task_stopped_code() as suggested by Oleg. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
2011-06-16	ptrace: implement TRAP_NOTIFY and use it for group stop events	Tejun Heo	2	-4/+38
	Currently there's no way for ptracer to find out whether group stop finished other than polling with INTERRUPT - GETSIGINFO - CONT sequence. This patch implements group stop notification for ptracer using STOP traps. When group stop state of a seized tracee changes, JOBCTL_TRAP_NOTIFY is set, which schedules a STOP trap which is sticky - it isn't cleared by other traps and at least one STOP trap will happen eventually. STOP trap is synchronization point for event notification and the tracer can determine the current group stop state by looking at the signal number portion of exit code (si_status from waitid(2) or si_code from PTRACE_GETSIGINFO). Notifications are generated both on start and end of group stops but, because group stop participation always happens before STOP trap, this doesn't cause an extra trap while tracee is participating in group stop. The symmetry will be useful later. Note that this notification works iff tracee is not trapped. Currently there is no way to be notified of group stop state changes while tracee is trapped. This will be addressed by a later patch. An example program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_INTERRUPT 0x4207 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts1s = { .tv_sec = 1 }; int main(int argc, char *argv) { pid_t tracee, tracer; int i; tracee = fork(); if (!tracee) while (1) pause(); tracer = fork(); if (!tracer) { siginfo_t si; ptrace(PTRACE_SEIZE, tracee, NULL, (void )(unsigned long)PTRACE_SEIZE_DEVEL); ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL); repeat: waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si); if (!si.si_code) { printf("tracer: SIG %d\n", si.si_signo); ptrace(PTRACE_CONT, tracee, NULL, (void *)(unsigned long)si.si_signo); goto repeat; } printf("tracer: stopped=%d signo=%d\n", si.si_signo != SIGTRAP, si.si_signo); ptrace(PTRACE_CONT, tracee, NULL, NULL); goto repeat; } for (i = 0; i < 3; i++) { nanosleep(&ts1s, NULL); printf("mother: SIGSTOP\n"); kill(tracee, SIGSTOP); nanosleep(&ts1s, NULL); printf("mother: SIGCONT\n"); kill(tracee, SIGCONT); } nanosleep(&ts1s, NULL); kill(tracer, SIGKILL); kill(tracee, SIGKILL); return 0; } In the above program, tracer keeps tracee running and gets notification of each group stop state changes. # ./test-notify tracer: stopped=0 signo=5 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 mother: SIGSTOP tracer: SIG 19 tracer: stopped=1 signo=19 mother: SIGCONT tracer: stopped=0 signo=5 tracer: SIG 18 Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
2011-06-16	ptrace: implement PTRACE_INTERRUPT	Tejun Heo	2	-2/+28
	Currently, there's no way to trap a running ptracee short of sending a signal which has various side effects. This patch implements PTRACE_INTERRUPT which traps ptracee without any signal or job control related side effect. The implementation is almost trivial. It uses the group stop trap - SIGTRAP \| PTRACE_EVENT_STOP << 8. A new trap flag JOBCTL_TRAP_INTERRUPT is added, which is set on PTRACE_INTERRUPT and cleared when any trap happens. As INTERRUPT should be useable regardless of the current state of tracee, task_is_traced() test in ptrace_check_attach() is skipped for INTERRUPT. PTRACE_INTERRUPT is available iff tracee is attached with PTRACE_SEIZE. Test program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_INTERRUPT 0x4207 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts100ms = { .tv_nsec = 100000000 }; static const struct timespec ts1s = { .tv_sec = 1 }; static const struct timespec ts3s = { .tv_sec = 3 }; int main(int argc, char *argv) { pid_t tracee; tracee = fork(); if (tracee == 0) { nanosleep(&ts100ms, NULL); while (1) { printf("tracee: alive pid=%d\n", getpid()); nanosleep(&ts1s, NULL); } } if (argc > 1) kill(tracee, SIGSTOP); nanosleep(&ts100ms, NULL); ptrace(PTRACE_SEIZE, tracee, NULL, (void )(unsigned long)PTRACE_SEIZE_DEVEL); if (argc > 1) { waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_CONT, tracee, NULL, NULL); } nanosleep(&ts3s, NULL); printf("tracer: INTERRUPT and DETACH\n"); ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL); waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_DETACH, tracee, NULL, NULL); nanosleep(&ts3s, NULL); printf("tracer: exiting\n"); kill(tracee, SIGKILL); return 0; } When called without argument, tracee is seized from running state, interrupted and then detached back to running state. # ./test-interrupt tracee: alive pid=4546 tracee: alive pid=4546 tracee: alive pid=4546 tracer: INTERRUPT and DETACH tracee: alive pid=4546 tracee: alive pid=4546 tracee: alive pid=4546 tracer: exiting When called with argument, tracee is seized from stopped state, continued, interrupted and then detached back to stopped state. # ./test-interrupt 1 tracee: alive pid=4548 tracee: alive pid=4548 tracee: alive pid=4548 tracer: INTERRUPT and DETACH tracer: exiting Before PTRACE_INTERRUPT, once the tracee was running, there was no way to trap tracee and do PTRACE_DETACH without causing side effect. -v2: Updated to use task_set_jobctl_pending() so that it doesn't end up scheduling TRAP_STOP if child is dying which may make the child unkillable. Spotted by Oleg. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
2011-06-16	ptrace: implement PTRACE_SEIZE	Tejun Heo	3	-15/+66
	PTRACE_ATTACH implicitly issues SIGSTOP on attach which has side effects on tracee signal and job control states. This patch implements a new ptrace request PTRACE_SEIZE which attaches a tracee without trapping it or affecting its signal and job control states. The usage is the same with PTRACE_ATTACH but it takes PTRACE_SEIZE_* flags in @data. Currently, the only defined flag is PTRACE_SEIZE_DEVEL which is a temporary flag to enable PTRACE_SEIZE. PTRACE_SEIZE will change ptrace behaviors outside of attach itself. The changes will be implemented gradually and the DEVEL flag is to prevent programs which expect full SEIZE behavior from using it before all the behavior modifications are complete while allowing unit testing. The flag will be removed once SEIZE behaviors are completely implemented. * PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap. After attaching tracee continues to run unless a trap condition occurs. * PTRACE_SEIZE doesn't affect signal or group stop state. * If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses exit_code of (signr \| PTRACE_EVENT_STOP << 8) where signr is one of the stopping signals if group stop is in effect or SIGTRAP otherwise, and returns usual trap siginfo on PTRACE_GETSIGINFO instead of NULL. Seizing sets PT_SEIZED in ->ptrace of the tracee. This flag will be used to determine whether new SEIZE behaviors should be enabled. Test program follows. #define PTRACE_SEIZE 0x4206 #define PTRACE_SEIZE_DEVEL 0x80000000 static const struct timespec ts100ms = { .tv_nsec = 100000000 }; static const struct timespec ts1s = { .tv_sec = 1 }; static const struct timespec ts3s = { .tv_sec = 3 }; int main(int argc, char *argv) { pid_t tracee; tracee = fork(); if (tracee == 0) { nanosleep(&ts100ms, NULL); while (1) { printf("tracee: alive\n"); nanosleep(&ts1s, NULL); } } if (argc > 1) kill(tracee, SIGSTOP); nanosleep(&ts100ms, NULL); ptrace(PTRACE_SEIZE, tracee, NULL, (void )(unsigned long)PTRACE_SEIZE_DEVEL); if (argc > 1) { waitid(P_PID, tracee, NULL, WSTOPPED); ptrace(PTRACE_CONT, tracee, NULL, NULL); } nanosleep(&ts3s, NULL); printf("tracer: exiting\n"); return 0; } When the above program is called w/o argument, tracee is seized while running and remains running. When tracer exits, tracee continues to run and print out messages. # ./test-seize-simple tracee: alive tracee: alive tracee: alive tracer: exiting tracee: alive tracee: alive When called with an argument, tracee is seized from stopped state and continued, and returns to stopped state when tracer exits. # ./test-seize tracee: alive tracee: alive tracee: alive tracer: exiting # ps -el\|grep test-seize 1 T 0 4720 1 0 80 0 - 941 signal ttyS0 00:00:00 test-seize -v2: SEIZE doesn't schedule TRAP_STOP and leaves tracee running as Jan suggested. -v3: PTRACE_EVENT_STOP traps now report group stop state by signr. If group stop is in effect the stop signal number is returned as part of exit_code; otherwise, SIGTRAP. This was suggested by Denys and Oleg. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Oleg Nesterov <oleg@redhat.com>
2011-06-16	job control: introduce JOBCTL_TRAP_STOP and use it for group stop trap	Tejun Heo	3	-33/+75
	do_signal_stop() implemented both normal group stop and trap for group stop while ptraced. This approach has been enough but scheduled changes require trap mechanism which can be used in more generic manner and using group stop trap for generic trap site simplifies both userland visible interface and implementation. This patch adds a new jobctl flag - JOBCTL_TRAP_STOP. When set, it triggers a trap site, which behaves like group stop trap, in get_signal_to_deliver() after checking for pending signals. While ptraced, do_signal_stop() doesn't stop itself. It initiates group stop if requested and schedules JOBCTL_TRAP_STOP and returns. The caller - get_signal_to_deliver() - is responsible for checking whether TRAP_STOP is pending afterwards and handling it. ptrace_attach() is updated to use JOBCTL_TRAP_STOP instead of JOBCTL_STOP_PENDING and __ptrace_unlink() to clear all pending trap bits and TRAPPING so that TRAP_STOP and future trap bits don't linger after detach. While at it, add proper function comment to do_signal_stop() and make it return bool. -v2: __ptrace_unlink() updated to clear JOBCTL_TRAP_MASK and TRAPPING instead of JOBCTL_PENDING_MASK. This avoids accidentally clearing JOBCTL_STOP_CONSUME. Spotted by Oleg. -v3: do_signal_stop() updated to return %false without dropping siglock while ptraced and TRAP_STOP check moved inside for(;;) loop after group stop participation. This avoids unnecessary relocking and also will help avoiding unnecessary traps by consuming group stop before handling pending traps. -v4: Jobctl trap handling moved into a separate function - do_jobctl_trap(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
2011-06-04	signal: remove three noop tracehooks	Tejun Heo	2	-82/+14
	Remove the following three noop tracehooks in signals.c. * tracehook_force_sigpending() * tracehook_get_signal() * tracehook_finish_jctl() The code area is about to be updated and these hooks don't do anything other than obfuscating the logic. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04	ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit	Tejun Heo	2	-4/+9
	ptracer->signal->wait_chldexit was used to wait for TRAPPING; however, ->wait_chldexit was already complicated with waker-side filtering without adding TRAPPING wait on top of it. Also, it unnecessarily made TRAPPING clearing depend on the current ptrace relationship - if the ptracee is detached, wakeup is lost. There is no reason to use signal->wait_chldexit here. We're just waiting for JOBCTL_TRAPPING bit to clear and given the relatively infrequent use of ptrace, bit_waitqueue can serve it perfectly. This patch makes JOBCTL_TRAPPING wait use bit_waitqueue instead of signal->wait_chldexit. -v2: Use JOBCTL_*_BIT macros instead of ilog2() as suggested by Linus. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04	job control: introduce task_set_jobctl_pending()	Tejun Heo	3	-9/+45
	task->jobctl currently hosts JOBCTL_STOP_PENDING and will host TRAP pending bits too. Setting pending conditions on a dying task may make the task unkillable. Currently, each setting site is responsible for checking for the condition but with to-be-added job control traps this becomes too fragile. This patch adds task_set_jobctl_pending() which should be used when setting task->jobctl bits to schedule a stop or trap. The function performs the followings to ease setting pending bits. * Sanity checks. * If fatal signal is pending or PF_EXITING is set, no bit is set. * STOP_SIGMASK is automatically cleared if new value is being set. do_signal_stop() and ptrace_attach() are updated to use task_set_jobctl_pending() instead of setting STOP_PENDING explicitly. The surrounding structures around setting are changed to fit task_set_jobctl_pending() better but there should be no userland visible behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04	job control: make task_clear_jobctl_pending() clear TRAPPING automatically	Tejun Heo	2	-6/+9
	JOBCTL_TRAPPING indicates that ptracer is waiting for tracee to (re)transit into TRACED. task_clear_jobctl_pending() must be called when either tracee enters TRACED or the transition is cancelled for some reason. The former is achieved by explicitly calling task_clear_jobctl_pending() in ptrace_stop() and the latter by calling it at the end of do_signal_stop(). Calling task_clear_jobctl_trapping() at the end of do_signal_stop() limits the scope TRAPPING can be used and is fragile in that seemingly unrelated changes to tracee's control flow can lead to stuck TRAPPING. We already have task_clear_jobctl_pending() calls on those cancelling events to clear JOBCTL_STOP_PENDING. Cancellations can be handled by making those call sites use JOBCTL_PENDING_MASK instead and updating task_clear_jobctl_pending() such that task_clear_jobctl_trapping() is called automatically if no stop/trap is pending. This patch makes the above changes and removes the fallback task_clear_jobctl_trapping() call from do_signal_stop(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04	job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending()	Tejun Heo	3	-12/+22
	This patch introduces JOBCTL_PENDING_MASK and replaces task_clear_jobctl_stop_pending() with task_clear_jobctl_pending() which takes an extra @mask argument. JOBCTL_PENDING_MASK is currently equal to JOBCTL_STOP_PENDING but future patches will add more bits. recalc_sigpending_tsk() is updated to use JOBCTL_PENDING_MASK instead. task_clear_jobctl_pending() takes @mask which in subset of JOBCTL_PENDING_MASK and clears the relevant jobctl bits. If JOBCTL_STOP_PENDING is set, other STOP bits are cleared together. All task_clear_jobctl_stop_pending() users are updated to call task_clear_jobctl_pending() with JOBCTL_STOP_PENDING which is functionally identical to task_clear_jobctl_stop_pending(). This patch doesn't cause any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04	ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop()	Tejun Heo	1	-15/+13
	In ptrace_stop(), after arch hook is done, the task state and jobctl bits are updated while holding siglock. The ordering requirement there is that TASK_TRACED is set before JOBCTL_TRAPPING is cleared to prevent ptracer waiting on TRAPPING doesn't end up waking up TRACED is actually set and sees TASK_RUNNING in wait(2). Move set_current_state(TASK_TRACED) to the top of the block and reorganize comments. This makes the ordering more obvious (TASK_TRACED before other updates) and helps future updates to group stop participation. This patch doesn't cause any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04	ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments	Tejun Heo	2	-6/+20
	PTRACE_INTERRUPT is going to be added which should also skip task_is_traced() check in ptrace_check_attach(). Rename @kill to @ignore_state and make it bool. Add function comment while at it. This patch doesn't introduce any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04	job control: rename signal->group_stop and flags to jobctl and update them	Tejun Heo	4	-60/+67
	signal->group_stop currently hosts mostly group stop related flags; however, it's gonna be used for wider purposes and the GROUP_STOP_ flag prefix becomes confusing. Rename signal->group_stop to signal->jobctl and rename all GROUP_STOP_* flags to JOBCTL_. Bit position macros JOBCTL__BIT are defined and JOBCTL_* flags are defined in terms of them to allow using bitops later. While at it, reassign JOBCTL_TRAPPING to bit 22 to better accomodate future additions. This doesn't cause any functional change. -v2: JOBCTL_*_BIT macros added as suggested by Linus. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04	ptrace: remove silly wait_trap variable from ptrace_attach()	Tejun Heo	1	-3/+1
	Remove local variable wait_trap which determines whether to wait for !TRAPPING or not and simply wait for it if attach was successful. -v2: Oleg pointed out wait should happen iff attach was successful. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6	Linus Torvalds	2	-1/+2
	* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] Fix oops caused by queue refcounting failure
2011-06-04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds	60	-219/+380
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits) tg3: Fix tg3_skb_error_unmap() net: tracepoint of net_dev_xmit sees freed skb and causes panic drivers/net/can/flexcan.c: add missing clk_put net: dm9000: Get the chip in a known good state before enabling interrupts drivers/net/davinci_emac.c: add missing clk_put af-packet: Add flag to distinguish VID 0 from no-vlan. caif: Fix race when conditionally taking rtnl lock usbnet/cdc_ncm: add missing .reset_resume hook vlan: fix typo in vlan_dev_hard_start_xmit() net/ipv4: Check for mistakenly passed in non-IPv4 address iwl4965: correctly validate temperature value bluetooth l2cap: fix locking in l2cap_global_chan_by_psm ath9k: fix two more bugs in tx power cfg80211: don't drop p2p probe responses Revert "net: fix section mismatches" drivers/net/usb/catc.c: Fix potential deadlock in catc_ctrl_run() sctp: stop pending timers and purge queues when peer restart asoc drivers/net: ks8842 Fix crash on received packet when in PIO mode. ip_options_compile: properly handle unaligned pointer iwlagn: fix incorrect PCI subsystem id for 6150 devices ...
2011-06-04	Merge branch 'for-linus' of git://git.kernel.dk/linux-block	Linus Torvalds	9	-28/+41
	* 'for-linus' of git://git.kernel.dk/linux-block: block: Use hlist_entry() for io_context.cic_list.first cfq-iosched: Remove bogus check in queue_fail path xen/blkback: potential null dereference in error handling xen/blkback: don't call vbd_size() if bd_disk is NULL block: blkdev_get() should access ->bd_disk only after success CFQ: Fix typo and remove unnecessary semicolon block: remove unwanted semicolons Revert "block: Remove extra discard_alignment from hd_struct." nbd: adjust 'max_part' according to part_shift nbd: limit module parameters to a sane value nbd: pass MSG_* flags to kernel_recvmsg() block: improve the bio_add_page() and bio_add_pc_page() descriptions
2011-06-04	Merge branch 'for-linus' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin: Blackfin: strncpy: fix handling of zero lengths
2011-06-04	Merge branch 'stable' of ↵	Linus Torvalds	2	-2/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile * 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: asm-generic/unistd.h: support sendmmsg syscall tile: enable CONFIG_BUGVERBOSE
2011-06-04	Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6	Linus Torvalds	9	-97/+136
	* 'linux-next' of git://git.infradead.org/ubifs-2.6: UBIFS: fix-up free space earlier UBIFS: intialize LPT earlier UBIFS: assert no fixup when writing a node UBIFS: fix clean znode counter corruption in error cases UBIFS: fix memory leak on error path UBIFS: fix shrinker object count reports UBIFS: fix recovery broken by the previous recovery fix UBIFS: amend ubifs_recover_leb interface UBIFS: introduce a "grouped" journal head flag UBIFS: supress false error messages
2011-06-04	Merge branch 'for-linus' of ↵	Linus Torvalds	1	-4/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest: ktest: Ignore unset values of the minconfig in config_bisect ktest: Fix result of rebooting the kernel ktest: Fix off-by-one in config bisect result
2011-06-04	Merge branch 'rmobile-fixes-for-linus' of ↵	Linus Torvalds	3	-0/+141
	git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: ARM: mach-shmobile: add DMAC clock definitions on SH7372 ARM: arch-shmobile: support SDHI card detection on mackerel, using a GPIO sh_mobile_meram: MERAM platform data for LCDC
2011-06-04	Merge branch 'sh-fixes-for-linus' of ↵	Linus Torvalds	14	-45/+39
	git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: dmaengine: shdma: fix a regression: initialise DMA channels for memcpy dmaengine: shdma: Fix up fallout from runtime PM changes. Revert "clocksource: sh_cmt: Runtime PM support" Revert "clocksource: sh_tmu: Runtime PM support" sh: Fix up asm-generic/ptrace.h fallout. sh64: Move from P1SEG to CAC_ADDR for consistent sync. sh64: asm/pgtable.h needs asm/mmu.h sh: asm/tlb.h needs linux/swap.h sh: mark DMA slave ID 0 as invalid sh: Update shmin to reflect PIO dependency. sh: arch/sh/kernel/process_32.c needs linux/prefetch.h. sh: add MMCIF runtime PM support on ecovec sh: switch ap325rxa to dynamically manage the platform camera
2011-06-04	Revert "ASoC: Update cx20442 for TTY API change"	Linus Torvalds	1	-5/+3
	This reverts commit ed0bd2333cffc3d856db9beb829543c1dfc00982. Since we reverted the TTY API change, we should revert the ASoC update to it too. Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Liam Girdwood <lrg@ti.com> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-04	Revert "tty: make receive_buf() return the amout of bytes received"	Linus Torvalds	20	-128/+115
	This reverts commit b1c43f82c5aa265442f82dba31ce985ebb7aa71c. It was broken in so many ways, and results in random odd pty issues. It re-introduced the buggy schedule_work() in flush_to_ldisc() that can cause endless work-loops (see commit a5660b41af6a: "tty: fix endless work loop when the buffer fills up"). It also used an "unsigned int" return value fo the ->receive_buf() function, but then made multiple functions return a negative error code, and didn't actually check for the error in the caller. And it didn't actually work at all. BenH bisected down odd tty behavior to it: "It looks like the patch is causing some major malfunctions of the X server for me, possibly related to PTYs. For example, cat'ing a large file in a gnome terminal hangs the kernel for -minutes- in a loop of what looks like flush_to_ldisc/workqueue code, (some ftrace data in the quoted bits further down). ... Some more data: It -looks- like what happens is that the flush_to_ldisc work queue entry constantly re-queues itself (because the PTY is full ?) and the workqueue thread will basically loop forver calling it without ever scheduling, thus starving the consumer process that could have emptied the PTY." which is pretty much exactly the problem we fixed in a5660b41af6a. Milton Miller pointed out the 'unsigned int' issue. Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reported-by: Milton Miller <miltonm@bga.com> Cc: Stefan Bigler <stefan.bigler@keymile.com> Cc: Toby Gray <toby.gray@realvnc.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-03	Merge branch 'master' of ↵	John W. Linville	15	-70/+155
	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into for-davem
2011-06-03	UBIFS: fix-up free space earlier	Ben Gardiner	1	-8/+8
	The free space fixup is currently initiated during mount after the call to ubifs_write_master() which results in a write to PEBs; this has been observed with the patch 'assert no fixup when writing a node' applied: Move the free space fixup on mount to before the calls to ubifs_recover_inl_heads() and ubifs_write_master(). This results in no assertions with the previously mentioned patch applied. Artem: tweaked the patch a bit Signed-off-by: Ben Gardiner <bengardiner@nanometrics> Reviewed-by: Matthew L. Creech <mlcreech@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-06-03	UBIFS: intialize LPT earlier	Ben Gardiner	1	-13/+16
	The current 'mount_ubifs()' implementation does not initialize the LPT until the the master node is marked dirty. Move the LPT initialization to before marking the master node dirty. This is a preparation for the next patch which will move the free-space-fixup check to before marking the master node dirty, because we have to fix-up the free space before doing any writes. Artem: massaged the patch and commit message. Signed-off-by: Ben Gardiner <bengardiner@nanometrics.ca> Reviewed-by: Matthew L. Creech <mlcreech@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-06-03	UBIFS: assert no fixup when writing a node	Ben Gardiner	1	-0/+2
	The current free space fixup can result in some writing to the UBI volume when the space_fixup flag is set. To catch instances where UBIFS is writing to the NAND while the space_fixup flag is set, add an assert to ubifs_write_node(). Artem: tweaked the patch, added similar assertion to the write buffer write path. Signed-off-by: Ben Gardiner <bengardiner@nanometrics.ca> Reviewed-by: Matthew L. Creech <mlcreech@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-06-03	UBIFS: fix clean znode counter corruption in error cases	Artem Bityutskiy	1	-4/+5
	UBIFS maintains per-filesystem and global clean znode counters ('c->clean_zn_cnt' and 'ubifs_clean_zn_cnt'). It is important to maintain correct values there since the shrinker relies on 'ubifs_clean_zn_cnt'. However, in case of failures during commit the counters were corrupted. E.g., if a failure happens in the middle of 'write_index()', then some nodes in the commit list ('c->cnext') are marked as clean, and some are marked as dirty. And the 'ubifs_destroy_tnc_subtree()' frees does not retrun correct count, and we end up with non-zero 'c->clean_zn_cnt' when unmounting. This means that if we have 2 file-sytem and one of them fails, and we unmount it, 'ubifs_clean_zn_cnt' stays incorrect and confuses the shrinker. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-06-03	UBIFS: fix memory leak on error path	Artem Bityutskiy	1	-0/+1
	UBIFS leaks memory on error path in 'ubifs_jnl_update()' in case of write failure because it forgets to free the 'struct ubifs_dent_node *dent' object. Although the object is small, the alignment can make it large - e.g., 2KiB if the min. I/O unit is 2KiB. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Cc: stable@kernel.org
2011-06-03	UBIFS: fix shrinker object count reports	Artem Bityutskiy	1	-1/+5
	Sometimes VM asks the shrinker to return amount of objects it can shrink, and we return the ubifs_clean_zn_cnt in that case. However, it is possible that this counter is negative for a short period of time, due to the way UBIFS TNC code updates it. And I can observe the following warnings sometimes: shrink_slab: ubifs_shrinker+0x0/0x2b7 [ubifs] negative objects to delete nr=-8541616642706119788 This patch makes sure UBIFS never returns negative count of objects. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Cc: stable@kernel.org
2011-06-03	Blackfin: strncpy: fix handling of zero lengths	Steven Miao	1	-1/+1
	The jump to 4f will cause the NUL padding loop to run at least one time, so if string length is zero just jump to the end. Otherwise we wrongly write one NUL byte when size==0. Signed-off-by: Steven Miao <realmz6@gmail.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2011-06-03	tg3: Fix tg3_skb_error_unmap()	Matt Carlson	1	-1/+1
	This function attempts to free one fragment beyond the number of fragments that were actually mapped. This patch brings back the limit to the correct spot. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02	net: tracepoint of net_dev_xmit sees freed skb and causes panic	Koki Sanagi	2	-7/+12
	Because there is a possibility that skb is kfree_skb()ed and zero cleared after ndo_start_xmit, we should not see the contents of skb like skb->len and skb->dev->name after ndo_start_xmit. But trace_net_dev_xmit does that and causes panic by NULL pointer dereference. This patch fixes trace_net_dev_xmit not to see the contents of skb directly. If you want to reproduce this panic, 1. Get tracepoint of net_dev_xmit on 2. Create 2 guests on KVM 2. Make 2 guests use virtio_net 4. Execute netperf from one to another for a long time as a network burden 5. host will panic(It takes about 30 minutes) Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02	asm-generic/unistd.h: support sendmmsg syscall	Chris Metcalf	1	-1/+3
	Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
2011-06-02	ktest: Ignore unset values of the minconfig in config_bisect	Steven Rostedt	1	-1/+1
	By ignoring the unset values of the minconfig in deciding what to test in the config_bisect can cause the problem config from being tested too. Just do not test the configs that are set in the minconfig. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-02	ktest: Fix result of rebooting the kernel	Steven Rostedt	1	-1/+1
	The command that is called that reboots the kernel may fail but the return code is not passed back to the ktest.pl script. This is because a ';' is used between the two commands and if the second command fails, only the first command's return code is returned. Using a '&&' between the two commands fixes this. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-02	ktest: Fix off-by-one in config bisect result	Steven Rostedt	1	-2/+2
	Because in perl the array size returned by $#arr, is the last index and not the actually size of the array, we end the config bisect early, thinking there is only one config left when there are in fact two. Thus the result has a 50% chance of picking the correct config that caused the problem. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-02	Merge branch 'for-jens/xen-blkback.fixes' of ↵	Jens Axboe	2	-6/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus
2011-06-02	block: Use hlist_entry() for io_context.cic_list.first	Paul Bolle	1	-2/+2
	list_entry() and hlist_entry() are both simply aliases for container_of(), but since io_context.cic_list.first is an hlist_node one should at least use the correct alias. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-06-02	cfq-iosched: Remove bogus check in queue_fail path	Paul Bolle	1	-3/+0
	queue_fail can only be reached if cic is NULL, so its check for cic must be bogus. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-06-02	[SCSI] Fix oops caused by queue refcounting failure	James Bottomley	2	-1/+2
	In certain circumstances, we can get an oops from a torn down device. Most notably this is from CD roms trying to call scsi_ioctl. The root cause of the problem is the fact that after scsi_remove_device() has been called, the queue is fully torn down. This is actually wrong since the queue can be used until the sdev release function is called. Therefore, we add an extra reference to the queue which is released in sdev->release, so the queue always exists. Reported-by: Parag Warudkar <parag.lkml@gmail.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-06-02	drivers/net/can/flexcan.c: add missing clk_put	Julia Lawall	1	-3/+2
	The failed_get label is used after the call to clk_get has succeeded, so it should be moved up above the call to clk_put. The failed_req labels doesn't do anything different than failed_get, so delete it. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ expression e1,e2; statement S; @@ e1 = clk_get@p1(...); ... when != e1 = e2 when != clk_put(e1) when any if (...) { ... when != clk_put(e1) when != if (...) { ... clk_put(e1) ... } * return@p3 ...; } else S // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02	ARM: mach-shmobile: add DMAC clock definitions on SH7372	Guennadi Liakhovetski	1	-0/+7
	These definitions are needed to let the runtime PM subsystem turn off DMAC clocks, when it is suspended by the driver. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-06-02	dmaengine: shdma: fix a regression: initialise DMA channels for memcpy	Guennadi Liakhovetski	1	-1/+1
	A recent patch has introduced a regression, where repeating a memcpy DMA test with shdma module unloading between them skips the DMA channel configuration. Fix this regression by always configuring the channel during its allocation. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>