diff options
Diffstat (limited to 'Documentation/RCU/stallwarn.txt')
-rw-r--r-- | Documentation/RCU/stallwarn.txt | 47 |
1 files changed, 28 insertions, 19 deletions
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 68fe3ad27015..ed186a902d31 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -26,12 +26,6 @@ CONFIG_RCU_CPU_STALL_TIMEOUT Stall-warning messages may be enabled and disabled completely via /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress. -CONFIG_RCU_CPU_STALL_VERBOSE - - This kernel configuration parameter causes the stall warning to - also dump the stacks of any tasks that are blocking the current - RCU-preempt grace period. - CONFIG_RCU_CPU_STALL_INFO This kernel configuration parameter causes the stall warning to @@ -56,8 +50,20 @@ RCU_STALL_RAT_DELAY two jiffies. (This is a cpp macro, not a kernel configuration parameter.) -When a CPU detects that it is stalling, it will print a message similar -to the following: +rcupdate.rcu_task_stall_timeout + + This boot/sysfs parameter controls the RCU-tasks stall warning + interval. A value of zero or less suppresses RCU-tasks stall + warnings. A positive value sets the stall-warning interval + in jiffies. An RCU-tasks stall warning starts wtih the line: + + INFO: rcu_tasks detected stalls on tasks: + + And continues with the output of sched_show_task() for each + task stalling the current RCU-tasks grace period. + +For non-RCU-tasks flavors of RCU, when a CPU detects that it is stalling, +it will print a message similar to the following: INFO: rcu_sched_state detected stall on CPU 5 (t=2500 jiffies) @@ -65,7 +71,7 @@ This message indicates that CPU 5 detected that it was causing a stall, and that the stall was affecting RCU-sched. This message will normally be followed by a stack dump of the offending CPU. On TREE_RCU kernel builds, RCU and RCU-sched are implemented by the same underlying mechanism, -while on TREE_PREEMPT_RCU kernel builds, RCU is instead implemented +while on PREEMPT_RCU kernel builds, RCU is instead implemented by rcu_preempt_state. On the other hand, if the offending CPU fails to print out a stall-warning @@ -77,7 +83,7 @@ INFO: rcu_bh_state detected stalls on CPUs/tasks: { 3 5 } (detected by 2, 2502 j This message indicates that CPU 2 detected that CPUs 3 and 5 were both causing stalls, and that the stall was affecting RCU-bh. This message will normally be followed by stack dumps for each CPU. Please note that -TREE_PREEMPT_RCU builds can be stalled by tasks as well as by CPUs, +PREEMPT_RCU builds can be stalled by tasks as well as by CPUs, and that the tasks will be indicated by PID, for example, "P3421". It is even possible for a rcu_preempt_state stall to be caused by both CPUs -and- tasks, in which case the offending CPUs and tasks will all @@ -174,8 +180,12 @@ o A CPU looping with preemption disabled. This condition can o A CPU looping with bottom halves disabled. This condition can result in RCU-sched and RCU-bh stalls. -o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel - without invoking schedule(). +o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the + kernel without invoking schedule(). Note that cond_resched() + does not necessarily prevent RCU CPU stall warnings. Therefore, + if the looping in the kernel is really expected and desirable + behavior, you might need to replace some of the cond_resched() + calls with calls to cond_resched_rcu_qs(). o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might happen to preempt a low-priority task in the middle of an RCU @@ -189,10 +199,10 @@ o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that is running at a higher priority than the RCU softirq threads. This will prevent RCU callbacks from ever being invoked, - and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent + and in a CONFIG_PREEMPT_RCU kernel will further prevent RCU grace periods from ever completing. Either way, the system will eventually run out of memory and hang. In the - CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning + CONFIG_PREEMPT_RCU case, you might see stall-warning messages. o A hardware or software issue shuts off the scheduler-clock @@ -208,11 +218,10 @@ o A hardware failure. This is quite unlikely, but has occurred This resulted in a series of RCU CPU stall warnings, eventually leading the realization that the CPU had failed. -The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning. -SRCU does not have its own CPU stall warnings, but its calls to -synchronize_sched() will result in RCU-sched detecting RCU-sched-related -CPU stalls. Please note that RCU only detects CPU stalls when there is -a grace period in progress. No grace period, no CPU stall warnings. +The RCU, RCU-sched, RCU-bh, and RCU-tasks implementations have CPU stall +warning. Note that SRCU does -not- have CPU stall warnings. Please note +that RCU only detects CPU stalls when there is a grace period in progress. +No grace period, no CPU stall warnings. To diagnose the cause of the stall, inspect the stack traces. The offending function will usually be near the top of the stack. |