author | Peter Zijlstra <peterz@infradead.org> | 2019-12-10 09:34:54 +0100 |
---|---|---|
committer | Peter Zijlstra <peterz@infradead.org> | 2019-12-17 13:32:50 +0100 |
commit | 45178ac0cea853fe0e405bf11e101bdebea57b15 (patch) | |
tree | 1f753a380957127a552849c3d3b701b0a3914be4 /kernel/resource.c | |
parent | cde65194502778665c1b52afc5722cf7dbfaa399 (diff) | |
cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order
Paul reported a very sporadic, rcutorture induced, workqueue failure.
When the planets align, the workqueue rescuer's self-migrate fails and
then triggers a WARN for running a work on the wrong CPU.
Tejun then figured that set_cpus_allowed_ptr()'s stop_one_cpu() call
could be ignored! When stopper->enabled is false, stop_machine will
complete the work immediately, without actually running it. Worse, it
will not WARN about this (we really should fix this).
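
To make the failure mode concrete, here is a minimal user-space sketch of the behaviour described above. It is not the kernel's code: every `model_*` name is hypothetical, and the `-ENOENT` return value is an assumption used here only so a caller could notice the skipped work. The point it illustrates is that a disabled stopper reports completion without ever calling the work function.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU stopper; only the flag matters here. */
struct model_stopper {
    bool enabled;
};

typedef int (*model_stop_fn)(void *arg);

/*
 * Model of the behaviour described in the text: if the stopper is not
 * enabled, the request completes at once and fn is never called.
 * Returning -ENOENT in that case is an assumption of this sketch.
 */
int model_stop_one_cpu(struct model_stopper *stopper,
                       model_stop_fn fn, void *arg)
{
    assert(stopper != NULL && fn != NULL);

    if (!stopper->enabled)
        return -ENOENT;   /* "completed", but the work was skipped */

    return fn(arg);       /* stand-in for the stopper thread running fn */
}

/* Example work item: records that the (pretend) migration really ran. */
int model_migrate(void *arg)
{
    *(bool *)arg = true;
    return 0;
}
```

A caller that ignores the return value, as the text says set_cpus_allowed_ptr() effectively does, cannot tell the two cases apart; in this model only the `ran` flag passed to `model_migrate()` reveals whether the work happened.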
It turns out there is a small window where a freshly online'ed CPU is
marked 'online' but doesn't yet have the stopper task running:
    BP                              AP

    bringup_cpu()
      __cpu_up(cpu, idle)    -->    start_secondary()
                                    ...
                                    cpu_startup_entry()
      bringup_wait_for_ap()
        wait_for_ap_thread() <--      cpuhp_online_idle()
                                      while (1)
                                        do_idle()

                                    ... available to run kthreads ...

        stop_machine_unpark()
          stopper->enabled = true;
Close this by moving the stop_machine_unpark() into
cpuhp_online_idle(), such that the stopper thread is ready before we
start the idle loop and schedule.
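
The effect of the reordering can be sketched with a small sequential model. This is an illustration under stated assumptions, not the kernel implementation: the `model_*` names are hypothetical, and the "old" sequence is the worst-case interleaving in which the new CPU starts scheduling before the boot CPU reaches stop_machine_unpark().

```c
#include <stdbool.h>

/* Hypothetical model of one CPU coming online. */
struct model_cpu {
    bool stopper_enabled;   /* set by the stop_machine_unpark() step */
    bool runs_kthreads;     /* set once the idle loop starts scheduling */
    bool window_seen;       /* CPU was schedulable while the stopper was off */
};

/* Stand-in for stop_machine_unpark(): the stopper becomes usable. */
void model_unpark_stopper(struct model_cpu *c)
{
    c->stopper_enabled = true;
}

/* Stand-in for entering the idle loop: kthreads may now run here. */
void model_enter_idle_loop(struct model_cpu *c)
{
    c->runs_kthreads = true;
    if (!c->stopper_enabled)
        c->window_seen = true;   /* the window described in the text */
}

/* Old order (worst case): the CPU idles first, the unpark comes later. */
bool model_bringup_old(void)
{
    struct model_cpu c = { 0 };

    model_enter_idle_loop(&c);
    model_unpark_stopper(&c);
    return c.window_seen;
}

/* New order: unpark from the online-idle step, before the idle loop. */
bool model_bringup_new(void)
{
    struct model_cpu c = { 0 };

    model_unpark_stopper(&c);
    model_enter_idle_loop(&c);
    return c.window_seen;
}
```

In this model `model_bringup_old()` reports that the window was observed and `model_bringup_new()` reports that it was not, which mirrors the claim that the stopper thread is ready before the CPU starts to schedule.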
Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
Debugged-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: "Paul E. McKenney" <paulmck@kernel.org>
Diffstat (limited to 'kernel/resource.c')
0 files changed, 0 insertions, 0 deletions