From e761b7725234276a802322549cee5255305a0930 Mon Sep 17 00:00:00 2001
From: Max Krasnyansky
Date: Tue, 15 Jul 2008 04:43:49 -0700
Subject: cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)

This is based on Linus' idea of creating cpu_active_map, which prevents
the scheduler load balancer from migrating tasks to the cpu that is going
down.

It allows us to simplify the domain management code and avoid unnecessary
domain rebuilds during cpu hotplug event handling.

Please ignore the cpusets part for now. It needs some more work in order
to avoid crazy lock nesting. I did, however, simplify and unify the domain
reinitialization logic. We now simply call partition_sched_domains() in
all cases. This means that we're using exactly the same code paths as in
the cpusets case, and hence the tests below cover cpusets too.
Cpuset changes to make rebuild_sched_domains() callable from various
contexts are in a separate patch (right after this one).

This not only boots but also easily handles
	while true; do make clean; make -j 8; done
and
	while true; do on-off-cpu 1; done
at the same time.
(on-off-cpu 1 simply does the echo 0/1 > /sys/.../cpu1/online thing).

Surprisingly, the box (dual-core Core2) is quite usable. In fact I'm typing
this on it right now in gnome-terminal and things are moving just fine.

Also, this is running with most of the debug features enabled (lockdep,
mutex, etc); no BUG_ONs or lockdep complaints so far.

I believe I addressed all of Dmitry's comments on Linus' original
version. I changed both the fair and rt balancers to mask out non-active
cpus, and replaced cpu_is_offline() with !cpu_active() in the main
scheduler code where it made sense (to me).

Signed-off-by: Max Krasnyanskiy
Acked-by: Linus Torvalds
Acked-by: Peter Zijlstra
Acked-by: Gregory Haskins
Cc: dmitry.adamushko@gmail.com
Cc: pj@sgi.com
Signed-off-by: Ingo Molnar
---
 kernel/cpu.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

(limited to 'kernel/cpu.c')

diff --git a/kernel/cpu.c b/kernel/cpu.c
index cfb1d43ab801..a1ac7ea245d7 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -64,6 +64,8 @@ void __init cpu_hotplug_init(void)
 	cpu_hotplug.refcount = 0;
 }
 
+cpumask_t cpu_active_map;
+
 #ifdef CONFIG_HOTPLUG_CPU
 
 void get_online_cpus(void)
@@ -291,11 +293,20 @@ int __ref cpu_down(unsigned int cpu)
 	int err = 0;
 
 	cpu_maps_update_begin();
-	if (cpu_hotplug_disabled)
+
+	if (cpu_hotplug_disabled) {
 		err = -EBUSY;
-	else
-		err = _cpu_down(cpu, 0);
+		goto out;
+	}
+
+	cpu_clear(cpu, cpu_active_map);
+
+	err = _cpu_down(cpu, 0);
+
+	if (cpu_online(cpu))
+		cpu_set(cpu, cpu_active_map);
 
+out:
 	cpu_maps_update_done();
 	return err;
 }
@@ -355,11 +366,18 @@ int __cpuinit cpu_up(unsigned int cpu)
 	}
 
 	cpu_maps_update_begin();
-	if (cpu_hotplug_disabled)
+
+	if (cpu_hotplug_disabled) {
 		err = -EBUSY;
-	else
-		err = _cpu_up(cpu, 0);
+		goto out;
+	}
 
+	err = _cpu_up(cpu, 0);
+
+	if (cpu_online(cpu))
+		cpu_set(cpu, cpu_active_map);
+
+out:
 	cpu_maps_update_done();
 	return err;
 }
--
cgit v1.2.3

From 39b0fad7121eace85770e7a4c6dc35dfd2879768 Mon Sep 17 00:00:00 2001
From: Max Krasnyansky
Date: Tue, 15 Jul 2008 20:56:26 -0700
Subject: cpu hotplug: Make cpu_active_map synchronization dependency clear

This goes on top of the cpu_active_map (take 2) patch.

Currently we depend on stop_machine to provide the necessary
synchronization for the cpu_active_map updates.
As Dmitry Adamushko pointed out, this is fragile and is not much clearer
than the previous scheme. In other words, we do not want to depend on
the internal operation of stop_machine here.
So make the synchronization rules explicit by calling synchronize_sched()
after clearing the cpu active bit.

Tested on quad-Core2 with:
	while true; do
		for i in 1 2 3; do
			echo 0 > /sys/devices/system/cpu/cpu$i/online
		done
		for i in 1 2 3; do
			echo 1 > /sys/devices/system/cpu/cpu$i/online
		done
	done
and
	stress -c 200

No lockdep, preempt or other complaints.

Signed-off-by: Max Krasnyansky
Acked-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
---
 kernel/cpu.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

(limited to 'kernel/cpu.c')

diff --git a/kernel/cpu.c b/kernel/cpu.c
index a1ac7ea245d7..033603c1d7c3 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -301,6 +301,16 @@ int __ref cpu_down(unsigned int cpu)
 
 	cpu_clear(cpu, cpu_active_map);
 
+	/*
+	 * Make sure the all cpus did the reschedule and are not
+	 * using stale version of the cpu_active_map.
+	 * This is not strictly necessary becuase stop_machine()
+	 * that we run down the line already provides the required
+	 * synchronization. But it's really a side effect and we do not
+	 * want to depend on the innards of the stop_machine here.
+	 */
+	synchronize_sched();
+
 	err = _cpu_down(cpu, 0);
 
 	if (cpu_online(cpu))
--
cgit v1.2.3
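
Reviewer note: the ordering that the two patches above establish in cpu_down() and cpu_up() can be hard to follow from hunks alone, so here is a minimal userspace sketch of it. It is not kernel code. Every name in it (model_maps, model_cpu_down, model_cpu_up, model_migration_targets, the teardown_ok/bringup_ok flags) is hypothetical and exists only for illustration; plain bitmasks stand in for cpu_online_map and cpu_active_map, and the synchronize_sched() step appears only as a comment because it has no direct userspace equivalent here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: none of these names exist in the kernel. */
#define MODEL_NR_CPUS 8

struct model_maps {
	uint32_t online;	/* stands in for cpu_online_map */
	uint32_t active;	/* stands in for cpu_active_map */
};

uint32_t model_bit(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	return UINT32_C(1) << cpu;
}

/*
 * Follows the ordering of the patched cpu_down(): clear the active bit
 * first, attempt the teardown, then restore the bit if the CPU is still
 * online. 'teardown_ok' is a hypothetical stand-in for _cpu_down()
 * succeeding.
 */
int model_cpu_down(struct model_maps *m, unsigned int cpu, bool teardown_ok)
{
	uint32_t bit = model_bit(cpu);

	m->active &= ~bit;	/* balancer must stop choosing this CPU */
	/* The second patch calls synchronize_sched() at this point. */

	if (teardown_ok)
		m->online &= ~bit;

	if (m->online & bit)	/* teardown failed, CPU stays usable */
		m->active |= bit;

	return teardown_ok ? 0 : -1;
}

/* Follows the patched cpu_up(): mark active only once the CPU is online. */
int model_cpu_up(struct model_maps *m, unsigned int cpu, bool bringup_ok)
{
	uint32_t bit = model_bit(cpu);

	if (bringup_ok)
		m->online |= bit;

	if (m->online & bit)
		m->active |= bit;

	return bringup_ok ? 0 : -1;
}

/* Balancer-style filter: only online and active CPUs are valid targets. */
uint32_t model_migration_targets(const struct model_maps *m, uint32_t allowed)
{
	return allowed & m->online & m->active;
}
```

In this model a failed teardown leaves both masks unchanged, which corresponds to the `if (cpu_online(cpu)) cpu_set(cpu, cpu_active_map);` step in the first patch. The point of the separate active mask is that a CPU stops being a migration target before it stops being online.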