author: Johannes Weiner <hannes@cmpxchg.org> 2020-03-16 15:13:32 -0400
committer: Peter Zijlstra <peterz@infradead.org> 2020-03-20 13:06:19 +0100
commit: 36b238d5717279163859fb6ba0f4360abcafab83
tree: 1b3282b27b593262f09686f704b1e00767ff76f6
parent: b05e75d611380881e73edc58a20fd8c6bb71720b
psi: Optimize switching tasks inside shared cgroups

When switching tasks running on a CPU, the psi state of a cgroup
containing both of these tasks does not change. Right now, we don't
exploit that, and can perform many unnecessary state changes in nested
hierarchies, especially when most activity comes from one leaf cgroup.

This patch implements an optimization where we only update cgroups
whose state actually changes during a task switch. These are all
cgroups that contain one task but not the other, up to the first
shared ancestor. When both tasks are in the same group, we don't need
to update anything at all.

We can identify the first shared ancestor by walking the groups of the
incoming task until we see TSK_ONCPU set on the local CPU; that's the
first group that also contains the outgoing task.

The new psi_task_switch() is similar to psi_task_change(). To allow
code reuse, move the task flag maintenance code into a new function
and the poll/avg worker wakeups into the shared psi_group_change().

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200316191333.115523-3-hannes@cmpxchg.org

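The ancestor walk described in the commit message can be illustrated with a
small userspace model. This is only a sketch under stated assumptions, not the
kernel implementation: the toy_group and toy_task types and the toy_* helpers
are hypothetical, the model covers a single CPU, it tracks nothing but an
on-CPU count per group, and it handles only the preemption case (the outgoing
task has not already been dequeued by a voluntary sleep).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; they do not mirror the kernel's psi structures. */
struct toy_group {
	struct toy_group *parent;	/* NULL at the root of the hierarchy */
	int oncpu;			/* tasks of this group on the (single) CPU */
	int updates;			/* how often this group's state was touched */
};

struct toy_task {
	struct toy_group *group;	/* leaf group the task belongs to */
};

/* Stand-in for a per-group state change: adjust the count, record the visit. */
static void toy_group_change(struct toy_group *g, int delta)
{
	g->oncpu += delta;
	assert(g->oncpu >= 0);
	g->updates++;
}

/*
 * Switch the CPU from prev to next, touching only groups that contain one
 * task but not the other. Assumes prev is currently accounted as on-CPU in
 * every one of its ancestors.
 */
static void toy_task_switch(struct toy_task *prev, struct toy_task *next)
{
	struct toy_group *common = NULL;
	struct toy_group *g;

	/*
	 * Walk next's groups from the leaf upward. The first group that
	 * already has an on-CPU task must also contain prev, so it is the
	 * first shared ancestor and its state does not change.
	 */
	for (g = next->group; g; g = g->parent) {
		if (g->oncpu) {
			common = g;
			break;
		}
		toy_group_change(g, +1);
	}

	/* Clear prev's on-CPU state only below the shared ancestor. */
	for (g = prev->group; g && g != common; g = g->parent)
		toy_group_change(g, -1);
}
```

With a hierarchy root -> A -> {B, C}, switching from a task in B to a task in
C touches only B and C in this model; A and root are left alone. When both
tasks live in the same leaf group, no group is touched at all.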
Diffstat (limited to 'kernel/sched/stats.h')
-rw-r--r-- kernel/sched/stats.h | 9
1 file changed, 1 insertion, 8 deletions
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index 6ff0ac1a803f..1339f5bfe513 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -141,14 +141,7 @@ static inline void psi_sched_switch(struct task_struct *prev,
 	if (static_branch_likely(&psi_disabled))
 		return;
 
-	/*
-	 * Clear the TSK_ONCPU state if the task was preempted. If
-	 * it's a voluntary sleep, dequeue will have taken care of it.
-	 */
-	if (!sleep)
-		psi_task_change(prev, TSK_ONCPU, 0);
-
-	psi_task_change(next, 0, TSK_ONCPU);
+	psi_task_switch(prev, next, sleep);
 }
 
 static inline void psi_task_tick(struct rq *rq)