<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm/oom_kill.c, branch v2.6.31-rc2</title>
<subtitle>Linux Kernel (branches are rebased on master from time to time)</subtitle>
<id>https://sre.ring0.de/linux/atom?h=v2.6.31-rc2</id>
<link rel='self' href='https://sre.ring0.de/linux/atom?h=v2.6.31-rc2'/>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/'/>
<updated>2009-06-17T02:47:45Z</updated>
<entry>
<title>oom: only oom kill exiting tasks with attached memory</title>
<updated>2009-06-17T02:47:45Z</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2009-06-16T22:33:16Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=81236810226f71bd9ff77321c8e8276dae7efc61'/>
<id>urn:sha1:81236810226f71bd9ff77321c8e8276dae7efc61</id>
<content type='text'>
When a task is chosen for oom kill and is found to be PF_EXITING,
__oom_kill_task() is called to elevate the task's timeslice and give it
access to memory reserves so that it may quickly exit.

This privilege is unnecessary, however, if the task has already detached
its mm.  Although its possible for the mm to become detached later since
task_lock() is not held, __oom_kill_task() will simply be a no-op in such
circumstances.

Subsequently, it is no longer necessary to warn about killing mm-less
tasks since it is a no-op.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Reviewed-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>oom: avoid unnecessary mm locking and scanning for OOM_DISABLE</title>
<updated>2009-06-17T02:47:43Z</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2009-06-16T22:32:57Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=4d8b9135c30ccbe46e621fefd862969819003fd6'/>
<id>urn:sha1:4d8b9135c30ccbe46e621fefd862969819003fd6</id>
<content type='text'>
This moves the check for OOM_DISABLE to the badness heuristic so it is
only necessary to hold task_lock() once.  If the mm is OOM_DISABLE, the
score is 0, which is also correctly exported via /proc/pid/oom_score.
This requires that tasks with badness scores of 0 are prohibited from
being oom killed, which makes sense since they would not allow for future
memory freeing anyway.

Since the oom_adj value is a characteristic of an mm and not a task, it is
no longer necessary to check the oom_adj value for threads sharing the
same memory (except when simply issuing SIGKILLs for threads in other
thread groups).

Cc: Nick Piggin &lt;npiggin@suse.de&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>oom: move oom_adj value from task_struct to mm_struct</title>
<updated>2009-06-17T02:47:43Z</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2009-06-16T22:32:56Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=2ff05b2b4eac2e63d345fc731ea151a060247f53'/>
<id>urn:sha1:2ff05b2b4eac2e63d345fc731ea151a060247f53</id>
<content type='text'>
The per-task oom_adj value is a characteristic of its mm more than the
task itself since it's not possible to oom kill any thread that shares the
mm.  If a task were to be killed while attached to an mm that could not be
freed because another thread were set to OOM_DISABLE, it would have
needlessly been terminated since there is no potential for future memory
freeing.

This patch moves oomkilladj (now more appropriately named oom_adj) from
struct task_struct to struct mm_struct.  This requires task_lock() on a
task to check its oom_adj value to protect against exec, but it's already
necessary to take the lock when dereferencing the mm to find the total VM
size for the badness heuristic.

This fixes a livelock if the oom killer chooses a task and another thread
sharing the same memory has an oom_adj value of OOM_DISABLE.  This occurs
because oom_kill_task() repeatedly returns 1 and refuses to kill the
chosen task while select_bad_process() will repeatedly choose the same
task during the next retry.

Taking task_lock() in select_bad_process() to check for OOM_DISABLE and in
oom_kill_task() to check for threads sharing the same memory will be
removed in the next patch in this series where it will no longer be
necessary.

Writing to /proc/pid/oom_adj for a kthread will now return -EINVAL since
these threads are immune from oom killing already.  They simply report an
oom_adj value of OOM_DISABLE.

Cc: Nick Piggin &lt;npiggin@suse.de&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>oom: fix possible oom_dump_tasks NULL pointer</title>
<updated>2009-05-29T15:40:01Z</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2009-05-28T21:34:19Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=6d2661ede5f20f968422e790af3334908c3bc857'/>
<id>urn:sha1:6d2661ede5f20f968422e790af3334908c3bc857</id>
<content type='text'>
When /proc/sys/vm/oom_dump_tasks is enabled, it is possible to get a NULL
pointer for tasks that have detached mm's since task_lock() is not held
during the tasklist scan.  Add the task_lock().

Acked-by: Nick Piggin &lt;npiggin@suse.de&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>oom: prevent livelock when oom_kill_allocating_task is set</title>
<updated>2009-05-06T23:36:09Z</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2009-05-06T23:02:55Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=184101bf143ac96d62b3dcc17e7b3550f98d3350'/>
<id>urn:sha1:184101bf143ac96d62b3dcc17e7b3550f98d3350</id>
<content type='text'>
When /proc/sys/vm/oom_kill_allocating_task is set for large systems that
want to avoid the lengthy tasklist scan, it's possible to livelock if
current is ineligible for oom kill.  This normally happens when it is set
to OOM_DISABLE, but is also possible if any threads are sharing the same
-&gt;mm with a different tgid.

So change __out_of_memory() to fall back to the full task-list scan if it
was unable to kill `current'.

Cc: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memcg: show memcg information during OOM</title>
<updated>2009-04-03T02:04:55Z</updated>
<author>
<name>Balbir Singh</name>
<email>balbir@linux.vnet.ibm.com</email>
</author>
<published>2009-04-02T23:57:39Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=e222432bfa7dcf6ec008622a978c9f284ed5e3a9'/>
<id>urn:sha1:e222432bfa7dcf6ec008622a978c9f284ed5e3a9</id>
<content type='text'>
Add RSS and swap to OOM output from memcg

Display memcg values like failcnt, usage and limit when an OOM occurs due
to memcg.

Thanks to Johannes Weiner, Li Zefan, David Rientjes, Kamezawa Hiroyuki,
Daisuke Nishimura and KOSAKI Motohiro for review.

Sample output
-------------

Task in /a/x killed as a result of limit of /a
memory: usage 1048576kB, limit 1048576kB, failcnt 4183
memory+swap: usage 1400964kB, limit 9007199254740991kB, failcnt 0

[akpm@linux-foundation.org: compilation fix]
[akpm@linux-foundation.org: fix kerneldoc and whitespace]
[akpm@linux-foundation.org: add printk facility level]
Signed-off-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Daisuke Nishimura &lt;nishimura@mxp.nes.nec.co.jp&gt;
Cc: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>oom_kill: don't call for int_sqrt(0)</title>
<updated>2009-04-01T15:59:11Z</updated>
<author>
<name>Cyrill Gorcunov</name>
<email>gorcunov@gmail.com</email>
</author>
<published>2009-03-31T22:19:27Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=a12888f772dab4bf5e6f73668dc4f5f6026a7014'/>
<id>urn:sha1:a12888f772dab4bf5e6f73668dc4f5f6026a7014</id>
<content type='text'>
There is no need to call for int_sqrt if argument is 0.

Signed-off-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Cc: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memcg: avoid deadlock caused by race between oom and cpuset_attach</title>
<updated>2009-01-08T16:31:09Z</updated>
<author>
<name>Daisuke Nishimura</name>
<email>nishimura@mxp.nes.nec.co.jp</email>
</author>
<published>2009-01-08T02:08:29Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=7f4d454dee2e0bdd21bafd413d1c53e443a26540'/>
<id>urn:sha1:7f4d454dee2e0bdd21bafd413d1c53e443a26540</id>
<content type='text'>
mpol_rebind_mm(), which can be called from cpuset_attach(), does
down_write(mm-&gt;mmap_sem).  This means down_write(mm-&gt;mmap_sem) can be
called under cgroup_mutex.

OTOH, page fault path does down_read(mm-&gt;mmap_sem) and calls
mem_cgroup_try_charge_xxx(), which may eventually calls
mem_cgroup_out_of_memory().  And mem_cgroup_out_of_memory() calls
cgroup_lock().  This means cgroup_lock() can be called under
down_read(mm-&gt;mmap_sem).

If those two paths race, deadlock can happen.

This patch avoid this deadlock by:
  - remove cgroup_lock() from mem_cgroup_out_of_memory().
  - define new mutex (memcg_tasklist) and serialize mem_cgroup_move_task()
    (-&gt;attach handler of memory cgroup) and mem_cgroup_out_of_memory.

Signed-off-by: Daisuke Nishimura &lt;nishimura@mxp.nes.nec.co.jp&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memcg: avoid unnecessary system-wide-oom-killer</title>
<updated>2009-01-08T16:31:06Z</updated>
<author>
<name>KAMEZAWA Hiroyuki</name>
<email>kamezawa.hiroyu@jp.fujitsu.com</email>
</author>
<published>2009-01-08T02:08:08Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=a636b327f731143ccc544b966cfd8de6cb6d72c6'/>
<id>urn:sha1:a636b327f731143ccc544b966cfd8de6cb6d72c6</id>
<content type='text'>
Current mmtom has new oom function as pagefault_out_of_memory().  It's
added for select bad process rathar than killing current.

When memcg hit limit and calls OOM at page_fault, this handler called and
system-wide-oom handling happens.  (means kernel panics if panic_on_oom is
true....)

To avoid overkill, check memcg's recent behavior before starting
system-wide-oom.

And this patch also fixes to guarantee "don't accnout against process with
TIF_MEMDIE".  This is necessary for smooth OOM.

[akpm@linux-foundation.org: build fix]
Signed-off-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Cc: Balbir Singh &lt;balbir@in.ibm.com&gt;
Cc: Daisuke Nishimura &lt;nishimura@mxp.nes.nec.co.jp&gt;
Cc: Badari Pulavarty &lt;pbadari@us.ibm.com&gt;
Cc: Jan Blunck &lt;jblunck@suse.de&gt;
Cc: Hirokazu Takahashi &lt;taka@valinux.co.jp&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>oom: print triggering task's cpuset and mems allowed</title>
<updated>2009-01-06T23:58:59Z</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2009-01-06T22:39:01Z</published>
<link rel='alternate' type='text/html' href='https://sre.ring0.de/linux/commit/?id=75aa199410359dc5fbcf9025ff7af98a9d20f0d5'/>
<id>urn:sha1:75aa199410359dc5fbcf9025ff7af98a9d20f0d5</id>
<content type='text'>
When cpusets are enabled, it's necessary to print the triggering task's
set of allowable nodes so the subsequently printed meminfo can be
interpreted correctly.

We also print the task's cpuset name for informational purposes.

[rientjes@google.com: task lock current before dereferencing cpuset]
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
