summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2015-12-16perf tools: Document the fact that parse_options*() may exitJosh Poimboeuf1-0/+3
Generally, calling exit() from a library is bad practice. Eventually these functions might be redesigned so that they don't exit. For now, just document the fact that they do. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/97b1af06cc3b18dd0f49e655d6d659eaa64ecde5.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-16perf tools: Move strlcpy() from perf to tools/lib/string.cJosh Poimboeuf4-23/+33
strlcpy() will be needed by the subcmd library. Move it to the shared tools/lib/string.c file which can be used by other tools. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/71e2804b973bf39ad3d3b9be10f99f2ea630be46.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-16tools build: Fix feature Makefile issues with 'O='Josh Poimboeuf2-47/+48
When building perf binaries outside the source tree with 'make O=<dir>', the auto-detected features get re-tested for every build, which is unnecessary and inconsistent with the behavior seen when building directly in the source tree. Another issue is that 'make O=<dir> clean' doesn't remove the feature files from the object tree. Fix these problems by looking for the binaries in the $(OUTPUT) directory. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/113bd01530e9761778c60a75a96c65fc59860f68.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-15perf record: Add record.build-id config optionNamhyung Kim2-1/+26
Post processing at 'perf record' takes a long time on big machines. What it does is to find the build-id of binaries found in the event stream, so that it can make sure, at 'report' time, that the symtabs (be it ELF, kallsyms, etc) being used to resolve symbols are the ones matching the binaries found at 'record' time. Sometimes we just want to skip this processing of events at the end of the session to get quicker results, making sure the binaries haven't changed from 'record' to 'report' time. Add a new config option to control this behavior. The record.build-id config variable can have one of the following values: - cache: post-process data and save/update the binaries into the build-id cache (in ~/.debug). This is the default. - no-cache: post-process the data but not update the build-id cache. Same effect as using the -N option. - skip: skip post-processing and do not update the cache. Same effect as using the -B option. Reported-and-Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/r/1450144196-22957-1-git-send-email-namhyung@kernel.org [ Added some more text to the documentation ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf record: Support custom vmlinux pathHe Kuang2-2/+24
Make perf-record command support --vmlinux option if BPF_PROLOGUE is on. 'perf record' needs vmlinux as the source of DWARF info to generate prologue for BPF programs, so path of vmlinux should be specified. Short name 'k' has been taken by 'clockid'. This patch skips the short option name and uses '--vmlinux' for vmlinux path. Documentation is also updated. Test result: In a production (or broken) environment: (by: # rm -rf ~/.debug/ # mv /lib/modules/`uname -r`/build/vmlinux /tmp/ ) # ./perf record -e ./test_bpf_base.c ls Failed to find the path for kernel: No such file or directory event syntax error: './test_bpf_base.c' \___ You need to check probing points in BPF file ... # ./perf record --vmlinux /tmp/vmlinux -e ./test_bpf_base.c ls ... [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data ] Help messages when build with NO_LIBBPF: # ./perf record -h --transaction sample transaction flags (special events only) --vmlinux <file> vmlinux pathname (not built-in because NO_LIBBPF=1) # ./perf record --vmlinux /tmp/vmlinux ls / Warning: option `vmlinux' is being ignored because NO_LIBBPF=1 ... [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (11 samples) ] Help messages when build with NO_DWARF: # ./perf record -h --transaction sample transaction flags (special events only) --vmlinux <file> vmlinux pathname (not built-in because NO_DWARF=1) Signed-off-by: He Kuang <hekuang@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1450089563-122430-15-git-send-email-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf tools: Make options always available, even if required libs not linkedWang Nan4-13/+134
This patch keeps options of perf builtins same in all conditions. If one option is disabled because of compiling options, users should be notified. Masami suggested another implementation in [1] that, by adding a OPTION_NEXT_DEPENDS option before those options in the 'struct option' array, options parser knows an option is disabled. However, in some cases this array is reordered (options__order()). In addition, in parse-option.c that array is const, so we can't simply merge information in decorator option into the affacted option. This patch chooses a simpler implementation that, introducing a set_option_nobuild() function and two option parsing flags. Builtins with such options should call set_option_nobuild() before option parsing. The complexity of this patch is because we want some of options can be skipped safely. In this case their arguments should also be consumed. Options in 'perf record' and 'perf probe' are fixed in this patch. [1] http://lkml.kernel.org/g/50399556C9727B4D88A595C8584AAB3752627CD4@GSjpTKYDCembx32.service.hitachi.net Test result: Normal case: # ./perf probe --vmlinux /tmp/vmlinux sys_write Added new event: probe:sys_write (on sys_write) You can now use it in all perf tools, such as: perf record -e probe:sys_write -aR sleep 1 Build with NO_DWARF=1: # ./perf probe -L sys_write Error: switch `L' is not available because NO_DWARF=1 Usage: perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...] or: perf probe [<options>] --add 'PROBEDEF' [--add 'PROBEDEF' ...] or: perf probe [<options>] --del '[GROUP:]EVENT' ... or: perf probe --list [GROUP:]EVENT ... or: perf probe [<options>] --funcs -L, --line <FUNC[:RLN[+NUM|-RLN2]]|SRC:ALN[+NUM|-ALN2]> Show source code lines. (not built-in because NO_DWARF=1) # ./perf probe -k /tmp/vmlinux sys_write Warning: switch `k' is being ignored because NO_DWARF=1 Added new event: probe:sys_write (on sys_write) You can now use it in all perf tools, such as: perf record -e probe:sys_write -aR sleep 1 # ./perf probe --vmlinux /tmp/vmlinux sys_write Warning: option `vmlinux' is being ignored because NO_DWARF=1 Added new event: [SNIP] # ./perf probe -l Usage: perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...] or: perf probe [<options>] --add 'PROBEDEF' [--add 'PROBEDEF' ...] ... -k, --vmlinux <file> vmlinux pathname (not built-in because NO_DWARF=1) -L, --line <FUNC[:RLN[+NUM|-RLN2]]|SRC:ALN[+NUM|-ALN2]> Show source code lines. (not built-in because NO_DWARF=1) ... -V, --vars <FUNC[@SRC][+OFF|%return|:RL|;PT]|SRC:AL|SRC;PT> Show accessible variables on PROBEDEF (not built-in because NO_DWARF=1) --externs Show external variables too (with --vars only) (not built-in because NO_DWARF=1) --no-inlines Don't search inlined functions (not built-in because NO_DWARF=1) --range Show variables location range in scope (with --vars only) (not built-in because NO_DWARF=1) Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1450089563-122430-14-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf tools: Convert parse-options.c internal functions to staticJosh Poimboeuf2-18/+9
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c027b5f47ec1055077f5650edb1c7ad37c191e6c.1449965119.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf tools: Move help_unknown_cmd() to its own fileJosh Poimboeuf5-104/+110
help_unknown_cmd() is quite perf-specific because it relies on some perf_config*() functions. Move it and its supporting functions out into a separate file so that help.c can be moved to a library. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/562d918bcaaf340c1ae3e47586b3f0ae33b9918b.1449965119.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf tools: Remove check for unused PERF_PAGER_IN_USEJosh Poimboeuf1-7/+1
PERF_PAGER_IN_USE doesn't seem to be used anywhere, so let's remove it. This will also make it easier to move pager.c into a separate library. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ed9e8370db9811746dc590544cf48c36dcfb1731.1449965119.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf tools: Create pager.hJosh Poimboeuf2-4/+8
Move the 'pager' function prototypes into a new pager.h so that the pager code can be moved out to a library. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ba7c316474dd6bfc047e5c6dc4dcab39a982caf5.1449965119.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf build: Rename LIB_PATH -> API_PATHJosh Poimboeuf1-4/+4
'LIB_PATH' is a misnomer because there are multiple library paths. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c10df0b749a27f05cc531fe06b8dd71a329341fa.1449965119.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf build: Fix 'make clean'Josh Poimboeuf1-3/+4
Add some missing files to the 'make clean' target. Reported-and-Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/8b1f5a5bd66a652be071d423e64aaa994254be31.1449965119.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf test: Remove tarpkg at end of testJosh Poimboeuf1-1/+2
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/5e7e97a23e3ce11b59d1009b39ebb6d2813a0560.1449965119.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf test: Add Build file to dependencies for llvm-src-*.cJosh Poimboeuf1-3/+3
Because the Build file writes source code to the generated llvm-src-*.c files, it should be listed as one of the dependencies, so that any future changes to the code being echoed won't require a 'make clean'. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/9b9886c295750dc83cbbb29a665d280f9c5e8b3e.1449965119.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf build: Remove unnecessary line in Makefile.featureJosh Poimboeuf1-1/+0
This line always silently fails because it doesn't add the 'test-' prefix to the .bin file. And it seems to be unnecessary anyway: the line immediately after it does all the individual feature checks. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/554a05c18af564ba015c9e68f25730126e0f4acb.1449965119.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf test: Fix hist testcases when kptr_restrict is onNamhyung Kim1-3/+2
Currently if kptr_restrict is enabled, all hist tests failed with segfaults. This is because machine__create_kernel_maps() in setup_fake_machine() failed in that situation, and it called machine__delete() on the error path. But outer callers again called machines__exit() causing double free for the host machine. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1450062673-22312-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf evsel: Disable branch flags/cycles for --callgraph lbrAndi Kleen1-1/+13
[The kernel patch needed for this is in tip now (b16a5b52eb9 perf/x86: Add option to disable ...) So this user tools patch to make use of it should be merged now] Automatically disable collecting branch flags and cycles with --call-graph lbr. This allows avoiding a bunch of extra MSR reads in the PMI on Skylake. When the kernel doesn't support the new flags they are automatically cleared in the fallback code. v2: Switch to use branch_sample_type instead of sample_type. Adjust description. Fix the fallback logic. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1449879144-29074-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf thread: Fix reference count initial stateArnaldo Carvalho de Melo3-11/+22
We should always return from thread__new(), the constructor, with the object with a reference count of one, so that: struct thread *thread = thread__new(); thread__put(thread); Will call thread__delete(). If any reference is made to that 'thread' variable, it better use thread__get(thread) to hold a reference. We were returning with thread->refcnt set to zero, fix it and some cases where thread__delete() was being called, which were not a problem because just one reference was being used, now that we set it to 1, use thread__put() instead. Reported-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-4b9mkuk66to4ecckpmpvqx6s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf test: Dump the stack when test segfaults when in verbose modeArnaldo Carvalho de Melo1-0/+3
E.g.: # perf test 26 26: Test mmap thread lookup : FAILED! # perf test -v 26 26: Test mmap thread lookup : --- start --- test child forked, pid 9269 tid = 9269, map = 0x7ff99ff0c000 tid = 9270, map = 0x7ff99ff0b000 tid = 9271, map = 0x7ff99ff0a000 tid = 9272, map = 0x7ff99ff09000 perf: Segmentation fault Obtained 13 stack frames. perf(sighandler_dump_stack+0x41) [0x4e3541] /lib64/libc.so.6(+0x34960) [0x7ff99d5f6960] perf(thread__put+0x5b) [0x4c6f6b] perf(machine__process_event+0x14e) [0x4bd37e] perf(perf_event__synthesize_threads+0x3aa) [0x48678a] perf(test__mmap_thread_lookup+0x20a) [0x474e0a] perf() [0x460d56] perf(cmd_test+0x589) [0x461319] perf() [0x47c641] perf(main+0x617) [0x422317] /lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff99d5e1fe0] perf() [0x422429] [(nil)] test child interrupted ---- end ---- Test mmap thread lookup: FAILED! # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-sypazzsl4ptctrmlyi2zcmaj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14perf tools: Use same signal handling strategy as 'record'Arnaldo Carvalho de Melo1-1/+2
I.e. don't exit with the signal number, instead set the signal handler to the default one and then raise it again. Noticed while trying to dump the stack at segfaults in the 'perf test' forked process used to run each test, that inspects signal info at each test. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-5x5r176wnoqxi5p6id05wv9w@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-14Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar5-36/+57
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Fix 'perf top' annotation in --stdio (Namhyung Kim) - Support hw breakpoint events (mem:0xAddress) in the default output mode in 'perf script' (Wang Nan) Infrastructure changes: - Do not hold the hists lock while emitting one specific warning (Namhyung Kim) - Fetch map names from correct strtab, worked so far because llvm/clang uses just one string table (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-14Merge tag 'v4.4-rc5' into perf/core, to pick up fixesIngo Molnar788-5167/+9886
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-13Linux 4.4-rc5v4.4-rc5Linus Torvalds1-1/+1
2015-12-13sched/wait: Fix the signal handling fixPeter Zijlstra8-28/+28
Jan Stancek reported that I wrecked things for him by fixing things for Vladimir :/ His report was due to an UNINTERRUPTIBLE wait getting -EINTR, which should not be possible, however my previous patch made this possible by unconditionally checking signal_pending(). We cannot use current->state as was done previously, because the instruction after the store to that variable it can be changed. We must instead pass the initial state along and use that. Fixes: 68985633bccb ("sched/wait: Fix signal handling in bit wait helpers") Reported-by: Jan Stancek <jstancek@redhat.com> Reported-by: Chris Mason <clm@fb.com> Tested-by: Jan Stancek <jstancek@redhat.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Chris Mason <clm@fb.com> Reviewed-by: Paul Turner <pjt@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: tglx@linutronix.de Cc: Oleg Nesterov <oleg@redhat.com> Cc: hpa@zytor.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-13Merge tag 'nfs-for-4.4-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds3-13/+14
Pull NFS client bugfix from Trond Myklebust: "SUNRPC: Fix a NFSv4.1 callback channel regression" * tag 'nfs-for-4.4-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Fix callback channel
2015-12-13Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds3-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixlets from Thomas Gleixner: "Two trivial fixes which add missing header fileas and forward declarations so the code will compile even when the magic include chains are different" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3: Add missing include for barrier.h irqchip/gic-v3: Add missing struct device_node declaration
2015-12-13Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A single fix to unbreak a clocksource driver which has more than 32bit counter width" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: Mmio: remove artificial 32bit limitation
2015-12-13Merge tag 'char-misc-4.4-rc5' of ↵Linus Torvalds1-9/+4
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull fpga driver fixes from Greg KH: "Only two small fpga driver fixes here, both have been in linux-next for a while, and resolve some reported issues" * tag 'char-misc-4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: fpga manager: Fix firmware resource leak on error fpga manager: remove label
2015-12-13Merge tag 'staging-4.4-rc5' of ↵Linus Torvalds7-16/+21
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are a few staging and IIO driver fixes for 4.4-rc5. All of them resolve reported problems and have been in linux-next for a while. Nothing major here, just small fixes where needed" * tag 'staging-4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: lustre: echo_copy.._lsm() dereferences userland pointers directly iio: adc: spmi-vadc: add missing of_node_put iio: fix some warning messages iio: light: apds9960: correct ->last_busy count iio: lidar: return -EINVAL on invalid signal staging: iio: dummy: complete IIO events delivery to userspace
2015-12-13Merge tag 'usb-4.4-rc5' of ↵Linus Torvalds30-79/+200
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB driver fixes from Greg KH: "Here are a number of small USB fixes for 4.4-rc5. All of them have been in linux-next. The majority are gadget and phy issues, with a few new quirks and device ids added as well" * tag 'usb-4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (32 commits) USB: add quirk for devices with broken LPM xhci: fix usb2 resume timing and races. usb: musb: fail with error when no DMA controller set usb: gadget: uvc: fix permissions of configfs attributes usb: musb: core: Fix pm runtime for deferred probe usb: phy: msm: fix a possible NULL dereference USB: host: ohci-at91: fix a crash in ohci_hcd_at91_overcurrent_irq usb: Quiet down false peer failure messages usb: xhci: fix config fail of FS hub behind a HS hub with MTT xhci: Fix memory leak in xhci_pme_acpi_rtd3_enable() usb: Use the USB_SS_MULT() macro to decode burst multiplier for log message USB: whci-hcd: add check for dma mapping error usb: core : hub: Fix BOS 'NULL pointer' kernel panic USB: quirks: Apply ALWAYS_POLL to all ELAN devices usb-storage: Fix scsi-sd failure "Invalid field in cdb" for USB adapter JMicron USB: quirks: Fix another ELAN touchscreen usb: dwc3: gadget: don't prestart interrupt endpoints USB: serial: Another Infineon flash loader USB ID USB: cdc_acm: Ignore Infineon Flash Loader utility USB: cp210x: Remove CP2110 ID from compatibility list ...
2015-12-12Merge tag 'fixes-for-linus' of ↵Linus Torvalds21-32/+76
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "Here are a bunch of small bug fixes for various ARM platforms, nothing really sticks out this week, most of either fixes bugs in code that was just added in 4.4, or that has been broken for many years without anyone noticing. at91/sama5d2: - fix sama5de hardware setup of sd/mmc interface - proper selection of pinctrl drivers. PIO4 is necessary for sama5d2 berlin: - fix incorrect clock input for SDIO exynos: - Fix potential NULL pointer dereference in Exynos PMU driver. imx: - Fix vf610 SAI clock configuration bug which is discovered by the newly added master mode support in SAI audio driver. - Fix buggy L2 cache latency values in vf610 device trees, which may cause system hang when cpu runs at a higher frequency. ixp4xx: - fix prototypes for readl/writel functions ls2080a: - use little-endian register access for GPIO and SDHCI omap: - Fix clock source for ARM TWD and global timers on am437x - Always select REGULATOR_FIXED_VOLTAGE for omap2+ instead of when MACH_OMAP3_PANDORA is selected - Fix SPI DMA handles for dm816x as only some were mapped - Fix up mbox cells for dm816x to make mailbox usable pxa: - use PWM lookup table for all ezx machines s3c24xx: - Remove incorrect __init annotation from s3c24xx cpufreq driver structures. versatile: - fix PCI IRQ mapping on Versatile PB" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ls2080a/dts: Add little endian property for GPIO IP block dt-bindings: define little-endian property for QorIQ GPIO ARM64: dts: ls2080a: fix eSDHC endianness ARM: dts: vf610: use reset values for L2 cache latencies ARM: pxa: use PWM lookup table for all machines ARM: dts: berlin: add 2nd clock for BG2Q sdhci0 and sdhci1 ARM: dts: berlin: correct BG2Q's sdhci2 2nd clock ARM: dts: am4372: fix clock source for arm twd and global timers ARM: at91: fix pinctrl driver selection ARM: at91/dt: add always-on to 1.8V regulator ARM: dts: vf610: fix clock definition for SAI2 ARM: imx: clk-vf610: fix SAI clock tree ARM: ixp4xx: fix read{b,w,l} return types irqchip/versatile-fpga: Fix PCI IRQ mapping on Versatile PB ARM: OMAP2+: enable REGULATOR_FIXED_VOLTAGE ARM: dts: add dm816x missing spi DT dma handles ARM: dts: add dm816x missing #mbox-cells cpufreq: s3c24xx: Do not mark s3c2410_plls_add as __init ARM: EXYNOS: Fix potential NULL pointer access in exynos_sys_powerdown_conf
2015-12-12Merge tag 'powerpc-4.4-4' of ↵Linus Torvalds4-48/+34
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - opal-irqchip: Fix double endian conversion from Alistair Popple - cxl: Set endianess of kernel contexts from Frederic Barrat - sbc8641: drop bogus PHY IRQ entries from DTS file from Paul Gortmaker - Revert "powerpc/eeh: Don't unfreeze PHB PE after reset" from Andrew Donnellan * tag 'powerpc-4.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: Revert "powerpc/eeh: Don't unfreeze PHB PE after reset" powerpc/sbc8641: drop bogus PHY IRQ entries from DTS file cxl: Set endianess of kernel contexts powerpc/opal-irqchip: Fix double endian conversion
2015-12-12Merge branch 'akpm' (patches from Andrew)Linus Torvalds18-62/+77
Merge misc fixes from Andrew Morton: "17 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: MIPS: fix DMA contiguous allocation sh64: fix __NR_fgetxattr ocfs2: fix SGID not inherited issue mm/oom_kill.c: avoid attempting to kill init sharing same memory drivers/base/memory.c: prohibit offlining of memory blocks with missing sections tmpfs: fix shmem_evict_inode() warnings on i_blocks mm/hugetlb.c: fix resv map memory leak for placeholder entries mm: hugetlb: call huge_pte_alloc() only if ptep is null kernel: remove stop_machine() Kconfig dependency mm: kmemleak: mark kmemleak_init prototype as __init mm: fix kerneldoc on mem_cgroup_replace_page osd fs: __r4w_get_page rely on PageUptodate for uptodate MAINTAINERS: make Vladimir co-maintainer of the memory controller mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress mm: fix swapped Movable and Reclaimable in /proc/pagetypeinfo memcg: fix memory.high target mm: hugetlb: fix hugepage memory leak caused by wrong reserve count
2015-12-12Merge branch 'parisc-4.4-3' of ↵Linus Torvalds5-27/+13
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "Fix the boot crash on Mako machines with Huge Pages, prevent a panic with SATA controllers (and others) by correctly calculating the IOMMU space, hook up the mlock2 syscall and drop unneeded code in the parisc pci code" * 'parisc-4.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Disable huge pages on Mako machines parisc: Wire up mlock2 syscall parisc: Remove unused pcibios_init_bus() parisc iommu: fix panic due to trying to allocate too large region
2015-12-12Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds9-89/+142
Pull block layer fixes from Jens Axboe: "A set of fixes for the current series. This contains: - A bunch of fixes for lightnvm, should be the last round for this series. From Matias and Wenwei. - A writeback detach inode fix from Ilya, also marked for stable. - A block (though it says SCSI) fix for an OOPS in SCSI runtime power management. - Module init error path fixes for null_blk from Minfei" * 'for-linus' of git://git.kernel.dk/linux-block: null_blk: Fix error path in module initialization lightnvm: do not compile in debugging by default lightnvm: prevent gennvm module unload on use lightnvm: fix media mgr registration lightnvm: replace req queue with nvmdev for lld lightnvm: comments on constants lightnvm: check mm before use lightnvm: refactor spin_unlock in gennvm_get_blk lightnvm: put blks when luns configure failed lightnvm: use flags in rrpc_get_blk block: detach bdev inode from its wb in __blkdev_put() SCSI: Fix NULL pointer dereference in runtime PM
2015-12-12Merge tag 'arm64-fixes' of ↵Linus Torvalds2-6/+11
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Update the linker script to use L1_CACHE_BYTES instead of hard-coded 64. We recently changed L1_CACHE_BYTES to 128 - Improve race condition reporting on set_pte_at() and change the BUG to WARN_ONCE. With hardware update of the accessed/dirty state, we need to ensure that set_pte_at() does not inadvertently override hardware updated state. The patch also makes the checks ignore !pte_valid() new entries * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Improve error reporting on set_pte_at() checks arm64: update linker script to increased L1_CACHE_BYTES value
2015-12-12MIPS: fix DMA contiguous allocationQais Yousef1-1/+1
Recent changes to how GFP_ATOMIC is defined seems to have broken the condition to use mips_alloc_from_contiguous() in mips_dma_alloc_coherent(). I couldn't bottom out the exact change but I think it's this commit d0164adc89f6 ("mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd"). GFP_ATOMIC has multiple bits set and the check for !(gfp & GFP_ATOMIC) isn't enough. The reason behind this condition is to check whether we can potentially do a sleeping memory allocation. Use gfpflags_allow_blocking() instead which should be more robust. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12sh64: fix __NR_fgetxattrDmitry V. Levin1-1/+1
According to arch/sh/kernel/syscalls_64.S and common sense, __NR_fgetxattr has to be defined to 259, but it doesn't. Instead, it's defined to 269, which is of course used by another syscall, __NR_sched_setaffinity in this case. This bug was found by strace test suite. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12ocfs2: fix SGID not inherited issueJunxiao Bi1-3/+1
Commit 8f1eb48758aa ("ocfs2: fix umask ignored issue") introduced an issue, SGID of sub dir was not inherited from its parents dir. It is because SGID is set into "inode->i_mode" in ocfs2_get_init_inode(), but is overwritten by "mode" which don't have SGID set later. Fixes: 8f1eb48758aa ("ocfs2: fix umask ignored issue") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm/oom_kill.c: avoid attempting to kill init sharing same memoryChen Jie1-0/+2
It's possible that an oom killed victim shares an ->mm with the init process and thus oom_kill_process() would end up trying to kill init as well. This has been shown in practice: Out of memory: Kill process 9134 (init) score 3 or sacrifice child Killed process 9134 (init) total-vm:1868kB, anon-rss:84kB, file-rss:572kB Kill process 1 (init) sharing same memory ... Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 And this will result in a kernel panic. If a process is forked by init and selected for oom kill while still sharing init_mm, then it's likely this system is in a recoverable state. However, it's better not to try to kill init and allow the machine to panic due to unkillable processes. [rientjes@google.com: rewrote changelog] [akpm@linux-foundation.org: fix inverted test, per Ben] Signed-off-by: Chen Jie <chenjie6@huawei.com> Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Li Zefan <lizefan@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12drivers/base/memory.c: prohibit offlining of memory blocks with missing sectionsSeth Jennings1-0/+4
Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on large-memory x86-64 systems") and 982792c782ef ("x86, mm: probe memory block size for generic x86 64bit") introduced large block sizes for x86. This made it possible to have multiple sections per memory block where previously, there was a only every one section per block. Since blocks consist of contiguous ranges of section, there can be holes in the blocks where sections are not present. If one attempts to offline such a block, a crash occurs since the code is not designed to deal with this. This patch is a quick fix to gaurd against the crash by not allowing blocks with non-present sections to be offlined. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=107781 Signed-off-by: Seth Jennings <sjennings@variantweb.net> Reported-by: Andrew Banman <abanman@sgi.com> Cc: Daniel J Blueman <daniel@numascale.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Greg KH <greg@kroah.com> Cc: Russ Anderson <rja@sgi.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12tmpfs: fix shmem_evict_inode() warnings on i_blocksHugh Dickins1-20/+14
Dmitry Vyukov provides a little program, autogenerated by syzkaller, which races a fault on a mapping of a sparse memfd object, against truncation of that object below the fault address: run repeatedly for a few minutes, it reliably generates shmem_evict_inode()'s WARN_ON(inode->i_blocks). (But there's nothing specific to memfd here, nor to the fstat which it happened to use to generate the fault: though that looked suspicious, since a shmem_recalc_inode() had been added there recently. The same problem can be reproduced with open+unlink in place of memfd_create, and with fstatfs in place of fstat.) v3.7 commit 0f3c42f522dc ("tmpfs: change final i_blocks BUG to WARNING") explains one cause of such a warning (a race with shmem_writepage to swap), and possible solutions; but we never took it further, and this syzkaller incident turns out to have a different cause. shmem_getpage_gfp()'s error recovery, when a freshly allocated page is then found to be beyond eof, looks plausible - decrementing the alloced count that was just before incremented - but in fact can go wrong, if a racing thread (the truncator, for example) gets its shmem_recalc_inode() in just after our delete_from_page_cache(). delete_from_page_cache() decrements nrpages, that shmem_recalc_inode() will balance the books by decrementing alloced itself, then our decrement of alloced take it one too low: leading to the WARNING when the object is finally evicted. Once the new page has been exposed in the page cache, shmem_getpage_gfp() must leave it to shmem_recalc_inode() itself to get the accounting right in all cases (and not fall through from "trunc:" to "decused:"). Adjust that error recovery block; and the reinitialization of info and sbinfo can be removed too. While we're here, fix shmem_writepage() to avoid the original issue: it will be safe against a racing shmem_recalc_inode(), if it merely increments swapped before the shmem_delete_from_page_cache() which decrements nrpages (but it must then do its own shmem_recalc_inode() before that, while still in balance, instead of after). (Aside: why do we shmem_recalc_inode() here in the swap path? Because its raison d'etre is to cope with clean sparse shmem pages being reclaimed behind our back: so here when swapping is a good place to look for that case.) But I've not now managed to reproduce this bug, even without the patch. I don't see why I didn't do that earlier: perhaps inhibited by the preference to eliminate shmem_recalc_inode() altogether. Driven by this incident, I do now have a patch to do so at last; but still want to sit on it for a bit, there's a couple of questions yet to be resolved. Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm/hugetlb.c: fix resv map memory leak for placeholder entriesMike Kravetz1-2/+12
Dmitry Vyukov reported the following memory leak unreferenced object 0xffff88002eaafd88 (size 32): comm "a.out", pid 5063, jiffies 4295774645 (age 15.810s) hex dump (first 32 bytes): 28 e9 4e 63 00 88 ff ff 28 e9 4e 63 00 88 ff ff (.Nc....(.Nc.... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: kmalloc include/linux/slab.h:458 region_chg+0x2d4/0x6b0 mm/hugetlb.c:398 __vma_reservation_common+0x2c3/0x390 mm/hugetlb.c:1791 vma_needs_reservation mm/hugetlb.c:1813 alloc_huge_page+0x19e/0xc70 mm/hugetlb.c:1845 hugetlb_no_page mm/hugetlb.c:3543 hugetlb_fault+0x7a1/0x1250 mm/hugetlb.c:3717 follow_hugetlb_page+0x339/0xc70 mm/hugetlb.c:3880 __get_user_pages+0x542/0xf30 mm/gup.c:497 populate_vma_page_range+0xde/0x110 mm/gup.c:919 __mm_populate+0x1c7/0x310 mm/gup.c:969 do_mlock+0x291/0x360 mm/mlock.c:637 SYSC_mlock2 mm/mlock.c:658 SyS_mlock2+0x4b/0x70 mm/mlock.c:648 Dmitry identified a potential memory leak in the routine region_chg, where a region descriptor is not free'ed on an error path. However, the root cause for the above memory leak resides in region_del. In this specific case, a "placeholder" entry is created in region_chg. The associated page allocation fails, and the placeholder entry is left in the reserve map. This is "by design" as the entry should be deleted when the map is released. The bug is in the region_del routine which is used to delete entries within a specific range (and when the map is released). region_del did not handle the case where a placeholder entry exactly matched the start of the range range to be deleted. In this case, the entry would not be deleted and leaked. The fix is to take these special placeholder entries into account in region_del. The region_chg error path leak is also fixed. Fixes: feba16e25a57 ("mm/hugetlb: add region_del() to delete a specific range of entries") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: <stable@vger.kernel.org> [4.3+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm: hugetlb: call huge_pte_alloc() only if ptep is nullNaoya Horiguchi1-4/+4
Currently at the beginning of hugetlb_fault(), we call huge_pte_offset() and check whether the obtained *ptep is a migration/hwpoison entry or not. And if not, then we get to call huge_pte_alloc(). This is racy because the *ptep could turn into migration/hwpoison entry after the huge_pte_offset() check. This race results in BUG_ON in huge_pte_alloc(). We don't have to call huge_pte_alloc() when the huge_pte_offset() returns non-NULL, so let's fix this bug with moving the code into else block. Note that the *ptep could turn into a migration/hwpoison entry after this block, but that's not a problem because we have another !pte_present check later (we never go into hugetlb_no_page() in that case.) Fixes: 290408d4a250 ("hugetlb: hugepage migration core") Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: <stable@vger.kernel.org> [2.6.36+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12kernel: remove stop_machine() Kconfig dependencyChris Wilson3-12/+5
Currently the full stop_machine() routine is only enabled on SMP if module unloading is enabled, or if the CPUs are hotpluggable. This leads to configurations where stop_machine() is broken as it will then only run the callback on the local CPU with irqs disabled, and not stop the other CPUs or run the callback on them. For example, this breaks MTRR setup on x86 in certain configs since ea8596bb2d8d379 ("kprobes/x86: Remove unused text_poke_smp() and text_poke_smp_batch() functions") as the MTRR is only established on the boot CPU. This patch removes the Kconfig option for STOP_MACHINE and uses the SMP and HOTPLUG_CPU config options to compile the correct stop_machine() for the architecture, removing the false dependency on MODULE_UNLOAD in the process. Link: https://lkml.org/lkml/2014/10/8/124 References: https://bugs.freedesktop.org/show_bug.cgi?id=84794 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Pranith Kumar <bobby.prani@gmail.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Tejun Heo <tj@kernel.org> Cc: Iulia Manda <iulia.manda21@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Chuck Ebbert <cebbert.lkml@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm: kmemleak: mark kmemleak_init prototype as __initNicolas Iooss1-1/+1
The kmemleak_init() definition in mm/kmemleak.c is marked __init but its prototype in include/linux/kmemleak.h is marked __ref since commit a6186d89c913 ("kmemleak: Mark the early log buffer as __initdata"). This causes a section mismatch which is reported as a warning when building with clang -Wsection, because kmemleak_init() is declared in section .ref.text but defined in .init.text. Fix this by marking kmemleak_init() prototype __init. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm: fix kerneldoc on mem_cgroup_replace_pageHugh Dickins1-1/+1
Whoops, I missed removing the kerneldoc comment of the lrucare arg removed from mem_cgroup_replace_page; but it's a good comment, keep it. Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12osd fs: __r4w_get_page rely on PageUptodate for uptodateHugh Dickins2-8/+2
Commit 42cb14b110a5 ("mm: migrate dirty page without clear_page_dirty_for_io etc") simplified the migration of a PageDirty pagecache page: one stat needs moving from zone to zone and that's about all. It's convenient and safest for it to shift the PageDirty bit from old page to new, just before updating the zone stats: before copying data and marking the new PageUptodate. This is all done while both pages are isolated and locked, just as before; and just as before, there's a moment when the new page is visible in the radix_tree, but not yet PageUptodate. What's new is that it may now be briefly visible as PageDirty before it is PageUptodate. When I scoured the tree to see if this could cause a problem anywhere, the only places I found were in two similar functions __r4w_get_page(): which look up a page with find_get_page() (not using page lock), then claim it's uptodate if it's PageDirty or PageWriteback or PageUptodate. I'm not sure whether that was right before, but now it might be wrong (on rare occasions): only claim the page is uptodate if PageUptodate. Or perhaps the page in question could never be migratable anyway? Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Boaz Harrosh <ooo@electrozaur.com> Cc: Benny Halevy <bhalevy@panasas.com> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12MAINTAINERS: make Vladimir co-maintainer of the memory controllerJohannes Weiner1-0/+1
Vladimir architected and authored much of the current state of the memcg's slab memory accounting and tracking. Make sure he gets CC'd on bug reports ;-) Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-12mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any ↵Michal Hocko2-5/+20
progress Tetsuo Handa has reported that the system might basically livelock in OOM condition without triggering the OOM killer. The issue is caused by internal dependency of the direct reclaim on vmstat counter updates (via zone_reclaimable) which are performed from the workqueue context. If all the current workers get assigned to an allocation request, though, they will be looping inside the allocator trying to reclaim memory but zone_reclaimable can see stalled numbers so it will consider a zone reclaimable even though it has been scanned way too much. WQ concurrency logic will not consider this situation as a congested workqueue because it relies that worker would have to sleep in such a situation. This also means that it doesn't try to spawn new workers or invoke the rescuer thread if the one is assigned to the queue. In order to fix this issue we need to do two things. First we have to let wq concurrency code know that we are in trouble so we have to do a short sleep. In order to prevent from issues handled by 0e093d99763e ("writeback: do not sleep on the congestion queue if there are no congested BDIs or if significant congestion is not being encountered in the current zone") we limit the sleep only to worker threads which are the ones of the interest anyway. The second thing to do is to create a dedicated workqueue for vmstat and mark it WQ_MEM_RECLAIM to note it participates in the reclaim and to have a spare worker thread for it. Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Tejun Heo <tj@kernel.org> Cc: Cristopher Lameter <clameter@sgi.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Arkadiusz Miskiewicz <arekm@maven.pl> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>