summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2013-11-11perf tools: Prevent condition that all sort keys are elidedNamhyung Kim1-0/+13
If given sort keys are all elided there'll be no output except for the overhead column - actually the TUI shows a noisy output. In this case it'd be better to show up the sort keys rather than elide. Before: $ perf report -s comm -c perf (...) # Overhead # ........ # 100.00% After: $ perf report -s comm -c perf (...) # Overhead Command # ........ ....... # 100.00% perf Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383900822-14609-1-git-send-email-namhyung@kernel.org [ Us curly braces around multi-line statements, as requested by Ingo Molnar ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11perf machine: Simplify synthesize_threads methodArnaldo Carvalho de Melo6-16/+22
Several tools (top, kvm) don't need to be called back to process each of the syntheiszed records, instead relying on the machine__process_event function to change the per machine data structures that represent threads and mmaps, so provide a way to ask for this common idiom. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-pusqibp8n3c4ynegd1frn4zd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11perf machine: Introduce synthesize_threads method out of open coded equivalentArnaldo Carvalho de Melo6-39/+26
Further simplifications to be done on following patch, as most tools don't use the callback, using instead just the canned machine__process_event one. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-r1m0vuuj3cat4bampno9yc8d@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11perf record: Synthesize non-exec MMAP records when --data usedArnaldo Carvalho de Melo7-32/+42
When perf_event_attr.mmap_data is set the kernel will generate PERF_RECORD_MMAP events when non-exec (data, SysV mem) mmaps are created, so we need to synthesize from /proc/pid/maps for existing threads, as we do for exec mmaps. Right now just 'perf record' does it, but any other tool that uses perf_event__synthesize_thread(s|map) can request it. Reported-by: Don Zickus <dzickus@redhat.com> Tested-by: Don Zickus <dzickus@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Bill Gray <bgray@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Fowles <rfowles@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ihwzraikx23ian9txinogvv2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11perf evsel: Remove idx parm from constructorArnaldo Carvalho de Melo12-28/+37
Most uses of the evsel constructor are followed by a call to perf_evlist__add with an idex of evlist->nr_entries, so make rename the current constructor to perf_evsel__new_idx and remove the need for passing the constructor for the common case. We still need the new_idx variant because the way groups are handled, with evsel->nr_members holding the number of entries in an evlist, partitioning the evlist into sublists inside a single linked list. This asks for a clarifying refactoring, but for now simplify the non parser cases, so that tool writers don't have to bother with evsel idx setting. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-zy9tskx6jqm2rmw7468zze2a@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11perf ui tui progress: Don't force a refresh during progress updatePatrick Palka1-1/+2
Each call to tui_progress__update() would forcibly refresh the entire screen. This is somewhat inefficient and causes noticable flickering during the startup of perf-report, especially on large/slow terminals. It looks like the force-refresh in tui_progress__update() serves no purpose other than to clear the screen so that the progress bar of a previous operation does not subsume that of a subsequent operation. But we can do just that in a much more efficient manner by clearing only the region that a previous progress bar may have occupied before repainting the new progress bar. Then the force-refresh could be removed with no change in visuals. This patch disables the slow force-refresh in tui_progress__update() and instead calls SLsmg_fill_region() on the entire area that the progress bar may occupy before repainting it. This change makes the startup of perf-report much faster and appear much "smoother". It turns out that this was a big bottleneck in the startup speed of perf-report -- with this patch, perf-report starts up ~2x faster (1.1s vs 0.55s) on my machines. (These numbers were measured by running "time perf report" on an 8MB perf.data and pressing 'q' immediately.) Signed-off-by: Patrick Palka <patrick@parcs.ath.cx> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1382747149-9716-1-git-send-email-patrick@parcs.ath.cx Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11Merge branch 'uprobes/core' of ↵Ingo Molnar1-23/+22
git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core Pull uprobes fixes from Oleg Nesterov. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-09uprobes: Fix the memory out of bound overwrite in copy_insn()Oleg Nesterov1-22/+21
1. copy_insn() doesn't look very nice, all calculations are confusing and it is not immediately clear why do we read the 2nd page first. 2. The usage of inode->i_size is wrong on 32-bit machines. 3. "Instruction at end of binary" logic is simply wrong, it doesn't handle the case when uprobe->offset > inode->i_size. In this case "bytes" overflows, and __copy_insn() writes to the memory outside of uprobe->arch.insn. Yes, uprobe_register() checks i_size_read(), but this file can be truncated after that. All i_size checks are racy, we do this only to catch the obvious mistakes. Change copy_insn() to call __copy_insn() in a loop, simplify and fix the bytes/nbytes calculations. Note: we do not care if we read extra bytes after inode->i_size if we got the valid page. This is fine because the task gets the same page after page-fault, and arch_uprobe_analyze_insn() can't know how many bytes were actually read anyway. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-09uprobes: Fix the wrong usage of current->utask in uprobe_copy_process()Oleg Nesterov1-1/+1
Commit aa59c53fd459 "uprobes: Change uprobe_copy_process() to dup xol_area" has a stupid typo, we need to setup t->utask->vaddr but the code wrongly uses current->utask. Even with this bug dup_xol_work() works "in practice", but only because get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE) likely returns the same address every time. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-07Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar13-59/+239
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: * Fix version when building out of tree, as when using one of these: $ make help | grep perf perf-tar-src-pkg - Build perf-3.12.0.tar source tarball perf-targz-src-pkg - Build perf-3.12.0.tar.gz source tarball perf-tarbz2-src-pkg - Build perf-3.12.0.tar.bz2 source tarball perf-tarxz-src-pkg - Build perf-3.12.0.tar.xz source tarball $ from David Ahern. * Don't relookup fields by name in each sample in 'trace', by Arnaldo Carvalho de Melo. * 'perf record' code cleanups, from David Ahern. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-07perf tools: Remove unneeded includeRodrigo Campos1-1/+0
There is no point in sort.h including itself. The include was added when the file was created, in commit "perf tools: Create util/sort.and use it" (dd68ada2d) and added a include to "sort.h" in lot of files (all the files that started using the file). It was probably added by mistake on sort.h too. Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383776454-10595-1-git-send-email-rodrigo@sdfg.com.ar Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07perf record: Remove post_processing_offset variableDavid Ahern1-5/+3
Duplicates the data_offset from header in the session. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383763297-27066-4-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07perf record: Remove advance_output functionDavid Ahern1-6/+1
1 line function with only 1 user; might as well embed directly. Signed-off-by: David Ahern <dsahern@gmail.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383763297-27066-3-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07perf record: Refactor feature handling into a separate functionDavid Ahern1-12/+21
Code move only. No logic changes. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383763297-27066-2-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07perf trace: Don't relookup fields by name in each sampleArnaldo Carvalho de Melo1-11/+188
Instead do the lookups just when creating the tracepoints, initially for the most common, raw_syscalls:sys_{enter,exit}. It works by having evsel->priv have a per tracepoint structure with entries for the fields, for direct access, with the offset and a function to get the value from the sample, doing the swap if needed. Using a simple workload that does M millions write syscalls, we go from: # perf stat -i -e cycles /tmp/oldperf trace ./sc_hello 100 > /dev/null Performance counter stats for '/tmp/oldperf trace ./sc_hello 100': 8,366,771,459 cycles 2.668025928 seconds time elapsed # perf stat -i -e cycles perf trace ./sc_hello 100 > /dev/null Performance counter stats for 'perf trace ./sc_hello 100': 8,345,187,650 cycles 2.631748425 seconds time elapsed Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-eyfhvoo510a5i10b27dnvm88@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07perf tools: Fix version when building out of treeDavid Ahern2-1/+6
When building perf out of tree: $ make perf-tar-src-pkg $ tar -xf perf-<ver>.tar -C /tmp $ cd /tmp/perf<ver> $ make -C tools/perf you get this warning message: make[1]: *** No rule to make target `kernelversion'. Stop. Fix it by saving the perf version in the tar file and using that for the out of tree builds. v2: removed short form request and fixed up version string from usual output. Signed-off-by: David Ahern <dsahern@gmail.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1383753335-25782-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07perf evsel: Ditch evsel->handler.data fieldArnaldo Carvalho de Melo9-23/+20
Not needed since this cset: fcf65bf149af: perf evsel: Cache associated event_format So lets trim this struct a bit. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-j8setslokt0goiwxq9dogzqm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07Merge branch 'uprobes/core' of ↵Ingo Molnar4-29/+24
git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core Pull uprobes updates from Oleg Nesterov: " [...] this way the upcoming ARM port doesn't (almost) need changes outside of arch/arm and thus it would be simpler to route everything via the ARM trees. " Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06uprobes: Export write_opcode() as uprobe_write_opcode()Oleg Nesterov2-7/+8
set_swbp() and set_orig_insn() are __weak, but this is pointless because write_opcode() is static. Export write_opcode() as uprobe_write_opcode() for the upcoming arm port, this way it can actually override set_swbp() and use __opcode_to_mem_arm(bpinsn) instead if UPROBE_SWBP_INSN. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06uprobes: Introduce arch_uprobe->ixolOleg Nesterov3-2/+7
Currently xol_get_insn_slot() assumes that we should simply copy arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn) just the copy of the original insn. This is not true for arm which needs to create another insn to execute it out-of-line. So this patch simply adds the new member, ->ixol into the union. This doesn't make any difference for x86 and powerpc, but arm can divorce insn/ixol and initialize the correct xol insn in arch_uprobe_analyze_insn(). Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06uprobes: Kill module_init() and module_exit()Oleg Nesterov1-6/+1
Turn module_init() into __initcall() and kill module_exit(). This code can't be compiled as a module so these module_*() calls only add the confusion, especially if arch-dependant code needs its own initialization hooks. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06uprobes: Move function declarations out of archDavid A. Long3-14/+8
Move the function declarations from the arch headers to the common header, since only the function bodies are architecture-specific. These changes are from Vincent Rabin's uprobes patch. [ oleg: update arch/powerpc/include/asm/uprobes.h ] Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: David A. Long <dave.long@linaro.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06perf/x86/intel: Add Ivy Bridge-EP uncore IRP box supportYan, Zheng1-0/+73
Unlike other uncore boxes, IRP boxes live in PCI buses with no UBOX device. For PCI bus without UBOX device, we find the next bus that has UBOX device and use its 'bus to socket' mapping. Besides the counter/control registers in IRP boxes are not properly aligned. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: eranian@google.com Cc: "Yan Zheng" <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1383197815-17706-2-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxesYan, Zheng1-10/+51
The encoding for filter registers of IvyBridge-EP uncore QPI boxes is completely the same as SandyBridge-EP. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: eranian@google.com Cc: "Yan Zheng" <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1383197815-17706-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Factor out strncpy() in perf_event_mmap_event()Oleg Nesterov1-16/+16
While this is really minor, but strncpy() does the unnecessary zero-padding till the end of tmp[16] and it is called every time we are going to use the string literal. Turn these strncpy()'s into the single strlcpy() under the new label, saves 72 bytes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131017182417.GA17753@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06tools/perf: Add required memory barriersPeter Zijlstra3-16/+49
To match patch bf378d341e48 ("perf: Fix perf ring buffer memory ordering") change userspace to also adhere to the ordering outlined. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Link: http://lkml.kernel.org/r/20131030104246.GH16117@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Fix arch_perf_out_copy_user defaultPeter Zijlstra6-16/+33
The arch_perf_output_copy_user() default of __copy_from_user_inatomic() returns bytes not copied, while all other argument functions given DEFINE_OUTPUT_COPY() return bytes copied. Since copy_from_user_nmi() is the odd duck out by returning bytes copied where all other *copy_{to,from}* functions return bytes not copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes not copied. Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied while expecting its worker functions to return bytes copied. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: will.deacon@arm.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Update a stale commentPeter Zijlstra1-2/+2
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-9s5mze78gmlz19agt39i8rii@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Optimize perf_output_begin() -- address calculationPeter Zijlstra1-7/+7
Rewrite the handle address calculation code to be clearer. Saves 8 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-3trb2n2henb9m27tncef3ag7@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Optimize perf_output_begin() -- lost_event casePeter Zijlstra1-5/+8
Avoid touching the lost_event and sample_data cachelines twince. Its not like we end up doing less work, but it might help to keep all accesses to these cachelines in one place. Due to code shuffle, this looses 4 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-zfxnc58qxj0eawdoj31hhupv@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Optimize perf_output_begin()Peter Zijlstra1-8/+9
There's no point in re-doing the memory-barrier when we fail the cmpxchg(). Also placing it after the space reservation loop makes it clearer it only separates the userpage->tail read from the data stores. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-c19u6egfldyx86tpyc3zgkw9@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Add unlikely() to the ring-buffer codePeter Zijlstra1-8/+8
Add unlikely() annotations to 'slow' paths: When having a sampling event but no output buffer; you have bigger issues -- also the bail is still faster than actually doing the work. When having a sampling event but a control page only buffer, you have bigger issues -- again the bail is still faster than actually doing work. Optimize for the case where you're not loosing events -- again, not doing the work is still faster but make sure that when you have to actually do work its as fast as possible. The typical watermark is 1/2 the buffer size, so most events will not take this path. Shrinks perf_output_begin() by 16 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-wlg3jew3qnutm8opd0hyeuwn@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Simplify the ring-buffer codePeter Zijlstra1-33/+4
By using CIRC_SPACE() we can obviate the need for perf_output_space(). Shrinks the size of perf_output_begin() by 17 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-vtb0xb0llebmsdlfn1v5vtfj@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar45-497/+591
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: * Check maximum frequency rate for record/top, emitting better error messages, from Jiri Olsa. * Disable live kvm command if timerfd is not supported, from David Ahern. * Add usage to 'perf list', from David Ahern. * Fix detection of non-core features, from David Ahern. * Consolidate __hists__add_*entry(), cleanup from Namhyung Kim. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-05perf tools: Finish the removal of 'self' argumentsArnaldo Carvalho de Melo21-246/+242
They convey no information, perhaps I was bitten by some snake at some point, complete the detox by naming the last of those arguments more sensibly. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-u1r0dnjoro08dgztiy2g3t2q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05perf tools: Check maximum frequency rate for record/topJiri Olsa4-28/+74
Adding the check for maximum allowed frequency rate defined in following file: /proc/sys/kernel/perf_event_max_sample_rate When we cross the maximum value we fail and display detailed error message with advise. $ perf record -F 3000 ls Maximum frequency rate (2000) reached. Please use -F freq option with lower value or consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. In case user does not specify the frequency and the default value cross the maximum, we display warning and set the frequency value to the current maximum. $ perf record ls Lowering default frequency rate to 2000. Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. Same messages are used for 'perf top'. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383660887-1734-4-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05perf fs: Add procfs supportJiri Olsa3-2/+19
Adding procfs support into fs class. The interface function: const char *procfs__mountpoint(void); provides existing mountpoint path for procfs. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383660887-1734-3-git-send-email-jolsa@redhat.com [ Fixup namespace ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05perf fs: Rename NAME_find_mountpoint() to NAME__mountpoint()Arnaldo Carvalho de Melo5-21/+16
Shorten it, "finding" it is an implementation detail, what callers want is the pathname, not to ask for it to _always_ do the lookup. And the existing implementation already caches it, i.e. it doesn't "finds" it on every call. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-r24wa4bvtccg7mnkessrbbdj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05perf tools: Factor sysfs code into generic fs objectJiri Olsa9-72/+119
Moving sysfs code into generic fs object and preparing it to carry procfs support. This should be merged with tools/lib/lk/debugfs.c at some point in the future. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383660887-1734-2-git-send-email-jolsa@redhat.com [ Added fs__ namespace qualifier to some more functions ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05perf list: Add usageDavid Ahern1-3/+14
Currently 'perf list' is not very helpful if you forget the syntax: $ perf list -h List of pre-defined events (to be used in -e): After: $ perf list -h usage: perf list [hw|sw|cache|tracepoint|pmu|event_glob] Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/527133AD.4030003@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05perf list: Remove a level of indentationDavid Ahern1-36/+37
With a return after the if check an indentation level can be removed. Indentation shift only; no functional changes. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383149707-1008-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05tools/perf/build: Fix detection of non-core featuresDavid Ahern1-5/+5
feature_check needs to be invoked through call, and LDFLAGS may not be set so quotes are needed. Thanks to Jiri for spotting the quotes around LDFLAGS; that one was driving me nuts with the upcoming timerfd feature detection. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383064996-20933-1-git-send-email-dsahern@gmail.com [ Fixed conflict with 8a0c4c2843d3 ("perf tools: Fix libunwind build and feature detection for 32-bit build") ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-05perf kvm: Disable live command if timerfd is not supportedDavid Ahern5-1/+47
If the OS does not have timerfd support (e.g., older OS'es like RHEL5) disable perf kvm stat live. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383064996-20933-2-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-04perf hists: Consolidate __hists__add_*entry()Namhyung Kim7-95/+30
The __hists__add_{branch,mem}_entry() does almost the same thing that __hists__add_entry() does. Consolidate them into one. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383202576-28141-2-git-send-email-namhyung@kernel.org [ Fixup clash with new COMM infrastructure ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-04Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar35-303/+722
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: * Add new COMM infrastructure, further improving histogram processing, from Frédéric Weisbecker, one fix from Namhyung Kim. * Enhance option parse error message, showing just the help lines of the options affected, from Namhyung Kim. * Fixup PERF_SAMPLE_TRANSACTION handling in sample synthesizing and 'perf test', from Adrian Hunter. * Set up output options for in-stream attributes, from Adrian Hunter. * Fix 32-bit cross build, from Adrian Hunter. * Fix libunwind build and feature detection for 32-bit build, from Adrian Hunter. * Always use perf_evsel__set_sample_bit to set sample_type, from Adrian Hunter. perf evlist: Add a debug print if event buffer mmap fails * Add missing data.h into LIB_H headers, fix from Jiri Olsa. * libtraceevent updates from upstream trace-cmd repo, from Steven Rostedt. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-04tools lib traceevent: Add pevent_print_func_field() helper functionSteven Rostedt2-0/+46
Add the pevent_print_func_field() that will look up a field that is expected to be a function pointer, and it will print the function name and offset of the address given by the field. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20131101215501.869542711@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-04tools lib traceevent: Add flags NOHANDLE and PRINTRAW to individual eventsSteven Rostedt2-2/+4
Add the flags EVENT_FL_NOHANDLE and EVENT_FL_PRINTRAW to the event flags to have the event either ignore the register handler or to ignore the handler and also print the raw format respectively. This allows a tool to force a raw format or non handle for an event. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20131101215501.655258742@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-04tools lib traceevent: Check for spaces in character arraySteven Rostedt (Red Hat)1-1/+1
Currently when using the raw format for fields, when looking at a character array, to determine if it is a string or not, we make sure all characters are "isprint()". If not, then we consider it a numeric array, and print the hex numbers of the characters instead. But it seems that '\n' fails the isprint() check! Add isspace() to the check as well, such that if all characters pass isprint() or isspace() it will assume the character array is a string. Reported-by: Xenia Ragiadakou <burzalodowa@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Xenia Ragiadakou <burzalodowa@gmail.com> Link: http://lkml.kernel.org/r/20131101215501.465091682@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-04tools lib traceevent: Have bprintk output the same as the kernel doesSteven Rostedt (Red Hat)1-4/+4
The trace_bprintk() in the kernel looks like: ring_buffer_producer_thread: Missed: 0 ring_buffer_producer_thread: Hit: 62174350 ring_buffer_producer_thread: Entries per millisec: 6296 ring_buffer_producer_thread: 158 ns per entry ring_buffer_producer_thread: Sleeping for 10 secs ring_buffer_producer_thread: Starting ring buffer hammer ring_buffer_producer_thread: End ring buffer hammer But the current output looks like this: ring_buffer_producer_thread : Time: 9407018 (usecs) ring_buffer_producer_thread : Overruns: 43285485 ring_buffer_producer_thread : Read: 4405365 (by events) ring_buffer_producer_thread : Entries: 0 ring_buffer_producer_thread : Total: 47690850 ring_buffer_producer_thread : Missed: 0 ring_buffer_producer_thread : Hit: 47690850 Remove the space between the function and the colon. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20131101215501.272654481@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-04tools lib traceevent: Handle __print_hex(__get_dynamic_array(fieldname), len)Howard Cochran1-8/+16
The kernel has a few events with a format similar to this excerpt: field:unsigned int len; offset:12; size:4; signed:0; field:__data_loc unsigned char[] data_array; offset:16; size:4; signed:0; print fmt: "%s", __print_hex(__get_dynamic_array(data_array), REC->len) trace-cmd could already parse that arg correctly, but print_str_arg() was unable to handle the first parameter being a dynamic array. (It just printed a "field not found" warning). Teach print_str_arg's PRINT_HEX case to handle the nested PRINT_DYNAMIC_ARRAY correctly. The output now matches the kernel's own formatting for this case. Signed-off-by: Howard Cochran <hcochran@lexmark.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1381503349-12271-1-git-send-email-hcochran@lexmark.com [ Removed "polish compare", we don't do that here ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>