summaryrefslogtreecommitdiffstats
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2019-03-19perf config: Fix an error in the config template documentationChangbin Du1-1/+1
The option 'sort-order' should be 'sort_order'. Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 893c5c798be9 ("perf config: Show default report configuration in example and docs") Link: http://lkml.kernel.org/r/20190316080556.3075-5-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf tools: Fix errors under optimization level '-Og'Changbin Du3-3/+3
Optimization level '-Og' offers a reasonable level of optimization while maintaining fast compilation and a good debugging experience. This patch tries to make it work. $ make DEBUG=1 EXTRA_CFLAGS='-Og' bench/epoll-ctl.c: In function ‘do_threads’: bench/epoll-ctl.c:274:9: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized] return ret; ^~~ ... Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20190316080556.3075-4-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf list: Don't forget to drop the reference to the allocated thread_mapChangbin Du1-0/+1
Detected via gcc's ASan: Direct leak of 2048 byte(s) in 64 object(s) allocated from: 6 #0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370) 7 #1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43 8 #2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85 9 #3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250 10 #4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382 11 #5 0x556b0f0e3231 in print_events util/parse-events.c:2514 12 #6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58 13 #7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 14 #8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 15 #9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 16 #10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520 17 #11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 89896051f8da ("perf tools: Do not put a variable sized type not at the end of a struct") Link: http://lkml.kernel.org/r/20190316080556.3075-3-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf tools: Add doc about how to build perf with Asan and UBSanChangbin Du1-0/+24
AddressSanitizer (or ASan) and UndefinedBehaviorSanitizer (or UBSan) are very useful tools to detect program bugs: - AddressSanitizer (or ASan) is a GCC feature that detects memory corruption bugs such as buffer overflows and memory leaks. - UndefinedBehaviorSanitizer (or UBSan) is a fast undefined behavior detector supported by GCC. UBSan detects undefined behaviors of programs at runtime. This patch adds a document about how to use them on perf. Later patches will fix some of the issues disclosed by them. Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20190316080556.3075-2-changbin.du@gmail.com [ Make some changes based on comments made by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf vendor events: Remove P8 HW events which are not supportedMamatha Inamdar1-594/+0
This patch is to remove following hardware events from JSON file which are not supported on POWER8. pm_co_disp_fail pm_co_tm_sc_footprint pm_iside_disp pm_iside_disp_fail pm_iside_disp_fail_other pm_iside_mru_touch pm_l2_castout_mod pm_l2_castout_shr pm_l2_dc_inv pm_l2_disp_all_l2miss pm_l2_grp_guess_correct pm_l2_grp_guess_wrong pm_l2_ic_inv pm_l2_inst pm_l2_inst_miss pm_l2_ld pm_l2_ld_disp pm_l2_ld_hit pm_l2_ld_miss pm_l2_loc_guess_correct pm_l2_loc_guess_wrong pm_l2_rcld_disp pm_l2_rcld_disp_fail_addr pm_l2_rcld_disp_fail_other pm_l2_rcst_disp pm_l2_rcst_disp_fail_addr pm_l2_rcst_disp_fail_other pm_l2_rc_st_done pm_l2_rty_ld pm_l2_sn_m_rd_done pm_l2_sn_m_wr_done pm_l2_sn_sx_i_done pm_l2_st_disp pm_l2_st_hit pm_l2_sys_guess_correct pm_l2_sys_guess_wrong pm_l2_sys_pump pm_l3_ci_hit pm_l3_ci_miss pm_l3_cinj pm_l3_co pm_l3_co_lco pm_l3_grp_guess_correct pm_l3_grp_guess_wrong_high pm_l3_grp_guess_wrong_low pm_l3_hit pm_l3_l2_co_hit pm_l3_l2_co_miss pm_l3_lat_ci_hit pm_l3_lat_ci_miss pm_l3_ld_hit pm_l3_ld_miss pm_l3_loc_guess_correct pm_l3_loc_guess_wrong pm_l3_miss pm_l3_p0_co_l31 pm_l3_p0_co_mem pm_l3_p0_co_rty pm_l3_p0_grp_pump pm_l3_p0_lco_data pm_l3_p0_lco_no_data pm_l3_p0_lco_rty pm_l3_p0_node_pump pm_l3_p0_pf_rty pm_l3_p0_sn_hit pm_l3_p0_sn_inv pm_l3_p0_sn_miss pm_l3_p0_sys_pump pm_l3_p1_co_l31 pm_l3_p1_co_mem pm_l3_p1_co_rty pm_l3_p1_grp_pump pm_l3_p1_lco_data pm_l3_p1_lco_no_data pm_l3_p1_lco_rty pm_l3_p1_node_pump pm_l3_p1_pf_rty pm_l3_p1_sn_hit pm_l3_p1_sn_inv pm_l3_p1_sn_miss pm_l3_p1_sys_pump pm_l3_pf_hit_l3 pm_l3_sys_guess_correct pm_l3_sys_guess_wrong pm_l3_trans_pf pm_l3_wi0_busy pm_l3_wi_usage pm_non_tm_rst_sc pm_rd_clearing_sc pm_rd_forming_sc pm_rd_hit_pf pm_snp_tm_hit_m pm_snp_tm_hit_t pm_st_caused_fail pm_tm_cam_overflow pm_tm_cap_overflow pm_tm_fav_caused_fail pm_tm_ld_caused_fail pm_tm_ld_conf pm_tm_rst_sc pm_tm_sc_co pm_tm_st_caused_fail pm_tm_st_conf Signed-off-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com> Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Fixes: 2a81fa3bb5ed ("perf vendor events: Add power8 PMU events") Link: http://lkml.kernel.org/r/154953186583.11022.14819560028300370163.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf stat: Improve scalingAndi Kleen1-1/+1
The multiplexing scaling in perf stat mysteriously adds 0.5 to the value. This dates back to the original perf tool. Other scaling code doesn't use that strange convention. Remove the extra 0.5. Before: $ perf stat -e 'cycles,cycles,cycles,cycles,cycles,cycles' grep -rq foo Performance counter stats for 'grep -rq foo': 6,403,580 cycles (81.62%) 6,404,341 cycles (81.64%) 6,402,983 cycles (81.62%) 6,399,941 cycles (81.63%) 6,399,451 cycles (81.62%) 6,436,105 cycles (91.87%) 0.005843799 seconds time elapsed 0.002905000 seconds user 0.002902000 seconds sys After: $ perf stat -e 'cycles,cycles,cycles,cycles,cycles,cycles' grep -rq foo Performance counter stats for 'grep -rq foo': 6,422,704 cycles (81.68%) 6,401,842 cycles (81.68%) 6,398,432 cycles (81.68%) 6,397,098 cycles (81.68%) 6,396,074 cycles (81.67%) 6,434,980 cycles (91.62%) 0.005884437 seconds time elapsed 0.003580000 seconds user 0.002356000 seconds sys Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-10-andi@firstfloor.org Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf stat: Fix --no-scaleAndi Kleen4-14/+9
The -c option to enable multiplex scaling has been useless for quite some time because scaling is default. It's only useful as --no-scale to disable scaling. But the non scaling code path has bitrotted and doesn't print anything because perf output code relies on value run/ena information. Also even when we don't want to scale a value it's still useful to show its multiplex percentage. This patch: - Fixes help and documentation to show --no-scale instead of -c - Removes -c, only keeps the long option because -c doesn't support negatives. - Enables running/enabled even with --no-scale - And fixes some other problems in the no-scale output. Before: $ perf stat --no-scale -e cycles true Performance counter stats for 'true': <not counted> cycles 0.000984154 seconds time elapsed After: $ ./perf stat --no-scale -e cycles true Performance counter stats for 'true': 706,070 cycles 0.001219821 seconds time elapsed Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> LPU-Reference: 20190314225002.30108-9-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-xggjvwcdaj2aqy8ib3i4b1g6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf script: Support relative timeAndi Kleen2-2/+19
When comparing time stamps in 'perf script' traces it can be annoying to work with the full perf time stamps. Add a --reltime option that displays time stamps relative to the trace start to make it easier to read the traces. Note: not currently supported for --time. Report an error in this case. Before: % perf script swapper 0 [000] 245402.891216: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms]) swapper 0 [000] 245402.891223: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms]) swapper 0 [000] 245402.891227: 5 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms]) swapper 0 [000] 245402.891231: 41 cycles:ppp: ffffffffa0068816 native_write_msr+0x6 ([kernel.kallsyms]) swapper 0 [000] 245402.891235: 355 cycles:ppp: ffffffffa000dd51 intel_bts_enable_local+0x21 ([kernel.kallsyms]) swapper 0 [000] 245402.891239: 3084 cycles:ppp: ffffffffa0a0150a end_repeat_nmi+0x48 ([kernel.kallsyms]) After: % perf script --reltime swapper 0 [000] 0.000000: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms]) swapper 0 [000] 0.000006: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms]) swapper 0 [000] 0.000010: 5 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms]) swapper 0 [000] 0.000014: 41 cycles:ppp: ffffffffa0068816 native_write_msr+0x6 ([kernel.kallsyms]) swapper 0 [000] 0.000018: 355 cycles:ppp: ffffffffa000dd51 intel_bts_enable_local+0x21 ([kernel.kallsyms]) swapper 0 [000] 0.000022: 3084 cycles:ppp: ffffffffa0a0150a end_repeat_nmi+0x48 ([kernel.kallsyms]) Committer notes: Do not use 'time' as the name of a variable, as this breaks the build on older glibcs: cc1: warnings being treated as errors builtin-script.c: In function 'perf_sample__fprintf_start': builtin-script.c:691: warning: declaration of 'time' shadows a global declaration /usr/include/time.h:187: warning: shadowed declaration is here Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-8-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-bpahyi6pr9r399mvihu65fvc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf report: Indicate JITed code better in reportAndi Kleen1-18/+24
Print [TID] tid %d instead of the crypted /tmp/perf-%d.map default. % cat >loop.java public class loop { public static void main(String[] args) { for (;;); } } ^D % javac loop.java % perf record java loop ^C Before: % perf report --stdio ... 56.09% java perf-34724.map [.] 0x00007fd5bd021896 19.12% java perf-34724.map [.] 0x00007fd5bd021887 9.79% java perf-34724.map [.] 0x00007fd5bd021783 8.97% java perf-34724.map [.] 0x00007fd5bd02175b After: % perf report --stdio ... 56.09% java [JIT] tid 34724 [.] 0x00007fd5bd021896 19.12% java [JIT] tid 34724 [.] 0x00007fd5bd021887 9.79% java [JIT] tid 34724 [.] 0x00007fd5bd021783 8.97% java [JIT] tid 34724 [.] 0x00007fd5bd02175b Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> LPU-Reference: 20190314225002.30108-7-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-r17l6py9g0sezb7mi1f286gt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf report: Show all sort keys in help outputAndi Kleen3-3/+56
Show all the supported sort keys in the command line help output, so that it's not needed to refer to the manpage. Before: % perf report -h ... -s, --sort <key[,key2...]> sort by key(s): pid, comm, dso, symbol, parent, cpu, srcline, ... Please refer the man page for the complete list. After: % perf report -h ... -s, --sort <key[,key2...]> sort by key(s): overhead overhead_sys overhead_us overhead_guest_sys overhead_guest_us overhead_children sample period pid comm dso symbol parent cpu ... Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-5-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-9r3uz2ch4izoi1uln3f889co@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf record: Clarify help for --switch-outputAndi Kleen1-2/+2
The help description for --switch-output looks like there are multiple comma separated fields. But it's actually a choice of different options. Make it clear and less confusing. Before: % perf record -h ... --switch-output[=<signal,size,time>] Switch output when receive SIGUSR2 or cross size,time threshold After: % perf record -h ... --switch-output[=<signal or size[BKMG] or time[smhd]>] Switch output when receiving SIGUSR2 (signal) or cross a size or time threshold Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-4-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-9yecyuha04nyg8toyd1b2pgi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf record: Allow to limit number of reported perf.data filesAndi Kleen4-8/+39
When doing long term recording and waiting for some event to snapshot on, we often only care about the last minute or so. The --switch-output command line option supports rotating the perf.data file when the size exceeds a threshold. But the disk would still be filled with unnecessary old files. Add a new option to only keep a number of rotated files, so that the disk space usage can be limited. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-3-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-y5u2lik0ragt4vlktz6qc9ks@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19perf list: Filter metrics tooAndi Kleen1-1/+1
When a filter is specified on the command line, filter the metrics too. Before: % perf list foo List of pre-defined events (to be used in -e): Metric Groups: DSB: DSB_Coverage [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)] ... more metrics ... After: % perf list foo List of pre-defined events (to be used in -e): Metric Groups: Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-1-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-1y8oi2s8c4jhjtykgs5zvda1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf tools report: Add custom scripts to script menuAndi Kleen2-0/+28
Add a way to define custom scripts through ~/.perfconfig, which are then added to the scripts menu. The scripts get the same arguments as 'perf script', in particular -i, --cpu, --tid. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-10-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf ui browser: Fix ui popup argv browser for many entriesAndi Kleen2-6/+7
Fix the argv ui browser code to correctly display more entries than fit on the screen without crashing. The problem was some type confusion with pointer types in the ->seek function. Do the argv arithmetic correctly with char ** pointers. Also add some asserts to find overruns and limit the display function correctly. Then finally remove a workaround for this in the res sample browser. Committer testing: 1) Resize the x terminal to have just some 5 lines 2) Use 'perf report --samples 1' to activate the sample browser options in the menu 3) Press ENTER, this will cause the crash: # perf report --samples 1 perf: Segmentation fault -------- backtrace -------- perf[0x5a514a] /lib64/libc.so.6(+0x385bf)[0x7f27281b55bf] /lib64/libc.so.6(+0x161a67)[0x7f27282dea67] /lib64/libslang.so.2(SLsmg_write_wrapped_string+0x82)[0x7f272874a0b2] perf(ui_browser__argv_refresh+0x77)[0x5939a7] perf[0x5924cc] perf(ui_browser__run+0x39)[0x593449] perf(ui__popup_menu+0x83)[0x5a5263] perf[0x59f421] perf(perf_evlist__tui_browse_hists+0x3a0)[0x5a3780] perf(cmd_report+0x2746)[0x447136] perf[0x4a95fe] perf(main+0x61c)[0x42dc6c] /lib64/libc.so.6(__libc_start_main+0xf2)[0x7f27281a1412] perf(_start+0x2d)[0x42de9d] # After applying this patch no crash takes place in such situation. Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-12-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf script: Add array bound checking to list_scriptsAndi Kleen3-4/+10
Don't overflow array when the scripts directory is too large, or the script file name is too long. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-11-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf tools: Add some new tips describing the new optionsAndi Kleen1-0/+7
Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-9-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Implement browsing of individual samplesAndi Kleen12-1/+226
Now 'perf report' can show whole time periods with 'perf script', but the user still has to find individual samples of interest manually. It would be expensive and complicated to search for the right samples in the whole perf file. Typically users only need to look at a small number of samples for useful analysis. Also the full scripts tend to show samples of all CPUs and all threads mixed up, which can be very confusing on larger systems. Add a new --samples option to save a small random number of samples per hist entry. Use a reservoir sample technique to select a representatve number of samples. Then allow browsing the samples using 'perf script' as part of the hist entry context menu. This automatically adds the right filters, so only the thread or cpu of the sample is displayed. Then we use less' search functionality to directly jump the to the time stamp of the selected sample. It uses different menus for assembler and source display. Assembler needs xed installed and source needs debuginfo. Currently it only supports as many samples as fit on the screen due to some limitations in the slang ui code. Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311174605.GA29294@tassilo.jf.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Support builtin perf script in scripts menuAndi Kleen4-40/+120
The scripts menu traditionally only showed custom perf scripts. Allow to run standard perf script with useful default options too. - Normal perf script - perf script with assembler (needs xed installed) - perf script with source code output (needs debuginfo) - perf script with custom arguments Then we automatically select the right options to display the information in the perf.data file. For example with -b display branch contexts. It's not easily possible to check for xed's existence in advance. perf script usually gives sensible error messages when it's not available. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-7-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Support running scripts for current time rangeAndi Kleen1-11/+72
When using the time sort key, add new context menus to run scripts for only the currently selected time range. Compute the correct range for the selection add pass it as the --time option to perf script. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-6-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Support time sort keyAndi Kleen5-0/+55
Add a time sort key to perf report to display samples for different time quantums separately. This allows easier analysis of workloads that change over time, and also will allow looking at the context of samples. % perf record ... % perf report --sort time,overhead,symbol --time-quantum 1ms --stdio ... 0.67% 277061.87300 [.] _dl_start 0.50% 277061.87300 [.] f1 0.50% 277061.87300 [.] f2 0.33% 277061.87300 [.] main 0.29% 277061.87300 [.] _dl_lookup_symbol_x 0.29% 277061.87300 [.] dl_main 0.29% 277061.87300 [.] do_lookup_x 0.17% 277061.87300 [.] _dl_debug_initialize 0.17% 277061.87300 [.] _dl_init_paths 0.08% 277061.87300 [.] check_match 0.04% 277061.87300 [.] _dl_count_modids 1.33% 277061.87400 [.] f1 1.33% 277061.87400 [.] f2 1.33% 277061.87400 [.] main 1.17% 277061.87500 [.] main 1.08% 277061.87500 [.] f1 1.08% 277061.87500 [.] f2 1.00% 277061.87600 [.] main 0.83% 277061.87600 [.] f1 0.83% 277061.87600 [.] f2 1.00% 277061.87700 [.] main Committer notes: Rename 'time' argument to hist_time() to htime to overcome this in older distros: cc1: warnings being treated as errors util/hist.c: In function 'hist_time': util/hist.c:251: error: declaration of 'time' shadows a global declaration /usr/include/time.h:186: error: shadowed declaration is here Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf script: Filter COMM/FORK/.. events by CPUAndi Kleen1-24/+47
The --cpu option only filtered samples. Filter other perf events, such as COMM, FORK, SWITCH by the CPU too. Reported-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf tools: Update x86's syscall_64.tbl, no change in tools/perf behaviourArnaldo Carvalho de Melo1-2/+4
To pick the changes in 7948450d4556 ("x86/x32: use time64 versions of sigtimedwait and recvmmsg"), that doesn't cause any change in behaviour in tools/perf/ as it deals just with the x32 entries. This silences this tools/perf build warning: Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl' diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-mqpvshayeqidlulx5qpioa59@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf script python: Add printdate function to SQL exportersTony Jones2-13/+19
Introduce a printdate function to eliminate the repetitive use of datetime.datetime.today() in the SQL exporting scripts. Signed-off-by: Tony Jones <tonyj@suse.de> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/20190309000518.2438-5-tonyj@suse.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf script python: Add Python3 support to export-to-sqlite.pyTony Jones1-9/+14
Support both Python2 and Python3 in the export-to-sqlite.py script The use of 'from __future__' implies the minimum supported Python2 version is now v2.6 Signed-off-by: Tony Jones <tonyj@suse.de> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/20190309000518.2438-4-tonyj@suse.de Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf script python: Add Python3 support to export-to-postgresql.pyTony Jones1-17/+41
Support both Python2 and Python3 in the export-to-postgresql.py script. The use of 'from __future__' implies the minimum supported Python2 version is now v2.6 Signed-off-by: Tony Jones <tonyj@suse.de> Link: http://lkml.kernel.org/r/20190309000518.2438-3-tonyj@suse.de Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf script python: Add Python3 support to exported-sql-viewer.pyTony Jones1-14/+28
Support both Python2 and Python3 in the exported-sql-viewer.py script. The use of 'from __future__' implies the minimum supported Python2 version is now v2.6 Signed-off-by: Tony Jones <tonyj@suse.de> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/20190309000518.2438-2-tonyj@suse.de Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Use less for scripts outputAndi Kleen1-113/+17
The UI viewer for scripts output has a lot of limitations: limited size, no search or save function, slow, and various other issues. Just use 'less' to display directly on the terminal instead. This won't work in GTK mode, but GTK doesn't support these context menus anyways. If that is ever done could use an terminal for the output. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Feng Tang <feng.tang@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190309055628.21617-8-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf session: Add process callback to reader objectJiri Olsa1-4/+19
Adding callback function to reader object so callers can process data in different ways. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf header: Add DIR_FORMAT feature to describe directory dataJiri Olsa5-3/+59
The data files layout is described by HEADER_DIR_FORMAT feature. Currently it holds only version number (1): uint64_t version; The current version holds only version value (1) means that data files: - Follow the 'data.*' name format. - Contain raw events data in standard perf format as read from kernel (and need to be sorted) Future versions are expected to describe different data files layout according to special needs. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf data: Make perf_data__size() work over directoryJiri Olsa2-5/+18
Make perf_data__size() return proper size for directory data, summing up all the individual file sizes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf data: Add perf_data__update_dir() functionJiri Olsa2-0/+21
Add perf_data__update_dir() to update the size for every file within the perf.data directory. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf data: Don't store auxtrace index for directory data fileJiri Olsa1-1/+1
We can't store the auxtrace index when we store into multiple files, because we keep only offset for it, not the file. The auxtrace data will be processed correctly in the 'pipe' mode. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf data: Support having perf.data stored as a directoryJiri Olsa3-1/+58
The caller needs to set 'struct perf_data::is_dir flag and the path will be treated as a directory. The 'struct perf_data::file' is initialized and open as 'path/header' file. Add a check to the direcory interface functions to check the is_dir flag. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-2-jolsa@kernel.org [ Be consistent on how to signal failure, i.e. use -1 and let users check errno ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf vendor events amd: perf PMU events for AMD Family 17hMartin Liška7-0/+829
Thi patch adds PMC events for AMD Family 17 CPUs as defined in [1]. It covers events described in section: 2.1.13. Regex pattern in mapfile.csv covers all CPUs of the family. [1] https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf Signed-off-by: Martin Liška <mliska@suse.cz> Acked-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jon Grimm <jon.grimm@amd.com> Cc: Martin Jambor <mjambor@suse.cz> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/r/d65873ca-e402-b198-4fe9-8c4af81258c8@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf probe: Fix getting the kernel mapAdrian Hunter1-2/+4
Since commit 4d99e4136580 ("perf machine: Workaround missing maps for x86 PTI entry trampolines"), perf tools has been creating more than one kernel map, however 'perf probe' assumed there could be only one. Fix by using machine__kernel_map() to get the main kernel map. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Cc: Xu Yu <xuyu@linux.alibaba.com> Fixes: 4d99e4136580 ("perf machine: Workaround missing maps for x86 PTI entry trampolines") Fixes: d83212d5dd67 ("kallsyms, x86: Export addresses of PTI entry trampolines") Link: http://lkml.kernel.org/r/2ed432de-e904-85d2-5c36-5897ddc5b23b@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Parse time quantumAndi Kleen4-0/+49
Many workloads change over time. 'perf report' currently aggregates the whole time range reported in perf.data. This patch adds an option for a time quantum to quantisize the perf.data over time. This just adds the option, will be used in follow on patches for a time sort key. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-6-andi@firstfloor.org [ Use NSEC_PER_[MU]SEC ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf time-utils: Add utility function to print time stamps in nanosecondsAndi Kleen2-0/+9
Add a utility function to print nanosecond timestamps. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-11-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf report: Support output in nanosecondsAndi Kleen5-6/+11
Upcoming changes add timestamp output in perf report. Add a --ns argument similar to perf script to support nanoseconds resolution when needed. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11perf script: Support insn output for normal samplesAndi Kleen4-1/+59
perf script -F +insn was only working for PT traces because the PT instruction decoder was filling in the insn/insn_len sample attributes. Support it for non PT samples too on x86 using the existing x86 instruction decoder. This adds some extra checking to ensure that we don't try to decode instructions when using perf.data from a different architecture. % perf record -a sleep 1 % perf script -F ip,sym,insn --xed ffffffff811704c9 remote_function movl %eax, 0x18(%rbx) ffffffff8100bb50 intel_bts_enable_local retq ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi) ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi) ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi) ffffffff810f1f79 generic_exec_single xor %eax, %eax ffffffff811704c9 remote_function movl %eax, 0x18(%rbx) ffffffff8100bb34 intel_bts_enable_local movl 0x2000(%rax), %edx ffffffff81048610 native_apic_mem_write mov %edi, %edi ... Committer testing: Before: # perf script -F ip,sym,insn --xed | head -5 ffffffffa4068804 native_write_msr addb %al, (%rax) ffffffffa4068804 native_write_msr addb %al, (%rax) ffffffffa4068804 native_write_msr addb %al, (%rax) ffffffffa4068806 native_write_msr addb %al, (%rax) ffffffffa4068806 native_write_msr addb %al, (%rax) # perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)" # After: # perf script -F ip,sym,insn --xed | head -5 ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) # perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)" | head -5 ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) # More examples: # perf script -F ip,sym,insn --xed | grep -v native_write_msr | head ffffffffa416b90e tick_check_broadcast_expired btq %rax, 0x1a5f42a(%rip) ffffffffa4956bd0 nmi_cpu_backtrace pushq %r13 ffffffffa415b95e __hrtimer_next_event_base movq 0x18(%rax), %rdx ffffffffa4956bf3 nmi_cpu_backtrace popq %r12 ffffffffa4171d5c smp_call_function_single pause ffffffffa4956bdd nmi_cpu_backtrace mov %ebp, %r12d ffffffffa4797e4d menu_select cmp $0x190, %rax ffffffffa4171d5c smp_call_function_single pause ffffffffa405a7d8 nmi_cpu_backtrace_handler callq 0xffffffffa4956bd0 ffffffffa4797f7a menu_select shr $0x3, %rax # Which matches the annotate output modulo resolving callqs: # perf annotate --stdio2 nmi_cpu_backtrace_handler Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 35908, [percent: local period] nmi_cpu_backtrace_handler() /lib/modules/5.0.0+/build/vmlinux Percent Disassembly of section .text: ffffffff8105a7d0 <nmi_cpu_backtrace_handler>: nmi_cpu_backtrace_handler(): nmi_trigger_cpumask_backtrace(mask, exclude_self, nmi_raise_cpu_backtrace); } static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs) { 24.45 → callq __fentry__ if (nmi_cpu_backtrace(regs)) mov %rsi,%rdi 75.55 → callq nmi_cpu_backtrace return NMI_HANDLED; movzbl %al,%eax return NMI_DONE; } ← retq # # perf annotate --stdio2 __hrtimer_next_event_base Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 767977, [percent: local period] __hrtimer_next_event_base() /lib/modules/5.0.0+/build/vmlinux Percent Disassembly of section .text: ffffffff8115b910 <__hrtimer_next_event_base>: __hrtimer_next_event_base(): static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base, const struct hrtimer *exclude, unsigned int active, ktime_t expires_next) { → callq __fentry__ <SNIP> 4a: add $0x1,%r14 77.31 mov 0x18(%rax),%rdx shl $0x6,%r14 sub 0x38(%rbx,%r14,1),%rdx if (expires < expires_next) { cmp %r12,%rdx ↓ jge 68 <SNIP> Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-3-andi@firstfloor.org [ Converted fetch_exe() to use the name it ended up having when merged: thread__memcpy() ] [ archinsn.c needs the instruction decoder that is only build when CONFIG_AUXTRACE=y, fix that ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-10Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds49-446/+974
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Thomas Gleixner: "Perf updates and fixes: Kernel: - Handle events which have the bpf_event attribute set as side band events as they carry information about BPF programs. - Add missing switch-case fall-through comments Libraries: - Fix leaks and double frees in error code paths. - Prevent buffer overflows in libtraceevent Tools: - Improvements in handling Intel BT/PTS - Add BTF ELF markers to perf trace BPF programs to improve output - Support --time, --cpu, --pid and --tid filters for perf diff - Calculate the column width in perf annotate as the hardcoded 6 characters for the instruction are not sufficient - Small fixes all over the place" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) perf/core: Mark expected switch fall-through perf/x86/intel/uncore: Fix client IMC events return huge result perf/ring_buffer: Use high order allocations for AUX buffers optimistically perf data: Force perf_data__open|close zero data->file.path perf session: Fix double free in perf_data__close perf evsel: Probe for precise_ip with simple attr perf tools: Read and store caps/max_precise in perf_pmu perf hist: Fix memory leak of srcline perf hist: Add error path into hist_entry__init perf c2c: Fix c2c report for empty numa node perf script python: Add Python3 support to intel-pt-events.py perf script python: Add Python3 support to event_analyzing_sample.py perf script python: add Python3 support to check-perf-trace.py perf script python: Add Python3 support to futex-contention.py perf script python: Remove mixed indentation perf diff: Support --pid/--tid filter options perf diff: Support --cpu filter option perf diff: Support --time filter option perf thread: Generalize function to copy from thread addr space from intel-bts code perf annotate: Calculate the max instruction name, align column to that ...
2019-03-09Merge tag 'perf-core-for-mingo-5.1-20190307' of ↵Ingo Molnar49-446/+974
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core changes from Arnaldo Carvalho de Melo: perf bpf: Arnaldo Carvalho de Melo: - Automatically add BTF ELF markers to 'perf trace' BPF programs, so that tools such as 'bpftool map dump' can pretty print map keys and values. perf c2c: Jiri Olsa: - Fix report for empty NUMA node. perf diff: Jin Yao: - Support --time, --cpu, --pid and --tid filter options. perf probe: Arnaldo Carvalho de Melo: - Clarify error message about not finding kernel modules debuginfo. perf record: Jiri Olsa: - Fixup probing for max attr.precise_ip. perf trace: Arnaldo Carvalho de Melo: - Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic. perf annotate: Arnaldo Carvalho de Melo: - Calculate the max instruction name, align column to that, removing the hardcoded max 6 chars and cope with instructions with names longer than that, such as vpmovmskb, vpcmpeqb, etc. kernel: Song Liu: - Consider events with attr.bpf_event set as side-band. Gustavo A. R. Silva: - Mark expected switch fall-through in perf_event_parse_addr_filter(). Libraries: Jiri Olsa: - Fix leaks and double frees on error paths. libtraceevent: Tony Jones: - Fix buffer overflow in arg_eval(). python scripting: Tony Jones: - More python3 fixes. Trivial: Yang Wei: - Remove needless extra semicolon in clang C++ glue code. Intel PT/BTS: Adrian Hunter: - Improve auxtrace address filter error message when there is no DSO. - Fix divide by zero when TSC is not available. - Further improvements to the export to sqlite/posgresql python scripts and to the GUI sqlviewer, exporting 'parent_id' so that we have enable the creation of call trees. Andi Kleen: - Generalize function to copy from thread addr space from intel-bts code. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-03-06perf data: Force perf_data__open|close zero data->file.pathJiri Olsa1-2/+2
Making sure the data->file.path is zeroed on perf_data__open error path and in perf_data__close, so we don't double free it in case someone call it twice. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf session: Fix double free in perf_data__closeJiri Olsa1-3/+1
We can't call perf_data__close and subsequently perf_session__delete, because it will call perf_data__close again and cause double free for data->file.path. $ perf report -i . incompatible file format (rerun with -v to learn more) free(): double free detected in tcache 2 Aborted (core dumped) In fact we don't need to call perf_data__close at all, because at the time the got out_close is reached, session->data is already initialized, so the perf_data__close call will be triggered from perf_session__delete. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Fixes: 2d4f27999b88 ("perf data: Add global path holder") Link: http://lkml.kernel.org/r/20190305152536.21035-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf evsel: Probe for precise_ip with simple attrJiri Olsa2-13/+20
Currently we probe for precise_ip with user specified perf_event_attr, which might fail because of unsupported kernel features, which would get disabled during the open time anyway. Switching the probe to take place on simple hw cycles, so the following record sets proper precise_ip: # perf record -e cycles:P ls # perf evlist -v cycles:P: size: 112, ... precise_ip: 3, ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf tools: Read and store caps/max_precise in perf_pmuJiri Olsa2-0/+15
Read the caps/max_precise value and store it in struct perf_pmu to be used when setting the maximum precise_ip field in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf hist: Fix memory leak of srclineJiri Olsa1-2/+12
We can't allocate he->srcline unconditionaly, only when new hist_entry is created. Moving he->srcline allocation into hist_entry__init function. Original-patch-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf hist: Add error path into hist_entry__initJiri Olsa1-20/+19
Adding error path into hist_entry__init to unify error handling, so every new member does not need to free everything else. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: nageswara r sastry <nasastry@in.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf c2c: Fix c2c report for empty numa nodeJiri Olsa1-2/+6
Ravi Bangoria reported that we fail with an empty NUMA node with the following message: $ lscpu NUMA node0 CPU(s): NUMA node1 CPU(s): 0-4 $ sudo ./perf c2c report node/cpu topology bugFailed setup nodes Fix this by detecting the empty node and keeping its CPU set empty. Reported-by: Nageswara R Sastry <nasastry@in.ibm.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190305152536.21035-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06perf script python: Add Python3 support to intel-pt-events.pyTony Jones1-13/+19
Support both Python2 and Python3 in the intel-pt-events.py script There may be differences in the ordering of output lines due to differences in dictionary ordering etc. However the format within lines should be unchanged. The use of 'from __future__' implies the minimum supported Python2 version is now v2.6 Signed-off-by: Tony Jones <tonyj@suse.de> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/fd26acf9-0c0f-717f-9664-a3c33043ce19@suse.de Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>