summaryrefslogtreecommitdiffstats
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2020-10-14perf intel-pt: Improve PT documentation slightlyAndi Kleen1-0/+30
Document the higher level --insn-trace etc. perf script options. Include the howto how to build xed into the manpage Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20201014035346.4772-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14perf tools: Add support for exclusive groups/eventsAndi Kleen4-4/+68
Peter suggested that using the exclusive mode in perf could avoid some problems with bad scheduling of groups. Exclusive is implemented in the kernel, but wasn't exposed by the perf tool, so hard to use without custom low level API users. Add support for marking groups or events with :e for exclusive in the perf tool. The implementation is basically the same as the existing pinned attribute. Committer testing: # perf test "parse event" 6: Parse event definition strings : Ok # perf test -v "parse event" |& grep :u*e running test 56 'instructions:uep' running test 57 '{cycles,cache-misses,branch-misses}:e' # # # grep "model name" -m1 /proc/cpuinfo model name : AMD Ryzen 9 3900X 12-Core Processor # # perf stat -a -e '{cycles,cache-misses,branch-misses}:e' sleep 1 Performance counter stats for 'system wide': <not counted> cycles (0.00%) <not counted> cache-misses (0.00%) <not counted> branch-misses (0.00%) 1.001269893 seconds time elapsed Some events weren't counted. Try disabling the NMI watchdog: echo 0 > /proc/sys/kernel/nmi_watchdog perf stat ... echo 1 > /proc/sys/kernel/nmi_watchdog # echo 0 > /proc/sys/kernel/nmi_watchdog # perf stat -a -e '{cycles,cache-misses,branch-misses}:e' sleep 1 Performance counter stats for 'system wide': 1,298,663,141 cycles 30,962,215 cache-misses 5,325,150 branch-misses 1.001474934 seconds time elapsed # # The output for asking for precise events on AMD needs to improve, it # supposedly works only for system wide or per CPU # # perf stat -a -e '{cycles,cache-misses,branch-misses}:uep' sleep 1 Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles). /bin/dmesg | grep -i perf may provide additional information. # perf stat -a -e '{cycles,cache-misses,branch-misses}:ue' sleep 1 Performance counter stats for 'system wide': 746,363,126 cycles 16,881,611 cache-misses 2,871,259 branch-misses 1.001636066 seconds time elapsed # Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20201014144255.22699-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14perf test: Add build id shell testJiri Olsa1-0/+101
Add a test for the build id cache that adds a binary with sha1 and md5 build ids and verifies it's added properly. The test updates build id cache with 'perf record' and 'perf buildid-cache -a'. Committer testing: # perf test "build id" 82: build id cache operations : Ok # # perf test -v "build id" 82: build id cache operations : --- start --- test child forked, pid 447218 test binaries: /tmp/perf.ex.SHA1.B8I /tmp/perf.ex.MD5.7Nv Adding d1abc1eb7568358cf23c959566f23462461834d1 /tmp/perf.ex.SHA1.B8I: Ok build id: d1abc1eb7568358cf23c959566f23462461834d1 link: /tmp/perf.debug.sS2/.build-id/d1/abc1eb7568358cf23c959566f23462461834d1 file: /tmp/perf.debug.sS2/.build-id/d1/../../tmp/perf.ex.SHA1.B8I/d1abc1eb7568358cf23c959566f23462461834d1/elf OK for /tmp/perf.ex.SHA1.B8I Adding a50e350e97c43b4708d09bcd85ebfff7 /tmp/perf.ex.MD5.7Nv: Ok build id: a50e350e97c43b4708d09bcd85ebfff7 link: /tmp/perf.debug.IuW/.build-id/a5/0e350e97c43b4708d09bcd85ebfff7 file: /tmp/perf.debug.IuW/.build-id/a5/../../tmp/perf.ex.MD5.7Nv/a50e350e97c43b4708d09bcd85ebfff7/elf OK for /tmp/perf.ex.MD5.7Nv [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.034 MB /tmp/perf.data.xrH ] build id: d1abc1eb7568358cf23c959566f23462461834d1 link: /tmp/perf.debug.eGR/.build-id/d1/abc1eb7568358cf23c959566f23462461834d1 file: /tmp/perf.debug.eGR/.build-id/d1/../../tmp/perf.ex.SHA1.B8I/d1abc1eb7568358cf23c959566f23462461834d1/elf OK for /tmp/perf.ex.SHA1.B8I [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.034 MB /tmp/perf.data.cbE ] build id: a50e350e97c43b4708d09bcd85ebfff7 link: /tmp/perf.debug.82t/.build-id/a5/0e350e97c43b4708d09bcd85ebfff7 file: /tmp/perf.debug.82t/.build-id/a5/../../tmp/perf.ex.MD5.7Nv/a50e350e97c43b4708d09bcd85ebfff7/elf OK for /tmp/perf.ex.MD5.7Nv test child finished with 0 ---- end ---- build id cache operations: Ok # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20201013192441.1299447-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14perf tools: Align buildid list output for short build idsJiri Olsa3-4/+5
With shorter md5 build ids we need to align their paths properly with other build ids: $ perf buildid-list 17f4e448cc746582ea1881528deb549f7fdb3fd5 [kernel.kallsyms] a50e350e97c43b4708d09bcd85ebfff7 .../tools/perf/buildid-ex-md5 1805c738c8f3ec0f47b7ea09080c28f34d18a82b /usr/lib64/ld-2.31.so $ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20201013192441.1299447-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14perf tools: Add size to 'struct perf_record_header_build_id'Jiri Olsa3-7/+23
We do not store size with build ids in perf data, but there's enough space to do it. Adding misc bit PERF_RECORD_MISC_BUILD_ID_SIZE to mark build id event with size. With this fix the dso with md5 build id will have correct build id data and will be usable for debuginfod processing if needed (coming in following patches). Committer notes: Use %zu with size_t to fix this error on 32-bit arches: util/header.c: In function '__event_process_build_id': util/header.c:2105:3: error: format '%lu' expects argument of type 'long unsigned int', but argument 6 has type 'size_t' [-Werror=format=] pr_debug("build id event received for %s: %s [%lu]\n", ^ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20201013192441.1299447-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14perf tools: Pass build_id object to dso__build_id_equal()Jiri Olsa4-6/+11
Passing build_id object to dso__build_id_equal(), so we can properly check build id with different size than sha1. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20201013192441.1299447-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14perf tools: Pass build_id object to dso__set_build_id()Jiri Olsa5-6/+8
Passing build_id object to dso__set_build_id(), so it's easier to initialize dos's build id object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20201013192441.1299447-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14perf tools: Pass build_id object to build_id__sprintf()Jiri Olsa12-35/+43
Passing build_id object to build_id__sprintf function, so it can operate with the proper size of build id. This will create proper md5 build id readable names, like following: a50e350e97c43b4708d09bcd85ebfff7 instead of: a50e350e97c43b4708d09bcd85ebfff700000000 Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20201013192441.1299447-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14perf tools: Pass build id object to sysfs__read_build_id()Jiri Olsa6-23/+16
Passing build id object to sysfs__read_build_id function, so it can populate the size of the build_id object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20201013192441.1299447-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14perf tools: Pass build_id object to filename__read_build_id()Jiri Olsa11-57/+60
Pass a build_id object to filename__read_build_id function, so it can populate the size of the build_id object. Changing filename__read_build_id() code for both ELF/non-ELF code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20201013192441.1299447-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14perf tools: Use build_id object in dsoJiri Olsa15-29/+34
Replace build_id byte array with struct build_id object and all the code that references it. The objective is to carry size together with build id array, so it's better to keep both together. This is preparatory change for following patches, and there's no functional change. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20201013192441.1299447-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf config: Export the perf_config_from_file() functionArnaldo Carvalho de Melo2-1/+3
We'll use it to ask for extra config files to be loaded, profile like stuff that will be used first to make 'perf trace' mimic 'strace' output via a 'perf strace' command that just sets up 'perf trace' output. At some point it'll be used for regression tests, where we'll run some simple commands like: perf strace ls > perf-strace.output strace ls > strace.output And then do some mutable syscall arg aware diff like tool to deal with arguments for things like mmap, that change at each execution, to be first ignored and then properly tracked when used accoss multiple syscalls. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf python: Autodetect python3 binaryJames Clark1-6/+9
Some distros don't come with python2 and only have python3 available. This causes the "'import perf' in python" self test to fail. This change adds python3 to the list of possible python versions that are autodetected but maintains the priorities for 'python2' and 'python' detection. Python3 has the lowest priority. Committer notes: On a fedora system without python2 packages the 'perf test python' continues to work: # python2 bash: python2: command not found... Similar command is: 'python' # rpm -qa | grep python2 # That "Similar command" gives the clue: # rpm -qf /usr/bin/python python-unversioned-command-3.8.5-5.fc32.noarch # rpm -ql python-unversioned-command /usr/bin/python /usr/share/man/man1/python.1.gz # With it in place the 'python' binary is found and perf builds the python binding using python3: # perf test -v python 19: 'import perf' in python : --- start --- test child forked, pid 379988 python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python' " test child finished with 0 ---- end ---- 'import perf' in python: Ok # Looking at that path: # ls -la /tmp/build/perf/python total 1864 drwxrwxr-x. 2 acme acme 60 Oct 13 16:20 . drwxrwxr-x. 18 acme acme 4420 Oct 13 16:28 .. -rwxrwxr-x. 1 acme acme 1907216 Oct 13 16:28 perf.cpython-38-x86_64-linux-gnu.so # And: # ldd ~/bin/perf | grep python libpython3.8.so.1.0 => /lib64/libpython3.8.so.1.0 (0x00007f5471187000) # As soon as we remove it: # rpm -e python-unversioned-command-3.8.5-5.fc32.noarch # hash -r # python bash: python: command not found... Install package 'python-unversioned-command' to provide command 'python'? [N/y] n # And rebuilding perf now doesn't find python in the system: make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j24' parallel build <SNIP> Makefile.config:786: No python interpreter was found: disables Python support - please install python-devel/python-dev <SNIP> After this patch: $ rpm -qi python-unversioned-command package python-unversioned-command is not installed $ $ python bash: python: command not found... Install package 'python-unversioned-command' to provide command 'python'? [N/y] ^C $ $ m make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j24' parallel build <SNIP> CC /tmp/build/perf/tests/attr.o CC /tmp/build/perf/tests/python-use.o DESCEND plugins GEN /tmp/build/perf/python/perf.so INSTALL trace_plugins LD /tmp/build/perf/tests/perf-in.o LD /tmp/build/perf/perf-in.o LINK /tmp/build/perf/perf <SNIP> make: Leaving directory '/home/acme/git/perf/tools/perf' 19: 'import perf' in python : Ok $ ldd ~/bin/perf | grep python libpython3.8.so.1.0 => /lib64/libpython3.8.so.1.0 (0x00007f2c8c708000) $ ls -la /tmp/build/perf/python total 1864 drwxrwxr-x. 2 acme acme 60 Oct 13 16:20 . drwxrwxr-x. 18 acme acme 4420 Oct 13 16:31 .. -rwxrwxr-x. 1 acme acme 1907216 Oct 13 16:31 perf.cpython-38-x86_64-linux-gnu.so $ Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> LPU-Reference: 20201005080645.6588-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf tests: Show python test script in verbose modeArnaldo Carvalho de Melo1-0/+1
To help figure out where it is getting the binding. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf build: Allow nested externs to enable BUILD_BUG() usageVasily Gorbik1-1/+1
Currently BUILD_BUG() macro is expanded to smth like the following: do { extern void __compiletime_assert_0(void) __attribute__((error("BUILD_BUG failed"))); if (!(!(1))) __compiletime_assert_0(); } while (0); If used in a function body this obviously would produce build errors with -Wnested-externs and -Werror. To enable BUILD_BUG() usage in tools/arch/x86/lib/insn.c which perf includes in intel-pt-decoder, build perf without -Wnested-externs. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # build tested Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lore.kernel.org/lkml/patch-1.thread-251403.git-2514037e9477.your-ad-here.call-01602244460-ext-7088@work.hours Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf trace: Fix off by ones in memset() after realloc() in arches using libauditJiri Slaby1-1/+5
'perf trace ls' started crashing after commit d21cb73a9025 on !HAVE_SYSCALL_TABLE_SUPPORT configs (armv7l here) like this: 0 strlen () at ../sysdeps/arm/armv6t2/strlen.S:126 1 0xb6800780 in __vfprintf_internal (s=0xbeff9908, s@entry=0xbeff9900, format=0xa27160 "]: %s()", ap=..., mode_flags=<optimized out>) at vfprintf-internal.c:1688 ... 5 0x0056ecdc in fprintf (__fmt=0xa27160 "]: %s()", __stream=<optimized out>) at /usr/include/bits/stdio2.h:100 6 trace__sys_exit (trace=trace@entry=0xbeffc710, evsel=evsel@entry=0xd968d0, event=<optimized out>, sample=sample@entry=0xbeffc3e8) at builtin-trace.c:2475 7 0x00566d40 in trace__handle_event (sample=0xbeffc3e8, event=<optimized out>, trace=0xbeffc710) at builtin-trace.c:3122 ... 15 main (argc=2, argv=0xbefff6e8) at perf.c:538 It is because memset in trace__read_syscall_info zeroes wrong memory: 1) when initializing for the first time, it does not reset the last id. 2) in other cases, it resets the last id of previous buffer. ad 1) it causes the crash above as sc->name used in the fprintf above contains garbage. ad 2) it sets nonexistent from true back to false for id 11 here. Not sure, what the consequences are. So fix it by introducing a special case for the initial initialization and do the right +1 in both cases. Fixes: d21cb73a9025 ("perf trace: Grow the syscall table as needed when using libaudit") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20201001093419.15761-1-jslaby@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf c2c: Update usage for showing memory eventsLeo Yan1-1/+1
Since commit b027cc6fdf1b ("perf c2c: Fix 'perf c2c record -e list' to show the default events used"), "perf c2c" tool can show the memory events properly, it's no reason to still suggest user to use the command "perf mem record -e list" for showing events. This patch updates the usage for showing memory events with command "perf c2c record -e list". Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20201011121022.22409-1-leo.yan@linaro.org
2020-10-13Merge branch 'perf/urgent' into perf/coreArnaldo Carvalho de Melo4-23/+65
To pick fixes that missed v5.9. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13tools lib traceevent: Hide non API functionsTzvetomir Stoyanov (VMware)6-107/+83
There are internal library functions, which are not declared as a static. They are used inside the library from different files. Hide them from the library users, as they are not part of the API. These functions are made hidden and are renamed without the prefix "tep_": tep_free_plugin_paths tep_peek_char tep_buffer_init tep_get_input_buf_ptr tep_get_input_buf tep_read_token tep_free_token tep_free_event tep_free_format_field __tep_parse_format Link: https://lore.kernel.org/linux-trace-devel/e4afdd82deb5e023d53231bb13e08dca78085fb0.camel@decadent.org.uk/ Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: linux-trace-devel@vger.kernel.org Link: http://lore.kernel.org/lkml/20200930110733.280534-1-tz.stoyanov@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf sched: Show start of latency as wellJoel Fernandes (Google)1-10/+14
The 'perf sched latency' tool is really useful at showing worst-case latencies that task encountered since wakeup. However it shows only the end of the latency. Often times the start of a latency is interesting as it can show what else was going on at the time to cause the latency. I certainly myself spending a lot of time backtracking to the start of the latency in "perf sched script" which wastes a lot of time. This patch therefore adds a new column "Max delay start". Considering this, also rename "Maximum delay at" to "Max delay end" as its easier to understand. Example of the new output: ---------------------------------------------------------------------------------------------------------------------------------- Task | Runtime ms | Switches | Avg delay ms | Max delay ms | Max delay start | Max delay end | ---------------------------------------------------------------------------------------------------------------------------------- MediaScannerSer:11936 | 651.296 ms | 67978 | avg: 0.113 ms | max: 77.250 ms | max start: 477.691360 s | max end: 477.768610 s audio@2.0-servi:(3) | 0.000 ms | 3440 | avg: 0.034 ms | max: 72.267 ms | max start: 477.697051 s | max end: 477.769318 s AudioOut_1D:8112 | 0.000 ms | 2588 | avg: 0.083 ms | max: 64.020 ms | max start: 477.710740 s | max end: 477.774760 s Time-limited te:14973 | 7966.090 ms | 24807 | avg: 0.073 ms | max: 15.563 ms | max start: 477.162746 s | max end: 477.178309 s surfaceflinger:8049 | 9.680 ms | 603 | avg: 0.063 ms | max: 13.275 ms | max start: 476.931791 s | max end: 476.945067 s HeapTaskDaemon:(3) | 1588.830 ms | 7040 | avg: 0.065 ms | max: 6.880 ms | max start: 473.666043 s | max end: 473.672922 s mount-passthrou:(3) | 1370.809 ms | 68904 | avg: 0.011 ms | max: 6.524 ms | max start: 478.090630 s | max end: 478.097154 s ReferenceQueueD:(3) | 11.794 ms | 1725 | avg: 0.014 ms | max: 6.521 ms | max start: 476.119782 s | max end: 476.126303 s writer:14077 | 18.410 ms | 1427 | avg: 0.036 ms | max: 6.131 ms | max start: 474.169675 s | max end: 474.175805 s Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20200925235634.4089867-1-joel@joelfernandes.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf vendor events: Fix typos in power8 PMU eventsSandipan Das5-25/+25
This replaces the incorrectly spelled word "localtion" with "location" in some power8 PMU event descriptions. Fixes: 2a81fa3bb5ed ("perf vendor events: Add power8 PMU events") Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lore.kernel.org/lkml/20201012050205.328523-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf bench: Run inject-build-id with --buildid-all option tooNamhyung Kim1-19/+35
For comparison, it now runs the benchmark twice - one if regular -b and another for --buildid-all. $ perf bench internals inject-build-id # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 21.002 msec (+- 0.172 msec) Average time per event: 2.059 usec (+- 0.017 usec) Average memory usage: 8169 KB (+- 0 KB) Average build-id-all injection took: 19.543 msec (+- 0.124 msec) Average time per event: 1.916 usec (+- 0.012 usec) Average memory usage: 7348 KB (+- 0 KB) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20201012070214.2074921-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf inject: Add --buildid-all optionNamhyung Kim2-6/+113
Like 'perf record', we can even more speedup build-id processing by just using all DSOs. Then we don't need to look at all the sample events anymore. The following patch will update 'perf bench' to show the result of the --buildid-all option too. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Original-patch-by: Stephane Eranian <eranian@google.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20201012070214.2074921-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf inject: Do not load map/dso when injecting build-idNamhyung Kim3-37/+27
No need to load symbols in a DSO when injecting build-id. I guess the reason was to check the DSO is a special file like anon files. Use some helper functions in map.c to check them before reading build-id. Also pass sample event's cpumode to a new build-id event. It brought a speedup in the benchmark of 25 -> 21 msec on my laptop. Also the memory usage (Max RSS) went down by ~200 KB. # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 21.389 msec (+- 0.138 msec) Average time per event: 2.097 usec (+- 0.014 usec) Average memory usage: 8225 KB (+- 0 KB) Committer notes: Before: $ perf stat -r5 perf bench internals inject-build-id > /dev/null Performance counter stats for 'perf bench internals inject-build-id' (5 runs): 4,020.56 msec task-clock:u # 1.271 CPUs utilized ( +- 0.74% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 123,354 page-faults:u # 0.031 M/sec ( +- 0.81% ) 7,119,951,568 cycles:u # 1.771 GHz ( +- 1.74% ) (83.27%) 230,086,969 stalled-cycles-frontend:u # 3.23% frontend cycles idle ( +- 1.97% ) (83.41%) 1,168,298,765 stalled-cycles-backend:u # 16.41% backend cycles idle ( +- 1.13% ) (83.44%) 11,173,083,669 instructions:u # 1.57 insn per cycle # 0.10 stalled cycles per insn ( +- 1.58% ) (83.31%) 2,413,908,936 branches:u # 600.392 M/sec ( +- 1.69% ) (83.26%) 46,576,289 branch-misses:u # 1.93% of all branches ( +- 2.20% ) (83.31%) 3.1638 +- 0.0309 seconds time elapsed ( +- 0.98% ) $ After: $ perf stat -r5 perf bench internals inject-build-id > /dev/null Performance counter stats for 'perf bench internals inject-build-id' (5 runs): 2,379.94 msec task-clock:u # 1.473 CPUs utilized ( +- 0.18% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 62,584 page-faults:u # 0.026 M/sec ( +- 0.07% ) 2,372,389,668 cycles:u # 0.997 GHz ( +- 0.29% ) (83.14%) 106,937,862 stalled-cycles-frontend:u # 4.51% frontend cycles idle ( +- 4.89% ) (83.20%) 581,697,915 stalled-cycles-backend:u # 24.52% backend cycles idle ( +- 0.71% ) (83.47%) 3,659,692,199 instructions:u # 1.54 insn per cycle # 0.16 stalled cycles per insn ( +- 0.10% ) (83.63%) 791,372,961 branches:u # 332.518 M/sec ( +- 0.27% ) (83.39%) 10,648,083 branch-misses:u # 1.35% of all branches ( +- 0.22% ) (83.16%) 1.61570 +- 0.00172 seconds time elapsed ( +- 0.11% ) $ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Original-patch-by: Stephane Eranian <eranian@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20201012070214.2074921-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf inject: Enter namespace when reading build-idNamhyung Kim1-2/+6
It should be in a proper mnt namespace when accessing the file. I think this had no problem since the build-id was actually read from map__load() -> dso__load() already. But I'd like to change it in the following commit. Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20201012070214.2074921-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf inject: Add missing callbacks in perf_toolNamhyung Kim1-5/+31
I found some events (like PERF_RECORD_CGROUP) are not copied by perf inject due to the missing callbacks. Let's add them. While at it, I've changed the order of the callbacks to match with struct perf_tool so that we can compare them easily. Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20201012070214.2074921-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13perf bench: Add build-id injection benchmarkNamhyung Kim6-5/+471
Sometimes I can see that 'perf record' piped with 'perf inject' take a long time processing build-ids. So introduce a inject-build-id benchmark to the internals benchmark suite to measure its overhead regularly. It runs the 'perf inject' command internally and feeds the given number of synthesized events (MMAP2 + SAMPLE basically). Usage: perf bench internals inject-build-id <options> -i, --iterations <n> Number of iterations used to compute average (default: 100) -m, --nr-mmaps <n> Number of mmap events for each iteration (default: 100) -n, --nr-samples <n> Number of sample events per mmap event (default: 100) -v, --verbose be more verbose (show iteration count, DSO name, etc) By default, it measures average processing time of 100 MMAP2 events and 10000 SAMPLE events. Below is a result on my laptop. $ perf bench internals inject-build-id # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 25.789 msec (+- 0.202 msec) Average time per event: 2.528 usec (+- 0.020 usec) Average memory usage: 8411 KB (+- 7 KB) Committer testing: $ perf bench Usage: perf bench [<common options>] <collection> <benchmark> [<options>] # List of all available benchmark collections: sched: Scheduler and IPC benchmarks syscall: System call benchmarks mem: Memory access benchmarks numa: NUMA scheduling and MM benchmarks futex: Futex stressing benchmarks epoll: Epoll stressing benchmarks internals: Perf-internals benchmarks all: All benchmarks $ perf bench internals # List of available benchmarks for collection 'internals': synthesize: Benchmark perf event synthesis kallsyms-parse: Benchmark kallsyms parsing inject-build-id: Benchmark build-id injection $ perf bench internals inject-build-id # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.202 msec (+- 0.059 msec) Average time per event: 1.392 usec (+- 0.006 usec) Average memory usage: 12650 KB (+- 10 KB) Average build-id-all injection took: 12.831 msec (+- 0.071 msec) Average time per event: 1.258 usec (+- 0.007 usec) Average memory usage: 11895 KB (+- 10 KB) $ $ perf stat -r5 perf bench internals inject-build-id # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.380 msec (+- 0.056 msec) Average time per event: 1.410 usec (+- 0.006 usec) Average memory usage: 12608 KB (+- 11 KB) Average build-id-all injection took: 11.889 msec (+- 0.064 msec) Average time per event: 1.166 usec (+- 0.006 usec) Average memory usage: 11838 KB (+- 10 KB) # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.246 msec (+- 0.065 msec) Average time per event: 1.397 usec (+- 0.006 usec) Average memory usage: 12744 KB (+- 10 KB) Average build-id-all injection took: 12.019 msec (+- 0.066 msec) Average time per event: 1.178 usec (+- 0.006 usec) Average memory usage: 11963 KB (+- 10 KB) # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.321 msec (+- 0.067 msec) Average time per event: 1.404 usec (+- 0.007 usec) Average memory usage: 12690 KB (+- 10 KB) Average build-id-all injection took: 11.909 msec (+- 0.041 msec) Average time per event: 1.168 usec (+- 0.004 usec) Average memory usage: 11938 KB (+- 10 KB) # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.287 msec (+- 0.059 msec) Average time per event: 1.401 usec (+- 0.006 usec) Average memory usage: 12864 KB (+- 10 KB) Average build-id-all injection took: 11.862 msec (+- 0.058 msec) Average time per event: 1.163 usec (+- 0.006 usec) Average memory usage: 12103 KB (+- 10 KB) # Running 'internals/inject-build-id' benchmark: Average build-id injection took: 14.402 msec (+- 0.053 msec) Average time per event: 1.412 usec (+- 0.005 usec) Average memory usage: 12876 KB (+- 10 KB) Average build-id-all injection took: 11.826 msec (+- 0.061 msec) Average time per event: 1.159 usec (+- 0.006 usec) Average memory usage: 12111 KB (+- 10 KB) Performance counter stats for 'perf bench internals inject-build-id' (5 runs): 4,267.48 msec task-clock:u # 1.502 CPUs utilized ( +- 0.14% ) 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 102,092 page-faults:u # 0.024 M/sec ( +- 0.08% ) 3,894,589,578 cycles:u # 0.913 GHz ( +- 0.19% ) (83.49%) 140,078,421 stalled-cycles-frontend:u # 3.60% frontend cycles idle ( +- 0.77% ) (83.34%) 948,581,189 stalled-cycles-backend:u # 24.36% backend cycles idle ( +- 0.46% ) (83.25%) 5,835,587,719 instructions:u # 1.50 insn per cycle # 0.16 stalled cycles per insn ( +- 0.21% ) (83.24%) 1,267,423,636 branches:u # 296.996 M/sec ( +- 0.22% ) (83.12%) 17,484,290 branch-misses:u # 1.38% of all branches ( +- 0.12% ) (83.55%) 2.84176 +- 0.00222 seconds time elapsed ( +- 0.08% ) $ Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20201012070214.2074921-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-07perf stat: Fix out of bounds CPU map access when handling armv8_pmu eventsNamhyung Kim1-0/+3
It was reported that 'perf stat' crashed when using with armv8_pmu (CPU) events with the task mode. As 'perf stat' uses an empty cpu map for task mode but armv8_pmu has its own cpu mask, it has confused which map it should use when accessing file descriptors and this causes segfaults: (gdb) bt #0 0x0000000000603fc8 in perf_evsel__close_fd_cpu (evsel=<optimized out>, cpu=<optimized out>) at evsel.c:122 #1 perf_evsel__close_cpu (evsel=evsel@entry=0x716e950, cpu=7) at evsel.c:156 #2 0x00000000004d4718 in evlist__close (evlist=0x70a7cb0) at util/evlist.c:1242 #3 0x0000000000453404 in __run_perf_stat (argc=3, argc@entry=1, argv=0x30, argv@entry=0xfffffaea2f90, run_idx=119, run_idx@entry=1701998435) at builtin-stat.c:929 #4 0x0000000000455058 in run_perf_stat (run_idx=1701998435, argv=0xfffffaea2f90, argc=1) at builtin-stat.c:947 #5 cmd_stat (argc=1, argv=0xfffffaea2f90) at builtin-stat.c:2357 #6 0x00000000004bb888 in run_builtin (p=p@entry=0x9764b8 <commands+288>, argc=argc@entry=4, argv=argv@entry=0xfffffaea2f90) at perf.c:312 #7 0x00000000004bbb54 in handle_internal_command (argc=argc@entry=4, argv=argv@entry=0xfffffaea2f90) at perf.c:364 #8 0x0000000000435378 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:408 #9 main (argc=4, argv=0xfffffaea2f90) at perf.c:538 To fix this, I simply used the given cpu map unless the evsel actually is not a system-wide event (like uncore events). Fixes: 7736627b865d ("perf stat: Use affinity for closing file descriptors") Reported-by: Wei Li <liwei391@huawei.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Barry Song <song.bao.hua@hisilicon.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201007081311.1831003-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-01perf python scripting: Fix printable strings in python3 scriptsJiri Olsa1-1/+1
Hagen reported broken strings in python3 tracepoint scripts: make PYTHON=python3 perf record -e sched:sched_switch -a -- sleep 5 perf script --gen-script py perf script -s ./perf-script.py [..] sched__sched_switch 7 563231.759525792 0 swapper prev_comm=bytearray(b'swapper/7\x00\x00\x00\x00\x00\x00\x00'), prev_pid=0, prev_prio=120, prev_state=, next_comm=bytearray(b'mutex-thread-co\x00'), The problem is in the is_printable_array function that does not take the zero byte into account and claim such string as not printable, so the code will create byte array instead of string. Committer testing: After this fix: sched__sched_switch 3 484522.497072626 1158680 kworker/3:0-eve prev_comm=kworker/3:0, prev_pid=1158680, prev_prio=120, prev_state=I, next_comm=swapper/3, next_pid=0, next_prio=120 Sample: {addr=0, cpu=3, datasrc=84410401, datasrc_decode=N/A|SNP N/A|TLB N/A|LCK N/A, ip=18446744071841817196, period=1, phys_addr=0, pid=1158680, tid=1158680, time=484522497072626, transaction=0, values=[(0, 0)], weight=0} sched__sched_switch 4 484522.497085610 1225814 perf prev_comm=perf, prev_pid=1225814, prev_prio=120, prev_state=, next_comm=migration/4, next_pid=30, next_prio=0 Sample: {addr=0, cpu=4, datasrc=84410401, datasrc_decode=N/A|SNP N/A|TLB N/A|LCK N/A, ip=18446744071841817196, period=1, phys_addr=0, pid=1225814, tid=1225814, time=484522497085610, transaction=0, values=[(0, 0)], weight=0} Fixes: 249de6e07458 ("perf script python: Fix string vs byte array resolving") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Hagen Paul Pfeifer <hagen@jauu.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20200928201135.3633850-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-01perf trace: Use the autogenerated mmap 'prot' string/id tableArnaldo Carvalho de Melo2-27/+24
No change in behaviour: # perf trace -e mmap sleep 1 0.000 ( 0.009 ms): sleep/751870 mmap(len: 143317, prot: READ, flags: PRIVATE, fd: 3) = 0x7fa96d0f7000 0.028 ( 0.004 ms): sleep/751870 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7fa96d0f5000 0.037 ( 0.005 ms): sleep/751870 mmap(len: 1872744, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3) = 0x7fa96cf2b000 0.044 ( 0.011 ms): sleep/751870 mmap(addr: 0x7fa96cf50000, len: 1376256, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x25000) = 0x7fa96cf50000 0.056 ( 0.007 ms): sleep/751870 mmap(addr: 0x7fa96d0a0000, len: 307200, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x175000) = 0x7fa96d0a0000 0.064 ( 0.007 ms): sleep/751870 mmap(addr: 0x7fa96d0eb000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bf000) = 0x7fa96d0eb000 0.075 ( 0.005 ms): sleep/751870 mmap(addr: 0x7fa96d0f1000, len: 13160, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7fa96d0f1000 0.253 ( 0.005 ms): sleep/751870 mmap(len: 218049136, prot: READ, flags: PRIVATE, fd: 3) = 0x7fa95ff38000 # # # set -o vi # strace -e mmap sleep 1 mmap(NULL, 143317, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f333bd83000 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f333bd81000 mmap(NULL, 1872744, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f333bbb7000 mmap(0x7f333bbdc000, 1376256, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f333bbdc000 mmap(0x7f333bd2c000, 307200, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x175000) = 0x7f333bd2c000 mmap(0x7f333bd77000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f333bd77000 mmap(0x7f333bd7d000, 13160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f333bd7d000 mmap(NULL, 218049136, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f332ebc4000 +++ exited with 0 +++ # And you can as well tweak 'perf trace's output to more closely match strace's: # perf config trace.show_arg_names=no # perf config trace.show_duration=no # perf config trace.show_prefix=yes # perf config trace.show_timestamp=no # perf config trace.show_zeros=yes # perf config trace.no_inherit=yes # perf trace -e mmap sleep 1 mmap(NULL, 143317, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f0d287ca000 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f0d287c8000 mmap(NULL, 1872744, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f0d285fe000 mmap(0x7f0d28623000, 1376256, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f0d28623000 mmap(0x7f0d28773000, 307200, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x175000) = 0x7f0d28773000 mmap(0x7f0d287be000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f0d287be000 mmap(0x7f0d287c4000, 13160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS) = 0x7f0d287c4000 mmap(NULL, 218049136, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f0d1b60b000 # # perf config | grep ^trace trace.show_arg_names=no trace.show_duration=no trace.show_prefix=yes trace.show_timestamp=no trace.show_zeros=yes trace.no_inherit=yes # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-01tools beauty: Add script to generate table of mmap's 'prot' argumentArnaldo Carvalho de Melo1-0/+30
Will be wired up in the following csets: $ tools/perf/trace/beauty/mmap_prot.sh static const char *mmap_prot[] = { [ilog2(0x1) + 1] = "READ", #ifndef PROT_READ #define PROT_READ 0x1 #endif [ilog2(0x2) + 1] = "WRITE", #ifndef PROT_WRITE #define PROT_WRITE 0x2 #endif [ilog2(0x4) + 1] = "EXEC", #ifndef PROT_EXEC #define PROT_EXEC 0x4 #endif [ilog2(0x8) + 1] = "SEM", #ifndef PROT_SEM #define PROT_SEM 0x8 #endif [ilog2(0x01000000) + 1] = "GROWSDOWN", #ifndef PROT_GROWSDOWN #define PROT_GROWSDOWN 0x01000000 #endif [ilog2(0x02000000) + 1] = "GROWSUP", #ifndef PROT_GROWSUP #define PROT_GROWSUP 0x02000000 #endif }; $ $ $ $ tools/perf/trace/beauty/mmap_prot.sh alpha static const char *mmap_prot[] = { [ilog2(0x4) + 1] = "EXEC", #ifndef PROT_EXEC #define PROT_EXEC 0x4 #endif [ilog2(0x01000000) + 1] = "GROWSDOWN", #ifndef PROT_GROWSDOWN #define PROT_GROWSDOWN 0x01000000 #endif [ilog2(0x02000000) + 1] = "GROWSUP", #ifndef PROT_GROWSUP #define PROT_GROWSUP 0x02000000 #endif [ilog2(0x1) + 1] = "READ", #ifndef PROT_READ #define PROT_READ 0x1 #endif [ilog2(0x8) + 1] = "SEM", #ifndef PROT_SEM #define PROT_SEM 0x8 #endif [ilog2(0x2) + 1] = "WRITE", #ifndef PROT_WRITE #define PROT_WRITE 0x2 #endif }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-30perf beauty mmap_flags: Conditionaly define the mmap flagsArnaldo Carvalho de Melo1-8/+8
So that in older systems we get it in the mmap flags scnprintf routines: $ tools/perf/trace/beauty/mmap_flags.sh | head -9 2> /dev/null static const char *mmap_flags[] = { [ilog2(0x40) + 1] = "32BIT", #ifndef MAP_32BIT #define MAP_32BIT 0x40 #endif [ilog2(0x01) + 1] = "SHARED", #ifndef MAP_SHARED #define MAP_SHARED 0x01 #endif $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-29perf trace beauty: Add script to autogenerate mremap's flags args string/id ↵Arnaldo Carvalho de Melo3-19/+39
table It'll also conditionally generate the defines, so that if we don't have those when building a new tool tarball in an older systems, we get those, and we need them sometimes in the actual scnprintf routine, such as when checking if a flags means we have an extra arg, like with MREMAP_FIXED. $ tools/perf/trace/beauty/mremap_flags.sh static const char *mremap_flags[] = { [ilog2(1) + 1] = "MAYMOVE", #ifndef MREMAP_MAYMOVE #define MREMAP_MAYMOVE 1 #endif [ilog2(2) + 1] = "FIXED", #ifndef MREMAP_FIXED #define MREMAP_FIXED 2 #endif [ilog2(4) + 1] = "DONTUNMAP", #ifndef MREMAP_DONTUNMAP #define MREMAP_DONTUNMAP 4 #endif }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-29perf tools: Separate the checking of headers only used to build ↵Arnaldo Carvalho de Melo1-2/+20
beautification tables Some headers are not used in building the tools directly, but instead to generate tables that then gets source code included to do id->string and string->id lookups for things like syscall flags and commands. We were adding it directly to tools/include/ and this sometimes gets in the way of building using system headers, lets untangle this a bit. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo12-11/+106
To pick up fixes and get v5.10 development in sync with the main kernel sources. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28perf test: Fix msan uninitialized use.Ian Rogers1-1/+1
Ensure 'st' is initialized before an error branch is taken. Fixes test "67: Parse and process metrics" with LLVM msan: ==6757==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x5570edae947d in rblist__exit tools/perf/util/rblist.c:114:2 #1 0x5570edb1c6e8 in runtime_stat__exit tools/perf/util/stat-shadow.c:141:2 #2 0x5570ed92cfae in __compute_metric tools/perf/tests/parse-metric.c:187:2 #3 0x5570ed92cb74 in compute_metric tools/perf/tests/parse-metric.c:196:9 #4 0x5570ed92c6d8 in test_recursion_fail tools/perf/tests/parse-metric.c:318:2 #5 0x5570ed92b8c8 in test__parse_metric tools/perf/tests/parse-metric.c:356:2 #6 0x5570ed8de8c1 in run_test tools/perf/tests/builtin-test.c:410:9 #7 0x5570ed8ddadf in test_and_print tools/perf/tests/builtin-test.c:440:9 #8 0x5570ed8dca04 in __cmd_test tools/perf/tests/builtin-test.c:661:4 #9 0x5570ed8dbc07 in cmd_test tools/perf/tests/builtin-test.c:807:9 #10 0x5570ed7326cc in run_builtin tools/perf/perf.c:313:11 #11 0x5570ed731639 in handle_internal_command tools/perf/perf.c:365:8 #12 0x5570ed7323cd in run_argv tools/perf/perf.c:409:2 #13 0x5570ed731076 in main tools/perf/perf.c:539:3 Fixes: commit f5a56570a3f2 ("perf test: Fix memory leaks in parse-metric test") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: clang-built-linux@googlegroups.com Link: http://lore.kernel.org/lkml/20200923210655.4143682-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28perf parse-events: Reduce casts around bp_addrIan Rogers3-7/+7
perf_event_attr bp_addr is a u64. parse-events.y parses it as a u64, but casts it to a void* and then parse-events.c casts it back to a u64. Rather than all the casts, change the type of the address to be a u64. This removes an issue noted in: https://lore.kernel.org/lkml/20200903184359.GC3495158@kernel.org/ Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200925003903.561568-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28perf test: Add expand cgroup event testNamhyung Kim4-0/+247
It'll expand given events for cgroups A, B and C. $ perf test -v expansion 69: Event expansion for cgroups : --- start --- test child forked, pid 983140 metric expr 1 / IPC for CPI metric expr instructions / cycles for IPC found event instructions found event cycles adding {instructions,cycles}:W copying metric event for cgroup 'A': instructions (idx=0) copying metric event for cgroup 'B': instructions (idx=0) copying metric event for cgroup 'C': instructions (idx=0) test child finished with 0 ---- end ---- Event expansion for cgroups: Ok Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200924124455.336326-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28perf tools: Allow creation of cgroup without openNamhyung Kim3-9/+14
This is a preparation for a test case of expanding events for multiple cgroups. Instead of using real system cgroup, the test will use fake cgroups so it needs a way to have them without a open file descriptor. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200924124455.336326-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28perf tools: Copy metric events properly when expand cgroupsNamhyung Kim8-3/+149
The metricgroup__copy_metric_events() is to handle metrics events when expanding event for cgroups. As the metric events keep pointers to evsel, it should be refreshed when events are cloned during the operation. The perf_stat__collect_metric_expr() is also called in case an event has a metric directly. During the copy, it references evsel by index as the evlist now has cloned evsels for the given cgroup. Also kernel test robot found an issue in the python module import so add empty implementations of those two functions to fix it. Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200924124455.336326-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28perf stat: Add --for-each-cgroup optionNamhyung Kim5-1/+112
The --for-each-cgroup option is a syntax sugar to monitor large number of cgroups easily. Current command line requires to list all the events and cgroups even if users want to monitor same events for each cgroup. This patch addresses that usage by copying given events for each cgroup on user's behalf. For instance, if they want to monitor 6 events for 200 cgroups each they should write 1200 event names (with -e) AND 1200 cgroup names (with -G) on the command line. But with this change, they can just specify 6 events and 200 cgroups with a new option. A simpler example below: It wants to measure 3 events for 2 cgroups ('A' and 'B'). The result is that total 6 events are counted like below. $ perf stat -a -e cpu-clock,cycles,instructions --for-each-cgroup A,B sleep 1 Performance counter stats for 'system wide': 988.18 msec cpu-clock A # 0.987 CPUs utilized 3,153,761,702 cycles A # 3.200 GHz (100.00%) 8,067,769,847 instructions A # 2.57 insn per cycle (100.00%) 982.71 msec cpu-clock B # 0.982 CPUs utilized 3,136,093,298 cycles B # 3.182 GHz (99.99%) 8,109,619,327 instructions B # 2.58 insn per cycle (99.99%) 1.001228054 seconds time elapsed Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200924124455.336326-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28perf evsel: Add evsel__clone() functionNamhyung Kim2-39/+158
The evsel__clone() is to create an exactly same evsel from same attributes. The function assumes the given evsel is not configured yet so it cares fields set during event parsing. Those fields are now moved together as Jiri suggested. Note that metric events will be handled by later patch. It will be used by perf stat to generate separate events for each cgroup. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200924124455.336326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28perf vendor events: Update SkylakeX events to v1.21Jin Yao10-3565/+4129
- Update SkylakeX events to v1.21. - Update SkylakeX JSON metrics from TMAM 4.0. Other fixes: - Add NO_NMI_WATCHDOG metric constraint to Backend_Bound - Fix misspelled error Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/lkml/20200922031918.3723-1-yao.jin@linux.intel.com/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28perf vendor events intel: Update CascadelakeX events to v1.08Jin Yao8-995/+1067
- Update CascadelakeX events to v1.08. - Update CascadelakeX JSON metrics from TMAM 4.0. Other fixes: - Add NO_NMI_WATCHDOG metric constraint to Backend_Bound - Change 'MB/sec' to 'MB' in UNC_M_PMM_BANDWIDTH. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Link: https://lore.kernel.org/lkml/20200922031918.3723-1-yao.jin@linux.intel.com/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+1
Pull more kvm fixes from Paolo Bonzini: "Five small fixes. The nested migration bug will be fixed with a better API in 5.10 or 5.11, for now this is a fix that works with existing userspace but keeps the current ugly API" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SVM: Add a dedicated INVD intercept routine KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE KVM: x86: fix MSR_IA32_TSC read for nested migration selftests: kvm: Fix assert failure in single-step test KVM: x86: VMX: Make smaller physical guest address space support user-configurable
2020-09-23Merge tag 'trace-v5.9-rc5-2' of ↵Linus Torvalds1-0/+25
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull bootconfig fixes from Steven Rostedt: "A couple of fixes for bootconfig. Masami discovered two bugs which this fixes and he added tests to cover these issues. - Fix a bug that breaks bootconfig tree nodes - Fix a bug that does not truncate whitespace properly - Add tests to cover the above two cases" * tag 'trace-v5.9-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tools/bootconfig: Add testcase for tailing space tools/bootconfig: Add testcases for repeated key with brace lib/bootconfig: Fix to remove tailing spaces after value lib/bootconfig: Fix a bug of breaking existing tree nodes
2020-09-23perf script: Add min, max to futex-contention output, in addition to avgHagen Paul Pfeifer1-2/+2
Average is quite informative, but the outliners - especially max - are also of interest. Before: mutex-locker[793299] lock 5637ec61e080 contended 3400 times, 446 avg ns mutex-locker[793301] lock 5637ec61e080 contended 3563 times, 385 avg ns mutex-locker[793300] lock 5637ec61e080 contended 3110 times, 1855 avg ns After: mutex-locker[795251] lock 55b14e6dd080 contended 3853 times, 1279 avg ns [max: 12270 ns, min 340 ns] mutex-locker[795253] lock 55b14e6dd080 contended 2911 times, 518 avg ns [max: 51660261 ns, min 347 ns] mutex-locker[795252] lock 55b14e6dd080 contended 3843 times, 385 avg ns [max: 24323998 ns, min 338 ns] Committer testing: [root@five ~]# perf script record futex-contention -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.877 MB perf.data (923 samples) ] [root@five ~]# perf evlist syscalls:sys_enter_futex syscalls:sys_exit_futex dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # Before: [root@five ~]# perf script report futex-contention JS Helper[2457] lock 55fe0cf82610 contended 4 times, 6657 avg ns ibus-daemon[2975] lock 56227f6d0210 contended 4 times, 1020 avg ns chromium-browse[1905801] lock 7ffe573f5088 contended 8 times, 108463 avg ns gnome-shell[2240] lock 55fe0cf82678 contended 1 times, 8616 avg ns gnome-shel:cs0[2292] lock 55fe0d0ab768 contended 3 times, 606016034 avg ns JS Helper[2458] lock 55fe0cf82690 contended 1 times, 1167840 avg ns chromium-browse[1905470] lock 7ffe573f5358 contended 1 times, 551504 avg ns chromium-browse[1905948] lock 7ffe573f5358 contended 1 times, 577422 avg ns gnome-shell[2240] lock 55fe0cf82660 contended 6 times, 202696 avg ns pool[2602] lock 7fd600008ef0 contended 1 times, 500046007 avg ns chromium-browse[1905801] lock 7ffe573f5128 contended 4 times, 285083 avg ns JS Helper[2460] lock 55fe0cf82690 contended 1 times, 680877 avg ns JS Helper[2459] lock 55fe0cf82610 contended 7 times, 4224 avg ns chromium-browse[1905434] lock 7ffe573f5358 contended 1 times, 697038 avg ns chromium-browse[212592] lock 7ffe573f53c8 contended 4 times, 460601 avg ns gnome-shel:cs0[2292] lock 55fe0d0ab76c contended 2 times, 601237648 avg ns JS Helper[2460] lock 55fe0cf82610 contended 4 times, 3340 avg ns JS Helper[2462] lock 55fe0cf82694 contended 1 times, 237275 avg ns chromium-browse[1905605] lock 7ffe573f5358 contended 2 times, 634555 avg ns chromium-browse[1905992] lock 7ffe573f5358 contended 1 times, 583965 avg ns chromium-browse[1905647] lock 7ffe573f5368 contended 8 times, 549800 avg ns JS Helper[2462] lock 55fe0cf82610 contended 2 times, 4694 avg ns JS Helper[2461] lock 55fe0cf82694 contended 1 times, 257793 avg ns JS Helper[2456] lock 55fe0cf82690 contended 1 times, 677771 avg ns JS Helper[2463] lock 55fe0cf82610 contended 3 times, 5139 avg ns gdbus[2980] lock 56227f6d0210 contended 2 times, 2465 avg ns gnome-shell[2240] lock 55fe0cf82664 contended 5 times, 8036 avg ns chromium-browse[1906308] lock 7ffe573f5358 contended 1 times, 210735 avg ns JS Helper[2463] lock 55fe0cf82694 contended 1 times, 251531 avg ns chromium-browse[1905801] lock 7ffe573f4f58 contended 4 times, 399927 avg ns [root@five ~]# After: [root@five ~]# perf script report futex-contention JS Helper[2457] lock 55fe0cf82610 contended 4 times, 6657 avg ns [max: 11502 ns, min 792 ns] ibus-daemon[2975] lock 56227f6d0210 contended 4 times, 1020 avg ns [max: 1813 ns, min 581 ns] chromium-browse[1905801] lock 7ffe573f5088 contended 8 times, 108463 avg ns [max: 380103 ns, min 57989 ns] gnome-shell[2240] lock 55fe0cf82678 contended 1 times, 8616 avg ns [max: 8616 ns, min 8616 ns] gnome-shel:cs0[2292] lock 55fe0d0ab768 contended 3 times, 606016034 avg ns [max: 611295960 ns, min 600191357 ns] JS Helper[2458] lock 55fe0cf82690 contended 1 times, 1167840 avg ns [max: 1167840 ns, min 1167840 ns] chromium-browse[1905470] lock 7ffe573f5358 contended 1 times, 551504 avg ns [max: 551504 ns, min 551504 ns] chromium-browse[1905948] lock 7ffe573f5358 contended 1 times, 577422 avg ns [max: 577422 ns, min 577422 ns] gnome-shell[2240] lock 55fe0cf82660 contended 6 times, 202696 avg ns [max: 398998 ns, min 5050 ns] pool[2602] lock 7fd600008ef0 contended 1 times, 500046007 avg ns [max: 500046007 ns, min 500046007 ns] chromium-browse[1905801] lock 7ffe573f5128 contended 4 times, 285083 avg ns [max: 389531 ns, min 76183 ns] JS Helper[2460] lock 55fe0cf82690 contended 1 times, 680877 avg ns [max: 680877 ns, min 680877 ns] JS Helper[2459] lock 55fe0cf82610 contended 7 times, 4224 avg ns [max: 12724 ns, min 1012 ns] chromium-browse[1905434] lock 7ffe573f5358 contended 1 times, 697038 avg ns [max: 697038 ns, min 697038 ns] chromium-browse[212592] lock 7ffe573f53c8 contended 4 times, 460601 avg ns [max: 594956 ns, min 232996 ns] gnome-shel:cs0[2292] lock 55fe0d0ab76c contended 2 times, 601237648 avg ns [max: 601255863 ns, min 601219434 ns] JS Helper[2460] lock 55fe0cf82610 contended 4 times, 3340 avg ns [max: 9168 ns, min 962 ns] JS Helper[2462] lock 55fe0cf82694 contended 1 times, 237275 avg ns [max: 237275 ns, min 237275 ns] chromium-browse[1905605] lock 7ffe573f5358 contended 2 times, 634555 avg ns [max: 1024060 ns, min 245050 ns] chromium-browse[1905992] lock 7ffe573f5358 contended 1 times, 583965 avg ns [max: 583965 ns, min 583965 ns] chromium-browse[1905647] lock 7ffe573f5368 contended 8 times, 549800 avg ns [max: 775293 ns, min 258375 ns] JS Helper[2462] lock 55fe0cf82610 contended 2 times, 4694 avg ns [max: 8556 ns, min 832 ns] JS Helper[2461] lock 55fe0cf82694 contended 1 times, 257793 avg ns [max: 257793 ns, min 257793 ns] JS Helper[2456] lock 55fe0cf82690 contended 1 times, 677771 avg ns [max: 677771 ns, min 677771 ns] JS Helper[2463] lock 55fe0cf82610 contended 3 times, 5139 avg ns [max: 6873 ns, min 931 ns] gdbus[2980] lock 56227f6d0210 contended 2 times, 2465 avg ns [max: 4188 ns, min 742 ns] gnome-shell[2240] lock 55fe0cf82664 contended 5 times, 8036 avg ns [max: 13105 ns, min 401 ns] chromium-browse[1906308] lock 7ffe573f5358 contended 1 times, 210735 avg ns [max: 210735 ns, min 210735 ns] JS Helper[2463] lock 55fe0cf82694 contended 1 times, 251531 avg ns [max: 251531 ns, min 251531 ns] chromium-browse[1905801] lock 7ffe573f4f58 contended 4 times, 399927 avg ns [max: 476904 ns, min 178495 ns] [root@five ~]# Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/20200922200922.1306034-1-hagen@jauu.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-23perf script: Autopep8 futex-contentionHagen Paul Pfeifer1-23/+28
10 years leaves its mark! Python has evolved and so has its style guide. Even with vim it is getting hard to follow the no longer valid guidelines (spaces vs. tabs). Autopep8 this code to modernize it! Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Link: http://lore.kernel.org/lkml/20200921201928.799498-1-hagen@jauu.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-23perf stat: Skip duration_time in setup_system_wideJin Yao1-1/+3
Some metrics (such as DRAM_BW_Use) consists of uncore events and duration_time. For uncore events, counter->core.system_wide is true. But for duration_time, counter->core.system_wide is false so target.system_wide is set to false. Then 'enable_on_exec' is set in perf_event_attr of uncore event. Kernel will return error when trying to open the uncore event. This patch skips the duration_time in setup_system_wide then target.system_wide will be set to true for the evlist of uncore events + duration_time. Before (tested on skylake desktop): # perf stat -M DRAM_BW_Use -- sleep 1 Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (arb/event=0x84,umask=0x1/). /bin/dmesg | grep -i perf may provide additional information. After: # perf stat -M DRAM_BW_Use -- sleep 1 Performance counter stats for 'system wide': 169 arb/event=0x84,umask=0x1/ # 0.00 DRAM_BW_Use 40,427 arb/event=0x81,umask=0x1/ 1,000,902,197 ns duration_time 1.000902197 seconds time elapsed Fixes: e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200922015004.30114-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-23selftests: kvm: Fix assert failure in single-step testYang Weijiang1-1/+1
This is a follow-up patch to fix an issue left in commit: 98b0bf02738004829d7e26d6cb47b2e469aaba86 selftests: kvm: Use a shorter encoding to clear RAX With the change in the commit, we also need to modify "xor" instruction length from 3 to 2 in array ss_size accordingly to pass below check: for (i = 0; i < (sizeof(ss_size) / sizeof(ss_size[0])); i++) { target_rip += ss_size[i]; CLEAR_DEBUG(); debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP; debug.arch.debugreg[7] = 0x00000400; APPLY_DEBUG(); vcpu_run(vm, VCPU_ID); TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG && run->debug.arch.exception == DB_VECTOR && run->debug.arch.pc == target_rip && run->debug.arch.dr6 == target_dr6, "SINGLE_STEP[%d]: exit %d exception %d rip 0x%llx " "(should be 0x%llx) dr6 0x%llx (should be 0x%llx)", i, run->exit_reason, run->debug.arch.exception, run->debug.arch.pc, target_rip, run->debug.arch.dr6, target_dr6); } Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20200826015524.13251-1-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>