summaryrefslogtreecommitdiffstats
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2022-11-23perf trace: Return error if a system call doesn't existLeo Yan1-2/+2
When a system call is not detected, the reason is either because the system call ID is out of scope or failure to find the corresponding path in the sysfs, trace__read_syscall_info() returns zero. Finally, without returning an error value it introduces confusion for the caller. This patch lets the function trace__read_syscall_info() to return -EEXIST when a system call doesn't exist. Fixes: b8b1033fcaa091d8 ("perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: bpf@vger.kernel.org Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221121075237.127706-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf trace: Use macro RAW_SYSCALL_ARGS_NUM to replace numberLeo Yan1-4/+7
This patch defines a macro RAW_SYSCALL_ARGS_NUM to replace the open coded number '6'. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20221121075237.127706-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf list: Add JSON output optionIan Rogers2-67/+245
Output events and metrics in a JSON format by overriding the print callbacks. Currently other command line options aren't supported and metrics are repeated once per metric group. Committer testing: $ perf list cache List of pre-defined events (to be used in -e or -M): L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-prefetches [Hardware cache event] L1-icache-load-misses [Hardware cache event] L1-icache-loads [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] iTLB-load-misses [Hardware cache event] iTLB-loads [Hardware cache event] $ perf list --json cache [ { "Unit": "cache", "EventName": "L1-dcache-load-misses", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "L1-dcache-loads", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "L1-dcache-prefetches", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "L1-icache-load-misses", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "L1-icache-loads", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "branch-load-misses", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "branch-loads", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "dTLB-load-misses", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "dTLB-loads", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "iTLB-load-misses", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "iTLB-loads", "EventType": "Hardware cache event" } ] $ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf list: Reorganize to use callbacks to allow honouring command line optionsIan Rogers7-512/+624
Rather than controlling the list output with passed flags, add callbacks that are called when an event or metric are encountered. State is passed to the callback so that command line options can be respected, alternatively the callbacks can be changed. Fix a few bugs: - wordwrap to columns metric descriptions and expressions; - remove unnecessary whitespace after PMU event names; - the metric filter is a glob but matched using strstr which will always fail, switch to using a proper globmatch, - the detail flag gives details for extra kernel PMU events like branch-instructions. In metricgroup.c switch from struct mep being a rbtree of metricgroups containing a list of metrics, to the tree directly containing all the metrics. In general the alias for a name is passed to the print routine rather than being contained in the name with OR. Committer notes: Check the asprint() return to address this on fedora 36: util/print-events.c: In function ‘print_sdt_events’: util/print-events.c:183:33: error: ignoring return value of ‘asprintf’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result] 183 | asprintf(&evt_name, "%s@%s(%.12s)", sdt_name->s, path, bid); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors $ gcc --version | head -1 gcc (GCC) 12.2.1 20220819 (Red Hat 12.2.1-2) $ Fix ps.pmu_glob setting when dealing with *:* events, it was being left with a freed pointer that then at the end of cmd_list() would be double freed. Check if pmu_name is NULL in default_print_event() before calling strglobmatch(pmu_name, ...) to avoid a segfault. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf build: Fix LIBTRACEEVENT_DYNAMICIan Rogers2-4/+24
The tools/lib includes fixes break LIBTRACEVENT_DYNAMIC as the makefile erroneously had dependencies on building libtraceevent even when not linking with it. This change fixes the issues with LIBTRACEEVENT_DYNAMIC by making the built files optional. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221116224631.207631-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Replace data symbol test workload with datasymNamhyung Kim1-28/+1
So that it can get rid of requirement of a compiler. $ sudo ./perf test -v 109 109: Test data symbol : --- start --- test child forked, pid 844526 Recording workload... [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.354 MB /tmp/__perf_test.perf.data.GFeZO (4847 samples) ] Cleaning up files... test child finished with 0 ---- end ---- Test data symbol: Ok Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-13-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Add 'datasym' test workloadNamhyung Kim4-0/+28
The datasym workload is to check if perf mem command gets the data addresses precisely. This is needed for data symbol test. $ perf test -w datasym I had to keep the buf1 in the data section, otherwise it could end up in the BSS and was mmaped as a separate //anon region, then it was not symbolized at all. It needs to be fixed separately. Committer notes: Add a -U _FORTIFY_SOURCE to the datasym CFLAGS, as the main perf flags set it and it requires building with optimization, and this new test has a -O0. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Replace brstack test workloadNamhyung Kim1-54/+12
So that it can get rid of requirement of a compiler. Also rename the symbols to match with the perf test workload. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Acked-by: German Gomez <german.gomez@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-11-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Add 'brstack' test workloadNamhyung Kim4-0/+44
The brstack is to run different kinds of branches repeatedly. This is necessary for brstack test case to verify if it has correct branch info. $ perf test -w brstack I renamed the internal functions to have brstack_ prefix as it's too generic name. Add a -U_FORTIFY_SOURCE to the brstack CFLAGS, as the main perf flags set it and it requires building with optimization, and this new test has a -O0. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Replace arm spe fork test workload with sqrtloopNamhyung Kim1-43/+1
So that it can get rid of requirement of a compiler. I've also removed killall as it'll kill perf process now and run the test workload for 10 sec instead. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-9-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Add 'sqrtloop' test workloadNamhyung Kim4-0/+49
The sqrtloop creates a child process to run an infinite loop calling sqrt() with rand(). This is needed for ARM SPE fork test. $ perf test -w sqrtloop It can take an optional argument to specify how long it will run in seconds (default: 1). Committer notes: Explicitely ignored the sqrt() return to fix the build on systems where the compiler complains it isn't being used. And added a sqrtloop specific CFLAGS to disable optimizations to make this a bit more robust wrt dead code elimination. Doing that a -U_FORTIFY_SOURCE needs to be added, as -O0 is incompatible with it. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Replace arm callgraph fp test workload with leafloopNamhyung Kim1-31/+3
So that it can get rid of requirement of a compiler. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Add 'leafloop' test workloadNamhyung Kim4-0/+39
The leafloop workload is to run an infinite loop in the test_leaf function. This is needed for the ARM fp callgraph test to verify if it gets the correct callchains. $ perf test -w leafloop Committer notes: Add a: -U_FORTIFY_SOURCE to the leafloop CFLAGS as the main perf flags set it and it requires building with optimization, and this new test has a -O0. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-20perf test: Replace record test workload with thloopNamhyung Kim1-56/+3
So that it can get rid of requirements for a compiler. $ sudo ./perf test -v 92 92: perf record tests : --- start --- test child forked, pid 740204 Basic --per-thread mode test Basic --per-thread mode test [Success] Register capture test Register capture test [Success] Basic --system-wide mode test Basic --system-wide mode test [Success] Basic target workload test Basic target workload test [Success] test child finished with 0 ---- end ---- perf record tests: Ok Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-20perf test: Add 'thloop' test workloadNamhyung Kim4-0/+56
The thloop is similar to noploop but runs in two threads. This is needed to verify perf record --per-thread to handle multi-threaded programs properly. $ perf test -w thloop It also takes an optional argument to specify runtime in seconds (default: 1). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-20perf test: Replace pipe test workload with noploopNamhyung Kim1-45/+10
So that it can get rid of requirement of a compiler. Also define and use more local symbols to ease future changes. $ sudo ./perf test -v pipe 87: perf pipe recording and injection test : --- start --- test child forked, pid 748003 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] 748014 748014 -1 |perf [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] 99.83% perf perf [.] noploop [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] 99.85% perf perf [.] noploop [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.160 MB /tmp/perf.data.2XYPdw (4007 samples) ] 99.83% perf perf [.] noploop test child finished with 0 ---- end ---- perf pipe recording and injection test: Ok Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-20perf test: Add -w/--workload optionNamhyung Kim5-0/+83
The -w/--workload option is to run a simple workload used by testing. This adds a basic framework to run the workloads and 'noploop' workload as an example. $ perf test -w noploop The noploop does a loop doing nothing (NOP) for a second by default. It can have an optional argument to specify the time in seconds. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf build: Use tools/lib headers from install pathIan Rogers2-4/+14
Switch -I from tools/lib to the install path for the tools/lib libraries. Add the include_headers build targets to prepare target, as well as pmu-events.c compilation that dependes on libperf. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221109184914.1357295-15-irogers@google.com Link: http://lore.kernel.org/lkml/20221116072211.2837834-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf cpumap: Tidy libperf includesIan Rogers6-5/+5
Use public API when possible, don't include internal API in header files in evsel.h. Fix any related breakages. Committer note: There was one missing case, when building for arm64: arch/arm64/util/pmu.c: In function 'pmu_events_table__find': arch/arm64/util/pmu.c:18:30: error: invalid use of undefined type 'struct perf_cpu_map' 18 | if (pmu->cpus->nr != cpu__max_cpu().cpu) | ^~ Fix it by adding one more exception, including <internal/cpumap.h> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221109184914.1357295-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf thread_map: Reduce exposure of libperf internal APIIan Rogers10-9/+12
Remove unnecessary include of internal threadmap.h and refcount.h in thread_map.h. Switch to using public APIs when possible or including the internal header file in the C file. Fix a transitive dependency in openat-syscall.c broken by the clean up. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221109184914.1357295-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf expr: Tidy hashmap dependencyIan Rogers9-18/+6
hashmap.h comes from libbpf but isn't installed with its headers. Always use the header file of the code in util. Change the hashmap.h dependency in expr.h to a forward declaration, add the necessary header file includes in the C files. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221109184914.1357295-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf build: Install libsymbol locally when buildingIan Rogers4-10/+24
The perf build currently has a '-Itools/lib' on the CC command line. This causes issues as the libapi, libsubcmd, libtraceevent, libbpf and libsymbol headers are all found via this path, making it impossible to override include behavior. Change the libsymbol build mirroring the libbpf, libsubcmd, libapi, libperf and libtraceevent build, so that it is installed in a directory along with its headers. A later change will modify the include behavior. Don't build kallsyms.o as part of util as this will lead to duplicate definitions. Add kallsym's directory to the MANIFEST rather than individual files, so that the Build and Makefile are added to a source tar ball. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221109184914.1357295-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf build: Install libtraceevent locally when buildingIan Rogers2-20/+40
The perf build currently has a '-Itools/lib' on the CC command line. This causes issues as the libapi, libsubcmd, libtraceevent, libbpf headers are all found via this path, making it impossible to override include behavior. Change the libtraceevent build mirroring the libbpf, libsubcmd, libapi and libperf build, so that it is installed in a directory along with its headers. A later change will modify the include behavior. Similarly, the plugins are now installed into libtraceevent_plugins except they have no header files. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221109184914.1357295-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf build: Install libperf locally when buildingIan Rogers2-7/+15
The perf build currently has a '-Itools/lib' on the CC command line. This causes issues as the libapi, libsubcmd, libtraceevent, libbpf headers are all found via this path, making it impossible to override include behavior. Change the libperf build mirroring the libbpf, libsubcmd and libapi build, so that it is installed in a directory along with its headers. A later change will modify the include behavior. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221109184914.1357295-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf build: Install libapi locally when buildingIan Rogers2-12/+16
The perf build currently has a '-Itools/lib' on the CC command line. This causes issues as the libapi, libsubcmd, libtraceevent, libbpf headers are all found via this path, making it impossible to override include behavior. Change the libapi build mirroring the libbpf and libsubcmd build, so that it is installed in a directory along with its headers. A later change will modify the include behavior. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221109184914.1357295-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf build: Install libsubcmd locally when buildingIan Rogers2-8/+17
The perf build currently has a '-Itools/lib' on the CC command line. This causes issues as the libapi, libsubcmd, libtraceevent, libbpf headers are all found via this path, making it impossible to override include behavior. Change the libsubcmd build mirroring the libbpf build, so that it is installed in a directory along with its headers. A later change will modify the include behavior. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221109184914.1357295-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Add print_aggr_cgroup() for --for-each-cgroup and --topdownNamhyung Kim1-1/+40
Normally, --for-each-cgroup only works with AGGR_GLOBAL. However the --topdown on some cpu (e.g. Intel Skylake) converts it to the AGGR_CORE internally. To support those machines, add print_aggr_cgroup and handle the events like in print_cgroup_events(). $ perf stat -a --for-each-cgroup system.slice,user.slice --topdown sleep 1 nmi_watchdog enabled with topdown. May give wrong results. Disable with echo 0 > /proc/sys/kernel/nmi_watchdog Performance counter stats for 'system wide': retiring bad speculation frontend bound backend bound S0-D0-C0 2 system.slice 49.0% -46.6% 31.4% S0-D0-C1 2 system.slice 55.5% 8.0% 45.5% -9.0% S0-D0-C2 2 system.slice 87.8% 22.1% 30.3% -40.3% S0-D0-C3 2 system.slice 53.3% -11.9% 45.2% 13.4% S0-D0-C0 2 user.slice 123.5% 4.0% 48.5% -75.9% S0-D0-C1 2 user.slice 19.9% 6.5% 89.9% -16.3% S0-D0-C2 2 user.slice 29.9% 7.9% 71.3% -9.1 S0-D0-C3 2 user.slice 28.0% 7.2% 43.3% 21.5% 1.004136937 seconds time elapsed Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-20-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Support --for-each-cgroup and --metric-onlyNamhyung Kim1-11/+47
When we have events for each cgroup, the metric should be printed for each cgroup separately. Add print_cgroup_counter() to handle that situation properly. Also change print_metric_headers() not to print duplicate headers by checking cgroups. $ perf stat -a --for-each-cgroup system.slice,user.slice --metric-only sleep 1 Performance counter stats for 'system wide': GHz insn per cycle branch-misses of all branches system.slice 3.792 0.61 3.24% user.slice 3.661 2.32 0.37% 1.016111516 seconds time elapsed Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-19-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Factor out print_metric_{begin,end}()Namhyung Kim1-22/+34
For the metric-only case, add new functions to handle the start and the end of each metric display. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-18-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Factor out prefix displayNamhyung Kim1-28/+15
The prefix is needed for interval mode to print timestamp at the beginning of each line. But the it's tricky for the metric only mode since it doesn't print every evsel and combines the metrics into a single line. So it needed to pass 'first' argument to print_counter_aggrdata() to determine if the current event is being printed at first. This makes the code hard to read. Let's move the logic out of the function and do it in the outer print loop. This would enable further cleanups later. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-17-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Move condition to print_footer()Namhyung Kim1-2/+4
Likewise, I think it'd better to have the control inside the function, and keep the higher level function clearer. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-16-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Rework header displayNamhyung Kim1-79/+106
There are print_header() and print_interval() to print header lines before actual counter values. Also print_metric_headers() needs to be called for the metric-only case. Let's move all these logics to a single place including num_print_iv to refresh the headers for interval mode. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-15-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Remove impossible conditionNamhyung Kim1-3/+0
The print would run only if metric_only is not set, but it's already in a block that says it's in metric_only case. And there's no place to change the setting. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-14-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Cleanup interval print alignmentNamhyung Kim1-74/+91
Instead of using magic values, define symbolic constants and use them. Also add aggr_header_std[] array to simplify aggr_mode handling. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-13-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Factor out prepare_interval()Namhyung Kim1-15/+24
This logic does not print the time directly, but it just puts the timestamp in the buffer as a prefix. To reduce the confusion, factor out the code into a separate function. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Split print_metric_headers() functionNamhyung Kim1-15/+37
The print_metric_headers() shows metric headers a little bit for each mode. Split it out to make the code clearer. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-11-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Align cgroup namesNamhyung Kim1-1/+1
We don't know how long cgroup name is, but at least we can align short ones like below. $ perf stat -a --for-each-cgroup system.slice,user.slice true Performance counter stats for 'system wide': 0.13 msec cpu-clock system.slice # 0.010 CPUs utilized 4 context-switches system.slice # 31.989 K/sec 1 cpu-migrations system.slice # 7.997 K/sec 0 page-faults system.slice # 0.000 /sec 450,673 cycles system.slice # 3.604 GHz (92.41%) 161,216 instructions system.slice # 0.36 insn per cycle (92.41%) 32,678 branches system.slice # 261.332 M/sec (92.41%) 2,628 branch-misses system.slice # 8.04% of all branches (92.41%) 14.29 msec cpu-clock user.slice # 1.163 CPUs utilized 35 context-switches user.slice # 2.449 K/sec 12 cpu-migrations user.slice # 839.691 /sec 57 page-faults user.slice # 3.989 K/sec 49,683,026 cycles user.slice # 3.477 GHz (99.38%) 110,790,266 instructions user.slice # 2.23 insn per cycle (99.38%) 24,552,255 branches user.slice # 1.718 G/sec (99.38%) 127,779 branch-misses user.slice # 0.52% of all branches (99.38%) 0.012289431 seconds time elapsed Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Add before_metric argumentNamhyung Kim1-40/+42
Unfortunately, event running time, percentage and noise data are printed in different positions in normal output than CSV/JSON. I think it's better to put such details in where it actually prints. So add before_metric argument to print_noise() and print_running() and call them twice before and after the metric. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-9-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Handle bad events in abs_printout()Namhyung Kim1-41/+27
In the printout() function, it checks if the event is bad (i.e. not counted or not supported) and print the result. But it does the same what abs_printout() is doing. So add an argument to indicate the value is ok or not and use the same function in both cases. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Factor out print_counter_value() functionNamhyung Kim1-28/+53
And split it for each output mode like others. I believe it makes the code simpler and more intuitive. Now abs_printout() becomes just to call sub-functions. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Split aggr_printout() functionNamhyung Kim1-99/+121
The aggr_printout() function is to print aggr_id and count (nr). Split it for each output mode to simplify the code. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Split print_cgroup() functionNamhyung Kim1-2/+19
Likewise, split print_cgroup() for each output mode. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Split print_noise_pct() functionNamhyung Kim1-4/+23
Likewise, split print_noise_pct() for each output mode. Although it's a tiny function, more logic will be added soon so it'd be better split it and treat it in the same way. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Split print_running() functionNamhyung Kim1-10/+27
To make the code more obvious and hopefully simpler, factor out the code for each output mode - stdio, CSV, JSON. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16perf stat: Clear screen only if output file is a ttyNamhyung Kim1-0/+8
The --interval-clear option makes perf stat to clear the terminal at each interval. But it doesn't need to clear the screen when it saves to a file. Make it fail when it's enabled with the output options. $ perf stat -I 1 --interval-clear -o myfile true --interval-clear does not work with output Usage: perf stat [<options>] [<command>] -o, --output <file> output file name --log-fd <n> log output to fd, instead of stderr --interval-clear clear screen in between new interval Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-15perf pmu: Restructure print_pmu_events() to avoid memory allocationsIan Rogers1-98/+110
Previously print_pmu_events() would compute the values to be printed, place them in struct sevent, sort them and then print them. Modify the code so that struct sevent holds just the PMU and event, sort these and then in the main print loop calculate aliases for names, etc. This avoids memory allocations for copied values as they are computed then printed. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-15perf list: Simplify symbol event printingIan Rogers1-58/+21
The current code computes an array of symbol names then sorts and prints them. Use a strlist to create a list of names that is sorted and then print it. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-15perf list: Simplify cache event printingIan Rogers1-103/+27
The current code computes an array of cache names then sorts and prints them. Use a strlist to create a list of names that is sorted. Keep the hybrid names, it is unclear how to generalize it, but drop the computation of evt_pmus that is never used. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-7-irogers@google.com [ Fixed up clash with cf9f67b36303de65 ("perf print-events: Remove redundant comparison with zero")] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-15perf list: Generalize limiting to a PMU nameIan Rogers4-13/+18
Deprecate the --cputype option and add a --unit option where '--unit cpu_atom' behaves like '--cputype atom'. The --unit option can be used with arbitrary PMUs, for example: ``` $ perf list --unit msr pmu List of pre-defined events (to be used in -e or -M): msr/aperf/ [Kernel PMU event] msr/cpu_thermal_margin/ [Kernel PMU event] msr/mperf/ [Kernel PMU event] msr/pperf/ [Kernel PMU event] msr/smi/ [Kernel PMU event] msr/tsc/ [Kernel PMU event] ``` Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-15perf tracepoint: Sort events in iteratorIan Rogers1-71/+37
In print_tracepoint_events() use tracing_events__scandir_alphasort() and scandir alphasort so that the subsystem and events are sorted and don't need a secondary qsort. Locally this results in the following change: ... ext4:ext4_zero_range [Tracepoint event] - fib6:fib6_table_lookup [Tracepoint event] fib:fib_table_lookup [Tracepoint event] + fib6:fib6_table_lookup [Tracepoint event] filelock:break_lease_block [Tracepoint event] ... ie fib6 now is after fib and not before it. This is more consistent with how numbers are more generally sorted, such as: ... syscalls:sys_enter_renameat [Tracepoint event] syscalls:sys_enter_renameat2 [Tracepoint event] ... and so an improvement over the qsort approach. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>