summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2021-02-11perf daemon: Add up time for daemon/session listJiri Olsa1-0/+20
Display up time for both daemon and sessions. Example: # cat ~/.perfconfig [daemon] base=/opt/perfdata [session-cycles] run = -m 10M -e cycles --overwrite --switch-output -a [session-sched] run = -m 20M -e sched:* --overwrite --switch-output -a Starting the daemon: # perf daemon start Get the details with up time: # perf daemon -v [778315:daemon] base: /opt/perfdata output: /opt/perfdata/output lock: /opt/perfdata/lock up: 15 minutes [778316:cycles] perf record -m 20M -e cycles --overwrite --switch-output -a base: /opt/perfdata/session-cycles output: /opt/perfdata/session-cycles/output control: /opt/perfdata/session-cycles/control ack: /opt/perfdata/session-cycles/ack up: 10 minutes [778317:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a base: /opt/perfdata/session-sched output: /opt/perfdata/session-sched/output control: /opt/perfdata/session-sched/control ack: /opt/perfdata/session-sched/ack up: 2 minutes Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-18-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Use control to stop sessionJiri Olsa1-10/+46
Use the 'stop' control command to stop perf record session. If that fails, fall back to current SIGTERM/SIGKILL pair. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-17-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Add 'ping' commandJiri Olsa2-0/+157
Add a 'ping' command to verify that the 'perf record' session is up and operational. It's used in the following patches via test code to make sure 'perf record' is ready to receive signals. Example: # cat ~/.perfconfig [daemon] base=/opt/perfdata [session-cycles] run = -m 10M -e cycles --overwrite --switch-output -a [session-sched] run = -m 20M -e sched:* --overwrite --switch-output -a Start the daemon: # perf daemon start Ping all sessions: # perf daemon ping OK cycles OK sched Ping specific session: # perf daemon ping --session sched OK sched Committer notes: Fixed up bug pointed by clang: Buggy: if (!pollfd.revents & POLLIN) Correct code: if (!(pollfd.revents & POLLIN)) clang warning: builtin-daemon.c:560:6: error: logical not is only applied to the left hand side of this bitwise operator [-Werror,-Wlogical-not-parentheses] if (!pollfd.revents & POLLIN) { ^ ~ builtin-daemon.c:560:6: note: add parentheses after the '!' to evaluate the bitwise operator first Also use designated initialized with pollfd, i.e.: struct pollfd pollfd = { .events = POLLIN, }; Instead of: struct pollfd pollfd = { 0, }; To get past: builtin-daemon.c:510:30: error: missing field 'events' initializer [-Werror,-Wmissing-field-initializers] struct pollfd pollfd = { 0, }; ^ 1 error generated. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-16-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Set control fifo for sessionJiri Olsa2-1/+28
Setup control fifos for session and add --control option to session arguments. Example: # cat ~/.perfconfig [daemon] base=/opt/perfdata [session-cycles] run = -m 10M -e cycles --overwrite --switch-output -a [session-sched] run = -m 20M -e sched:* --overwrite --switch-output -a Starting the daemon: # perf daemon start Use can list control fifos with (control and ack files): # perf daemon -v [776459:daemon] base: /opt/perfdata output: /opt/perfdata/output lock: /opt/perfdata/lock [776460:cycles] perf record -m 20M -e cycles --overwrite --switch-output -a base: /opt/perfdata/session-cycles output: /opt/perfdata/session-cycles/output control: /opt/perfdata/session-cycles/control ack: /opt/perfdata/session-cycles/ack [776461:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a base: /opt/perfdata/session-sched output: /opt/perfdata/session-sched/output control: /opt/perfdata/session-sched/control ack: /opt/perfdata/session-sched/ack Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-15-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Allow only one daemon over base directoryJiri Olsa2-1/+77
Add 'lock' file under daemon base and flock it, so only one perf daemon can run on top of it. Each daemon tries to create and lock BASE/lock file, if it's successful we are sure we're the only daemon running over the BASE. Once daemon is finished, file descriptor to lock file is closed and lock is released. Example: # cat ~/.perfconfig [daemon] base=/opt/perfdata [session-cycles] run = -m 10M -e cycles --overwrite --switch-output -a [session-sched] run = -m 20M -e sched:* --overwrite --switch-output -a Starting the daemon: # perf daemon start And try once more: # perf daemon start failed: another perf daemon (pid 775594) owns /opt/perfdata will end up with an error, because there's already one running on top of /opt/perfdata. Committer notes: Provide lockf(F_TLOCK) when not available, i.e. transform: lockf(fd, F_TLOCK, 0); into: flock(fd, LOCK_EX | LOCK_NB); Which should be equivalent. Noticed when cross building to some odd Android NDK. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-14-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Add 'stop' commandJiri Olsa2-0/+35
Add 'perf daemon stop' command to stop daemon process and all running sessions. Example: # cat ~/.perfconfig [daemon] base=/opt/perfdata [session-cycles] run = -m 10M -e cycles --overwrite --switch-output -a [session-sched] run = -m 20M -e sched:* --overwrite --switch-output -a Start the daemon: # perf daemon start Stop the daemon # perf daemon stop Daemon is not running, nothing to connect to: # perf daemon connect error: Connection refused Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-13-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Add 'signal' commandJiri Olsa2-7/+77
Allow the 'perf daemon' to send SIGUSR2 to all running sessions or just to a specific session. Example: # cat ~/.perfconfig [daemon] base=/opt/perfdata [session-cycles] run = -m 10M -e cycles --overwrite --switch-output -a [session-sched] run = -m 20M -e sched:* --overwrite --switch-output -a Start the daemon: # perf daemon start Send signal to all running sessions: # perf daemon signal signal 12 sent to session 'cycles [773738]' signal 12 sent to session 'sched [773739]' Or to specific one: # perf daemon signal --session sched signal 12 sent to session 'sched [773739]' And verify signals were delivered and perf.data dumped: # cat /opt/perfdata/session-cycles/output rounding mmap pages size to 32M (8192 pages) [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2021010220382490 ] # car /opt/perfdata/session-sched/output rounding mmap pages size to 32M (8192 pages) [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2021010220382489 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2021010220393745 ] Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-12-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Add 'list' commandJiri Olsa1-3/+85
Add a 'list' command to display all running sessions. It's the default command if no other command is specified. Example: # cat ~/.perfconfig [daemon] base=/opt/perfdata [session-cycles] run = -m 10M -e cycles --overwrite --switch-output -a [session-sched] run = -m 20M -e sched:* --overwrite --switch-output -a Start the daemon: # perf daemon start List sessions: # perf daemon [771394:daemon] base: /opt/perfdata [771395:cycles] perf record -m 10M -e cycles --overwrite --switch-output -a [771396:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a List sessions with more info: # perf daemon -v [771394:daemon] base: /opt/perfdata output: /opt/perfdata/output [771395:cycles] perf record -m 10M -e cycles --overwrite --switch-output -a base: /opt/perfdata/session-cycles output: /opt/perfdata/session-cycles/output [771396:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a base: /opt/perfdata/session-sched output: /opt/perfdata/session-sched/output The 'output' file is perf record output for specific session. Note you have to stop all running perf processes manually at this point, stop command is coming in following patches. Committer notes: Fixup union initialization to overcome this in multiple older systems: 22 15.74 debian:8 : FAIL gcc version 4.9.2 (Debian 4.9.2-10+deb8u2) builtin-daemon.c: In function 'send_cmd_list': builtin-daemon.c:1386:2: error: missing initializer for field 'csv_sep' of 'struct <anonymous>' [-Werror=missing-field-initializers] }; ^ builtin-daemon.c:641:8: note: 'csv_sep' declared here char csv_sep; ^ cc1: all warnings being treated as errors Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-11-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Add signalfd supportJiri Olsa1-6/+155
Use a signalfd fd to track SIGCHLD signals as notifications for perf session termination. This way we don't need to actively check for child status, being notified if there's change. Suggested-by: Alexei Budankov <abudankov@huawei.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Add background supportJiri Olsa2-0/+66
Add support to put the daemon process in the background. It's now enabled by default and -f option is added to keep the daemon process on the console for debugging. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Add config file change checkJiri Olsa1-3/+97
Add support to detect changes to the daemon's config file triggering a re-read of the configuration when that happens. Use a inotify file descriptor plugged into the main fdarray object for polling. Example: # cat ~/.perfconfig [daemon] base=/opt/perfdata [session-cycles] run = -m 10M -e cycles --overwrite --switch-output -a Starting the daemon: # perf daemon start Check sessions: # perf daemon [772262:daemon] base: /opt/perfdata [772263:cycles] perf record -m 10M -e cycles --overwrite --switch-output -a Change '-m 10M' to '-m 20M', and check daemon log: # tail -f /opt/perfdata/output [2021-01-02 20:31:41.234045] daemon started (pid 772262) [2021-01-02 20:31:41.235072] reconfig: ruining session [cycles:772263]: -m 10M -e cycles --overwrite --switch-output -a [2021-01-02 20:32:08.310137] reconfig: session 'cycles' killed [2021-01-02 20:32:08.310847] reconfig: ruining session [cycles:772338]: -m 20M -e cycles --overwrite --switch-output -a And the session list: # perf daemon [772262:daemon] base: /opt/perfdata [772338:cycles] perf record -m 20M -e cycles --overwrite --switch-output -a Note the changed '-m 20M' option is in place. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Add config file supportJiri Olsa3-2/+393
Adding support to configure daemon with config file. Each client or server invocation of perf daemon needs to know the base directory, where all sessions data is stored. The base is defined with: daemon.base Base path for daemon data. All sessions data are stored under this path. The daemon allows to create record sessions. Each session is a record command spawned and monitored by perf daemon. The session is defined with: session-<NAME>.run Defines new record session for daemon. The value is record's command line without the 'record' keyword. Example: # cat ~/.perfconfig [daemon] base=/opt/perfdata [session-cycles] run = -m 10M -e cycles --overwrite --switch-output -a [session-sched] run = -m 20M -e sched:* --overwrite --switch-output -a The example above defines '/opt/perfdata' as the base directory and 2 record sessions. # perf daemon start [2021-01-28 19:47:33.454413] daemon started (pid 16015) [2021-01-28 19:47:33.455910] reconfig: ruining session [cycles:16016]: -m 10M -e cycles --overwrite --switch-output -a [2021-01-28 19:47:33.456599] reconfig: ruining session [sched:16017]: -m 20M -e sched:* --overwrite --switch-output -a # ps -ef | grep perf ... perf daemon start ... /home/jolsa/.../perf record -m 20M -e cycles --overwrite --switch-output -a ... /home/jolsa/.../perf record -m 20M -e sched:* --overwrite --switch-output -a The base directory is populated with: # find /opt/perfdata/ /opt/perfdata/ /opt/perfdata/control <- control socket /opt/perfdata/session-cycles <- data for session 'cycles': /opt/perfdata/session-cycles/output <- perf record output /opt/perfdata/session-cycles/perf.data <- perf data /opt/perfdata/session-sched <- ditto for session 'sched' /opt/perfdata/session-sched/output /opt/perfdata/session-sched/perf.data Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11perf daemon: Add client socket supportJiri Olsa1-0/+138
Add support for client socket side that will be used to send commands to the daemon server socket. This patch adds only the core support, all commands using this functionality are coming in the following patches. Committer notes: Hat to patch patch it to deal with this in some systems: cc1: warnings being treated as errors builtin-daemon.c: In function 'send_cmd': MKDIR /tmp/build/perf/bench/ builtin-daemon.c:1368: error: ignoring return value of 'fwrite', declared with attribute warn_unused_result MKDIR /tmp/build/perf/tests/ make[3]: *** [/tmp/build/perf/builtin-daemon.o] Error 1 And also to not leak the 'line' buffer allocated by getline(), since you initialized line to NULL and len to zero, man page says: If *lineptr is set to NULL and *n is set 0 before the call, then getline() will allocate a buffer for storing the line. This buffer should be freed by the user program even if getline() failed. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-09perf daemon: Add server socket supportJiri Olsa1-1/+116
Add support to create a server socket that listens for client commands and processes them. This patch adds only the core support, all commands using this functionality are coming in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-09perf daemon: Add base optionJiri Olsa2-0/+15
Add a base option allowing the user to specify a base directory. It will have precedence over config file base definition coming in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-09perf daemon: Add config optionJiri Olsa2-0/+48
Add a config option and base functionality that takes the option argument (if specified) and other system config locations and produces an 'acting' config file path. The actual config file processing is coming in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-09perf daemon: Add daemon commandJiri Olsa6-0/+130
Add a daemon skeleton with a minimal base (non) functionality, covering various setup in start command. Add an initial perf-daemon.txt with basic info. This is in response to pople asking for the possibility to be able run record long running sessions on the background. The patchset that starts with this adds support to configure and run record sessions on background via new 'perf daemon' command. This is useful for being able to use perf as a flight recorder that one can interact with asking for events to be enabled or disabled, added or removed, etc. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20210208200908.1019149-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-09perf script: Simplify bool conversionYang Li1-2/+2
Fix the following coccicheck warning: ./tools/perf/builtin-script.c:2789:36-41: WARNING: conversion to bool not needed here ./tools/perf/builtin-script.c:3237:48-53: WARNING: conversion to bool not needed here Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/1612773936-98691-1-git-send-email-yang.lee@linux.alibaba.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-09perf arm64/s390: Fix printf conversion specifier for IP addressesArnaldo Carvalho de Melo2-2/+4
We need to use "%#" PRIx64 for u64 values, not "%lx". In arm64's and s390x cases the compiler doesn't complain, but lets fix this in case this code gets copied to a 32-bit arch, like with powerpc 32-bit that got fixed in the previous patch. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hewenliang <hewenliang4@huawei.com> Cc: Hu Shiyuan <hushiyuan@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-09perf powerpc: Fix printf conversion specifier for IP addressesArnaldo Carvalho de Melo1-1/+2
We need to use "%#" PRIx64 for u64 values, not "%lx", fixing this build problem on powerpc 32-bit: 72 13.69 ubuntu:18.04-x-powerpc : FAIL powerpc-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 arch/powerpc/util/machine.c: In function 'arch__symbols__fixup_end': arch/powerpc/util/machine.c:23:12: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'u64 {aka long long unsigned int}' [-Werror=format=] pr_debug4("%s sym:%s end:%#lx\n", __func__, p->name, p->end); ^ /git/linux/tools/perf/util/debug.h:18:21: note: in definition of macro 'pr_fmt' #define pr_fmt(fmt) fmt ^~~ /git/linux/tools/perf/util/debug.h:33:29: note: in expansion of macro 'pr_debugN' #define pr_debug4(fmt, ...) pr_debugN(4, pr_fmt(fmt), ##__VA_ARGS__) ^~~~~~~~~ /git/linux/tools/perf/util/debug.h:33:42: note: in expansion of macro 'pr_fmt' #define pr_debug4(fmt, ...) pr_debugN(4, pr_fmt(fmt), ##__VA_ARGS__) ^~~~~~ arch/powerpc/util/machine.c:23:2: note: in expansion of macro 'pr_debug4' pr_debug4("%s sym:%s end:%#lx\n", __func__, p->name, p->end); ^~~~~~~~~ cc1: all warnings being treated as errors /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed make[5]: *** [util] Error 2 /git/linux/tools/build/Makefile.build:139: recipe for target 'powerpc' failed make[4]: *** [powerpc] Error 2 /git/linux/tools/build/Makefile.build:139: recipe for target 'arch' failed make[3]: *** [arch] Error 2 73 30.47 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 Fixes: 557c3eadb7712741 ("perf powerpc: Fix gap between kernel end and module start") Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf script: Support filtering by hex addressJin Yao5-1/+98
'perf script' supports '-S' or '--symbol' options to only list the records for these symbols. A symbol is typically a name or hex address. If it's hex address, it is the start address of one symbol. While it would be useful if we can filter trace records by any hex address (not only the start address of symbol). So now we support filtering trace records by more conditions, such as: - symbol name - start address of symbol - any hexadecimal address - address range The comparison order is defined as: 1. symbol name comparison 2. symbol start address comparison. 3. any hexadecimal address comparison. 4. address range comparison. The idea is if we can get a valid address from -S list, we add the address to addr_list for address comparison otherwise we still leave it to sym_list for symbol comparison. Some examples: root@kbl-ppc:~# ./perf script -S ffffffff9a477308 perf 8562 [000] 347303.578858: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms]) perf 8562 [000] 347303.578860: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms]) perf 8562 [000] 347303.578861: 11 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms]) perf 8562 [001] 347303.578903: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms]) perf 8562 [001] 347303.578905: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms]) perf 8562 [001] 347303.578906: 15 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms]) perf 8562 [002] 347303.578952: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms]) perf 8562 [002] 347303.578953: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms]) Filter the traced records by hex address ffffffff9a477308. root@kbl-ppc:~# ./perf script -S ffffffff9a4dd4ce,ffffffff9a4d2de9,ffffffff9a6bf9f4 perf 8562 [001] 347303.578911: 311706 cycles: ffffffff9a6bf9f4 __kmalloc_node+0x204 ([kernel.kallsyms]) perf 8562 [002] 347303.578960: 354477 cycles: ffffffff9a4d2de9 sched_setaffinity+0x49 ([kernel.kallsyms]) perf 8562 [003] 347303.579015: 450958 cycles: ffffffff9a4dd4ce dequeue_task_fair+0x1ae ([kernel.kallsyms]) Filter the traced records by hex address ffffffff9a4dd4ce, ffffffff9a4d2de9, ffffffff9a6bf9f4. root@kbl-ppc:~# ./perf script -S ffffffff9a477309 --addr-range 16 perf 8562 [000] 347303.578863: 291 cycles: ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms]) perf 8562 [001] 347303.578907: 411 cycles: ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms]) perf 8562 [002] 347303.578956: 462 cycles: ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms]) perf 8562 [003] 347303.579010: 497 cycles: ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms]) perf 8562 [004] 347303.579059: 429 cycles: ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms]) perf 8562 [005] 347303.579109: 408 cycles: ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms]) perf 8562 [006] 347303.579159: 460 cycles: ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms]) perf 8562 [007] 347303.579213: 436 cycles: ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms]) Filter the traced records from address range [ffffffff9a477309, ffffffff9a477309 + 15]. root@kbl-ppc:~# ./perf script -S "ffffffff9b163046,rcu_nmi_exit" perf 8562 [004] 347303.579060: 12013 cycles: ffffffff9b163046 exc_nmi+0x166 ([kernel.kallsyms]) perf 8562 [007] 347303.579214: 12138 cycles: ffffffff9b165944 rcu_nmi_exit+0x34 ([kernel.kallsyms]) Filter by address + symbol Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210207080935.31784-2-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf intlist: Change 'struct intlist' int member to 'unsigned long'Jin Yao3-17/+22
This is to let intlist support addresses as its payload. One potential problem is it can't support negative number. But so far, there is no such kind of use case. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210207080935.31784-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf stat: Use nftw() instead of ftw()Paul Cercueil1-4/+4
ftw() has been obsolete for about 12 years now. Committer notes: Further notes provided by the patch author: "NOTE: Not runtime-tested, I have no idea what I need to do in perf to test this. But at least it compiles now with my uClibc-based toolchain." I looked at the nftw()/ftw() man page and for the use made with cgroups in 'perf stat' the end result is equivalent. Fixes: bb1c15b60b98 ("perf stat: Support regex pattern in --for-each-cgroup") Signed-off-by: Paul Cercueil <paul@crapouillou.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: od@zcrc.me Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20210208181157.1324550-1-paul@crapouillou.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf tools: Update topdown documentation for Sapphire RapidsKan Liang1-4/+74
Update Topdown extension on Sapphire Rapids and how to collect the L2 events. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/1612296553-21962-10-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf stat: Support L2 Topdown eventsKan Liang5-4/+149
The TMA method level 2 metrics is supported from the Intel Sapphire Rapids server, which expose four L2 Topdown metrics events to user space. There are eight L2 events in total. The other four L2 Topdown metrics events are calculated from the corresponding L1 and the exposed L2 events. Now, the --topdown prints the complete top-down metrics that supported by the CPU. For the Intel Sapphire Rapids server, there are 4 L1 events and 8 L2 events displyed in one line. Add a new option, --td-level, to display the top-down statistics that equal to or lower than the input level. The L2 event is marked only when both its L1 parent event and itself crosse the threshold. Here is an example: $ perf stat --topdown --td-level=2 --no-metric-only sleep 1 Topdown accuracy may decrease when measuring long periods. Please print the result regularly, e.g. -I1000 Performance counter stats for 'sleep 1': 16,734,390 slots 2,100,001 topdown-retiring # 12.6% retiring 2,034,376 topdown-bad-spec # 12.3% bad speculation 4,003,128 topdown-fe-bound # 24.1% frontend bound 328,125 topdown-heavy-ops # 2.0% heavy operations # 10.6% light operations 1,968,751 topdown-br-mispredict # 11.9% branch mispredict # 0.4% machine clears 2,953,127 topdown-fetch-lat # 17.8% fetch latency # 6.3% fetch bandwidth 5,906,255 topdown-mem-bound # 35.6% memory bound # 15.4% core bound Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/1612296553-21962-9-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf test: Support PERF_SAMPLE_WEIGHT_STRUCTKan Liang1-3/+11
Support the new sample type for sample-parsing test case. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/1612296553-21962-8-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf report: Support instruction latencyKan Liang10-11/+81
The instruction latency information can be recorded on some platforms, e.g., the Intel Sapphire Rapids server. With both memory latency (weight) and the new instruction latency information, users can easily locate the expensive load instructions, and also understand the time spent in different stages. The users can optimize their applications in different pipeline stages. The 'weight' field is shared among different architectures. Reusing the 'weight' field may impacts other architectures. Add a new field to store the instruction latency. Like the 'weight' support, introduce a 'ins_lat' for the global instruction latency, and a 'local_ins_lat' for the local instruction latency version. Add new sort functions, INSTR Latency and Local INSTR Latency, accordingly. Add local_ins_lat to the default_mem_sort_order[]. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/1612296553-21962-7-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf tools: Support PERF_SAMPLE_WEIGHT_STRUCTKan Liang8-10/+61
The new sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the PERF_SAMPLE_WEIGHT sample type. Users can apply either the PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type to retrieve the sample weight, but they cannot apply both sample types simultaneously. The new sample type shares the same space as the PERF_SAMPLE_WEIGHT sample type. The lower 32 bits are exactly the same for both sample type. The higher 32 bits may be different for different architecture. Add arch specific arch_evsel__set_sample_weight() to set the new sample type for X86. Only store the lower 32 bits for the sample->weight if the new sample type is applied. In practice, no memory access could last than 4G cycles. No data will be lost. If the kernel doesn't support the new sample type. Fall back to the PERF_SAMPLE_WEIGHT sample type. There is no impact for other architectures. Committer notes: Fixup related to PERF_SAMPLE_CODE_PAGE_SIZE, present in acme/perf/core but not upstream yet. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/1612296553-21962-6-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf c2c: Support data block and addr blockKan Liang3-0/+11
'perf c2c' is also a memory profiling tool. Apply the two new data source fields to 'perf c2c' as well. Extend 'perf c2c' to display the number of loads which blocked by data or address conflict. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/1612296553-21962-5-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf tools: Support data block and addr blockKan Liang8-4/+70
Two new data source fields, to indicate the block reasons of a load instruction, are introduced on the Intel Sapphire Rapids server. The fields can be used by the memory profiling. Add a new sort function, SORT_MEM_BLOCKED, for the two fields. For the previous platforms or the block reason is unknown, print "N/A" for the block reason. Add blocked as a default mem sort key for perf report and perf mem report. Committer testing: So in machines without this capability we get a "N/A" filling the new "Blocked" column: $ perf mem record ls arch certs CREDITS Documentation include ipc Kconfig lib MAINTAINERS mm samples security usr block COPYING crypto drivers fs init Kbuild kernel LICENSES Makefile net README scripts sound tools virt [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.008 MB perf.data (17 samples) ] $ $ perf mem report --stdio # To display the perf.data header info, please use --header/--header-only options. # # Total Lost Samples: 0 # # Samples: 6 of event 'cpu/mem-loads,ldlat=30/Pu' # Total weight : 1381 # Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked # # Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol Data Object Snoop TLB access Locked Blocked # ........ ....... ............ .................... ....................... ............. ...................... ............ ..... ............ ...... ....... # 32.87% 1 454 Local RAM or RAM hit [.] _dl_relocate_object ld-2.31.so [.] 0x00007fe91cef3078 libc-2.31.so Hit L1 or L2 hit No N/A 25.56% 1 353 LFB or LFB hit [.] strcmp ld-2.31.so [.] 0x00005586973855ca ls None L1 or L2 hit No N/A 22.59% 1 312 LFB or LFB hit [.] _dl_cache_libcmp ld-2.31.so [.] 0x00007fe91d0e3b18 ld.so.cache None L1 or L2 hit No N/A 8.47% 1 117 LFB or LFB hit [.] _dl_relocate_object ld-2.31.so [.] 0x00007fe91ceee570 libc-2.31.so None L1 or L2 hit No N/A 6.88% 1 95 LFB or LFB hit [.] _dl_relocate_object ld-2.31.so [.] 0x00007fe91ceed490 libc-2.31.so None L1 or L2 hit No N/A 3.62% 1 50 LFB or LFB hit [.] _dl_cache_libcmp ld-2.31.so [.] 0x00007fe91d0ebe60 ld.so.cache None L1 or L2 hit No N/A # Samples: 11 of event 'cpu/mem-stores/Pu' # Total weight : 11 # Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked # # Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol Data Object Snoop TLB access Locked Blocked # ........ ....... ............ ............. ....................... ............. ...................... ........... ..... .......... ...... ....... # 9.09% 1 0 L1 hit [.] __strcoll_l libc-2.31.so [.] 0x00007fffe5648fc8 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] _dl_lookup_symbol_x ld-2.31.so [.] 0x00007fffe56490b8 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] _dl_name_match_p ld-2.31.so [.] 0x00007fffe56487d8 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] _dl_start ld-2.31.so [.] start_time+0x0 ld-2.31.so N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] _dl_sysdep_start ld-2.31.so [.] 0x00007fffe56494b8 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] do_lookup_x ld-2.31.so [.] 0x00007fffe5648ff8 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] do_lookup_x ld-2.31.so [.] 0x00007fffe5649064 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] do_lookup_x ld-2.31.so [.] 0x00007fffe5649130 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 miss [.] _dl_start ld-2.31.so [.] _rtld_global+0xaf8 ld-2.31.so N/A N/A N/A N/A 9.09% 1 0 L1 miss [.] _dl_start ld-2.31.so [.] _rtld_global+0xc28 ld-2.31.so N/A N/A N/A N/A 9.09% 1 0 L1 miss [.] _dl_start ld-2.31.so [.] 0x00007fffe56495b8 [stack] N/A N/A N/A N/A # (Tip: Show user configuration overrides: perf config --user --list) $ Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/1612296553-21962-4-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf tools: Support the auxiliary eventKan Liang7-1/+60
On the Intel Sapphire Rapids server, an auxiliary event has to be enabled simultaneously with the load latency event to retrieve complete Memory Info. Add X86 specific perf_mem_events__name() to handle the auxiliary event. - Users are only interested in the samples of the mem-loads event. Sample read the auxiliary event. - The auxiliary event must be in front of the load latency event in a group. Assume the second event to sample if the auxiliary event is the leader. - Add a weak is_mem_loads_aux_event() to check the auxiliary event for X86. For other ARCHs, it always return false. Parse the unique event name, mem-loads-aux, for the auxiliary event. Committer notes: According to 61b985e3e775a3a7 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids"), ENODATA is only returned by sys_perf_event_open() when used with these auxiliary events, with this in evsel__open_strerror(): case ENODATA: return scnprintf(msg, size, "Cannot collect data source with the load latency event alone. " "Please add an auxiliary event in front of the load latency event."); This is Ok at this point in time, but fragile long term, I pointed this out in the e-mail thread, requesting a follow up patch to check if ENODATA is really for this specific case. Fixed up sizeof(MEM_LOADS_AUX_NAME) bug pointed out by Namhyung. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210205152648.GC920417@kernel.org Link: http://lore.kernel.org/lkml/1612296553-21962-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08tools headers uapi: Update tools's copy of linux/perf_event.hKan Liang1-4/+50
To get the changes in these csets: 2a6c6b7d7ad346f0 ("perf/core: Add PERF_SAMPLE_WEIGHT_STRUCT") 61b985e3e775a3a7 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids") This cures the following warning during perf's build: Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h' diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Committer notes: Picked by hand as I had already merged the MMAP buildid patch that also touches perf_event.h and is also only in {acme,tip}/perf/core, not yet upstream. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/1612296553-21962-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf powerpc: Support exposing Performance Monitor Counter SPRs as part of ↵Athira Rajeev3-6/+34
extended regs To enable presenting of Performance Monitor Counter Registers (PMC1 to PMC6) as part of extended regsiters, this patch adds these to sample_reg_mask in the tool side (to use with -I? option). Simplified the PERF_REG_PMU_MASK_300/31 definition. Excluded the unsupported SPRs (MMCR3, SIER2, SIER3) from extended mask value for CPU_FTR_ARCH_300. Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08perf probe: Add protection to avoid endless loopJianlin Lv1-2/+6
if dwarf_offdie() returns NULL, the continue statement forces the next iteration of the loop without updating the 'off' variable. It will cause an endless loop in the process of traversing the compile unit. So add exception protection for looping CUs. Signed-off-by: Jianlin Lv <Jianlin.Lv@arm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: jianlin.lv@arm.com Link: http://lore.kernel.org/lkml/20210203145702.1219509-1-Jianlin.Lv@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf trace-event-info: Rename for_each_event.Ian Rogers1-5/+5
Avoid a naming conflict with for_each_event with similar code in parse-events.c, rename to for_each_event_tps. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210203052659.2975736-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo368-1550/+3502
To pick up fixes Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf powerpc: Fix gap between kernel end and module startAthira Rajeev2-0/+25
Running "perf mem report" in TUI mode fails with ENOMEM message in powerpc: failed to process sample Running with debug and verbose options points that issue is while allocating memory for sample histograms. The error path is: symbol__inc_addr_samples() -> __symbol__inc_addr_samples() -> annotated_source__histogram() symbol__inc_addr_samples() calls annotated_source__alloc_histograms () to allocate memory for sample histograms using calloc(). Here calloc() fails since the size of symbol is huge. The size of a symbol is calculated as difference between its start and end address. Example histogram allocation that fails is: sym->name is _end sym->start is 0xc0000000027a0000 sym->end is 0xc008000003890000 symbol__size(sym) is 0x80000010f0000 In the above case, the difference between sym->start (0xc0000000027a0000) and sym->end (0xc008000003890000) is huge. This is same problem as in s390 and arm64 which are fixed in commits: b9c0a64901d5 ("perf annotate: Fix s390 gap between kernel end and module start") 78886f3ed37e ("perf symbols: Fix arm64 gap between kernel start and module end") When this symbol was read first, its start and end address was set to address which matches with data from /proc/kallsyms. After symbol__new(): symbol__new: _end 0xc0000000027a0000-0xc0000000027a0000 From /proc/kallsyms: ... c000000002799370 b backtrace_flag c000000002799378 B radix_tree_node_cachep c000000002799380 B __bss_stop c0000000027a0000 B _end c008000003890000 t icmp_checkentry [ip_tables] c008000003890038 t ipt_alloc_initial_table [ip_tables] c008000003890468 T ipt_do_table [ip_tables] c008000003890de8 T ipt_unregister_table_pre_exit [ip_tables] ... Perf calls function symbols__fixup_end() which sets the end of symbol to 0xc008000003890000, which is the next address and this is the start address of first module (icmp_checkentry in above) which will make the huge symbol size of 0x80000010f0000. After symbols__fixup_end: symbols__fixup_end: sym->name: _end sym->start: 0xc0000000027a0000 sym->end: 0xc008000003890000 On powerpc, kernel text segment is located at 0xc000000000000000 whereas the modules are located at very high memory addresses, 0xc00800000xxxxxxx. Since the gap between end of kernel text segment and beginning of first module's address is high, histogram allocation using calloc fails. Fix this by detecting the kernel's last symbol and limiting the range of last kernel symbol to pagesize. Signed-off-by: Athira Rajeev<atrajeev@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-By: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1609208054-1566-1-git-send-email-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf inject jit: Add namespaces supportYonatan Goldschmidt5-22/+82
This patch fixes "perf inject --jit" to properly operate on namespaced/containerized processes: * jitdump files are generated by the process, thus they should be looked up in its mount NS. * DSOs of injected MMAP events will later be looked up in the process mount NS, so write them into its NS. * PIDs & TIDs from jitdump events need to be translated to the PID as seen by "perf record" before written into MMAP events. For a process in a different PID NS, the TID & PID given in the jitdump event are actually ignored; I use the TID & PID of the thread which mmap()ed the jitdump file. This is simplified and won't do for forks of the initial process, if they continue using the same jitdump file. Future patches might improve it. This was tested by recording a NodeJS process running with "--perf-prof", inside a Docker container, and by recording another NodeJS process running in the same namespaces as perf itself, to make sure it's not broken for non-containerized processes. Signed-off-by: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20201105015604.1726943-1-yonatan.goldschmidt@granulate.io Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf namespaces: Add 'in_pidns' to nsinfo structYonatan Goldschmidt2-2/+10
Provides an accurate mean to determine if the owner thread is in a different PID namespace. Signed-off-by: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20201105015418.1725218-1-yonatan.goldschmidt@granulate.io Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf tools: Use scandir() to iterate threads when synthesizing PERF_RECORD_ ↵Namhyung Kim1-11/+17
events Like in __event__synthesize_thread(), I think it's better to use scandir() instead of the readdir() loop. In case some malicious task continues to create new threads, the readdir() loop will run over and over to collect tids. The scandir() also has the problem but the window is much smaller since it doesn't do much work during the iteration. Also add filter_task() function as we only care the tasks. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20210202090118.2008551-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf tools: Skip PERF_RECORD_MMAP event synthesis for kernel threadsNamhyung Kim1-9/+23
To synthesize information to resolve sample IPs, it needs to scan task and mmap info from the /proc filesystem. For each process, it opens (and reads) status and maps file respectively. But as kernel threads don't have memory maps so we can skip the maps file. To find kernel threads, check "VmPeak:" line in /proc/<PID>/status file. It's about the peak virtual memory usage so only user-level tasks have that. Note that it's possible to miss the line due to partial reads. So we should double-check if it's a really kernel thread when there's no VmPeak line. Thus check "Threads:" line (which follows the VmPeak line whether or not it exists) to be sure it's read enough data - just in case of deeply nested pid namespaces or large number of supplementary groups are involved. This is for user process: $ head -40 /proc/1/status Name: systemd Umask: 0000 State: S (sleeping) Tgid: 1 Ngid: 0 Pid: 1 PPid: 0 TracerPid: 0 Uid: 0 0 0 0 Gid: 0 0 0 0 FDSize: 256 Groups: NStgid: 1 NSpid: 1 NSpgid: 1 NSsid: 1 VmPeak: 234192 kB <-- here VmSize: 169964 kB VmLck: 0 kB VmPin: 0 kB VmHWM: 29528 kB VmRSS: 6104 kB RssAnon: 2756 kB RssFile: 3348 kB RssShmem: 0 kB VmData: 19776 kB VmStk: 1036 kB VmExe: 784 kB VmLib: 9532 kB VmPTE: 116 kB VmSwap: 2400 kB HugetlbPages: 0 kB CoreDumping: 0 THP_enabled: 1 Threads: 1 <-- and here SigQ: 1/62808 SigPnd: 0000000000000000 ShdPnd: 0000000000000000 SigBlk: 7be3c0fe28014a03 SigIgn: 0000000000001000 And this is for kernel thread: $ head -20 /proc/2/status Name: kthreadd Umask: 0000 State: S (sleeping) Tgid: 2 Ngid: 0 Pid: 2 PPid: 0 TracerPid: 0 Uid: 0 0 0 0 Gid: 0 0 0 0 FDSize: 64 Groups: NStgid: 2 NSpid: 2 NSpgid: 0 NSsid: 0 Threads: 1 <-- here SigQ: 1/62808 SigPnd: 0000000000000000 ShdPnd: 0000000000000000 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20210202090118.2008551-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf tools: Use /proc/<PID>/task/<TID>/status for PERF_RECORD_ event synthesisNamhyung Kim1-11/+14
To save memory usage, it needs to reduce the number of entries in the proc filesystem. It's using /proc/<PID>/task directory to traverse threads in the process and then kernel creates /proc/<PID>/task/<TID> entries. After that it checks the thread info using the /proc/<TID>/status file rather than /proc/<PID>/task/<TID>/status. As far as I can see, they are the same and contain all the info we need. Using the latter eliminates the unnecessary /proc/<TID> entry. This can be useful especially a large number of threads are used in the system. In my experiment around 1KB of memory on average was saved for each thread (which is not a thread group leader). To do this, pass both pid and tid to perf_event_prepare_comm() if it knows them. In case it doesn't know, passing 0 as pid will do the old way. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20210202090118.2008551-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf vendor events arm64: Reference common and uarch events for A76John Garry8-150/+76
Reduce duplication in the JSONs by referencing standard events from armv8-common-and-microarch.json In general the "PublicDescription" fields are not modified when somewhat significantly worded differently than the standard. Apart from that, description and names for events slightly different to standard are changed (to standard) for consistency. Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@openeuler.org Link: https://lore.kernel.org/r/1611835236-34696-5-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf vendor events arm64: Reference common and uarch events for Ampere eMagJohn Garry7-97/+31
Reduce duplication in the JSONs by referencing standard events from armv8-common-and-microarch.json In general the "PublicDescription" fields are not modified when somewhat significantly worded differently than the standard. Apart from that, description and names for events slightly different to standard are changed (to standard) for consistency. Note that names for events 0x34 and 0x35 are non-standard and remain unchanged. Those events came from the following originally: https://github.com/AmpereComputing/ampere-centos-kernel/blob/4c2479c67bbcf35b35224db12a092b33682b181c/Documentation/arm64/eMAG-ARM-CoreImpDefined.pdf Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com> Cc: mathieu.poirier@linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@openeuler.org Link: https://lore.kernel.org/r/1611835236-34696-4-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf vendor events arm64: Add common and uarch event JSONJohn Garry1-0/+248
Add a common and microarch JSON, which can be referenced from CPU JSONs. For now, brief and public description are as event brief event description from the ARMv8 ARM [0], D7-11. The list of events is not complete, as not all events will be referenced yet. Reference document is at the following: [0] https://documentation-service.arm.com/static/5fa3bd1eb209f547eebd4141?token= Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@openeuler.org Link: https://lore.kernel.org/r/1611835236-34696-3-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf vendor events arm64: Fix Ampere eMag event typoJohn Garry1-1/+1
The "briefdescription" for event 0x35 has a typo - fix it. Fixes: d35c595bf005 ("perf vendor events arm64: Revise core JSON events for eMAG") Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@openeuler.org Link: https://lore.kernel.org/r/1611835236-34696-2-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf script: Support DSO filter like in other perf toolsJin Yao2-0/+5
Other perf tool builtins already supported a DSO filter. For example: $ perf report --dsos a,b,c which only considers symbols in these dsos. Now the DSO filter is supported in 'perf script': root@kbl-ppc:~# ./perf script --dsos "[kernel.kallsyms]" perf 18123 [000] 6142863.075104: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms]) perf 18123 [000] 6142863.075107: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms]) perf 18123 [000] 6142863.075108: 10 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms]) perf 18123 [000] 6142863.075109: 273 cycles: ffffffff9ca7730a native_write_msr+0xa ([kernel.kallsyms]) perf 18123 [000] 6142863.075110: 7684 cycles: ffffffff9ca3c9c0 native_sched_clock+0x50 ([kernel.kallsyms]) perf 18123 [000] 6142863.075112: 213017 cycles: ffffffff9d765a92 syscall_exit_to_user_mode+0x32 ([kernel.kallsyms]) perf 18123 [001] 6142863.075156: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms]) perf 18123 [001] 6142863.075158: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms]) perf 18123 [001] 6142863.075159: 17 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms]) Committer testing: $ perf script ls 2364888 29303.010949: 1 cycles:u: ffffffffa4bbc6a9 [unknown] ([unknown]) ls 2364888 29303.010957: 1 cycles:u: ffffffffa429ef48 [unknown] ([unknown]) ls 2364888 29303.010961: 1 cycles:u: ffffffffa4260133 [unknown] ([unknown]) ls 2364888 29303.010964: 5 cycles:u: ffffffffa429efad [unknown] ([unknown]) ls 2364888 29303.010967: 41 cycles:u: ffffffffa42a4586 [unknown] ([unknown]) ls 2364888 29303.010972: 435 cycles:u: ffffffffa429efe0 [unknown] ([unknown]) ls 2364888 29303.010978: 5142 cycles:u: 7f9b95bc2abf __GI___tunables_init+0x11f (/usr/lib64/ld-2.32.so) ls 2364888 29303.011006: 38551 cycles:u: ffffffffa4290f61 [unknown] ([unknown]) ls 2364888 29303.011486: 238234 cycles:u: 7f9b95bb7741 _dl_relocate_object+0xa71 (/usr/lib64/ld-2.32.so) ls 2364888 29303.011937: 415870 cycles:u: 7f9b95a1c80e __strcoll_l+0xe (/usr/lib64/libc-2.32.so) $ Before: $ perf script --dsos /usr/lib64/libc-2.32.so |& head -5 Error: unknown option `dsos' Usage: perf script [<options>] or: perf script [<options>] record <script> [<record-options>] <command> or: perf script [<options>] report <script> [script-args] $ After: $ perf script --dsos /usr/lib64/libc-2.32.so ls 2364888 29303.011937: 415870 cycles:u: 7f9b95a1c80e __strcoll_l+0xe (/usr/lib64/libc-2.32.so) $ Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210124232750.19170-2-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf tools: Fix DSO filtering when not finding a map for a sampled addressArnaldo Carvalho de Melo1-0/+2
When we lookup an address and don't find a map we should filter that sample if the user specified a list of --dso entries to filter on, fix it. Before: $ perf script sleep 274800 2843.556162: 1 cycles:u: ffffffffbb26bff4 [unknown] ([unknown]) sleep 274800 2843.556168: 1 cycles:u: ffffffffbb2b047d [unknown] ([unknown]) sleep 274800 2843.556171: 1 cycles:u: ffffffffbb2706b2 [unknown] ([unknown]) sleep 274800 2843.556174: 6 cycles:u: ffffffffbb2b0267 [unknown] ([unknown]) sleep 274800 2843.556176: 59 cycles:u: ffffffffbb2b03b1 [unknown] ([unknown]) sleep 274800 2843.556180: 691 cycles:u: ffffffffbb26bff4 [unknown] ([unknown]) sleep 274800 2843.556189: 9160 cycles:u: 7fa9550eeaa3 __GI___tunables_init+0xf3 (/usr/lib64/ld-2.32.so) sleep 274800 2843.556312: 86937 cycles:u: 7fa9550e157b _dl_lookup_symbol_x+0x4b (/usr/lib64/ld-2.32.so) $ So we have some samples we somehow didn't find in a map for, if we now do: $ perf report --stdio --dso /usr/lib64/ld-2.32.so # dso: /usr/lib64/ld-2.32.so # # Total Lost Samples: 0 # # Samples: 8 of event 'cycles:u' # Event count (approx.): 96856 # # Overhead Command Symbol # ........ ....... ........................ # 89.76% sleep [.] _dl_lookup_symbol_x 9.46% sleep [.] __GI___tunables_init 0.71% sleep [k] 0xffffffffbb26bff4 0.06% sleep [k] 0xffffffffbb2b03b1 0.01% sleep [k] 0xffffffffbb2b0267 0.00% sleep [k] 0xffffffffbb2706b2 0.00% sleep [k] 0xffffffffbb2b047d $ After this patch we get the right output with just entries for the DSOs specified in --dso: $ perf report --stdio --dso /usr/lib64/ld-2.32.so # dso: /usr/lib64/ld-2.32.so # # Total Lost Samples: 0 # # Samples: 8 of event 'cycles:u' # Event count (approx.): 96856 # # Overhead Command Symbol # ........ ....... ........................ # 89.76% sleep [.] _dl_lookup_symbol_x 9.46% sleep [.] __GI___tunables_init $ # Fixes: 96415e4d3f5fdf9c ("perf symbols: Avoid unnecessary symbol loading when dso list is specified") Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210128131209.GD775562@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf stat: Add Topdown metrics events as default eventsKan Liang5-0/+26
The Topdown Microarchitecture Analysis (TMA) Method is a structured analysis methodology to identify critical performance bottlenecks in out-of-order processors. From the Ice Lake and later platforms, the Topdown information can be retrieved from the dedicated "metrics" register, which isn't impacted by other events. Also, the Topdown metrics support both per thread/process and per core measuring. Adding Topdown metrics events as default events can enrich the default measuring information, and would not cost any extra multiplexing. Introduce arch_evlist__add_default_attrs() to allow architecture specific default events. Add the Topdown metrics events in the X86 specific arch_evlist__add_default_attrs(). Other architectures can add their own default events later separately. With the patch: $ perf stat sleep 1 Performance counter stats for 'sleep 1': 0.82 msec task-clock:u # 0.001 CPUs utilized 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 61 page-faults:u # 0.074 M/sec 319,941 cycles:u # 0.388 GHz 242,802 instructions:u # 0.76 insn per cycle 54,380 branches:u # 66.028 M/sec 4,043 branch-misses:u # 7.43% of all branches 1,585,555 slots:u # 1925.189 M/sec 238,941 topdown-retiring:u # 15.0% retiring 410,378 topdown-bad-spec:u # 25.8% bad speculation 634,222 topdown-fe-bound:u # 39.9% frontend bound 304,675 topdown-be-bound:u # 19.2% backend bound 1.001791625 seconds time elapsed 0.000000000 seconds user 0.001572000 seconds sys Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20210121133752.118327-1-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03perf test: Add parse-metric memory bandwidth testcaseJohn Garry1-0/+24
Event duration_time in a metric expression requires special handling. Improve test coverage by including a metric whose expression includes duration_time. The actual metric is a copied from the L1D_Cache_Fill_BW metric on my broadwell machine. Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxarm@openeuler.org Link: http://lore.kernel.org/lkml/1611578842-5749-1-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>