diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2018-04-02 11:06:34 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-04-02 11:06:34 -0700 |
| commit | 486adcea4a63bec206cba6f0d7f301fb945ae9d3 (patch) | |
| tree | 1cdf7fed3628390186b5493087612f3681e9e912 /tools/perf/builtin-sched.c | |
| parent | 701f3b314905ac05f09fc052c87b022825d831f2 (diff) | |
| parent | 1159e09476536250c2a0173d4298d15114df7a89 (diff) | |
| download | linux-486adcea4a63bec206cba6f0d7f301fb945ae9d3.tar.bz2 | |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"The main kernel side changes were:
- Modernize the kprobe and uprobe creation/destruction tooling ABIs:
The existing text based APIs (kprobe_events and uprobe_events in
tracefs), are naive, limited ABIs in that they require user-space
to clean up after themselves, which is both difficult and fragile
if the tool is buggy or exits unexpectedly. In other words they are
not really suited for modern, robust tooling.
So introduce a modern, file descriptor based ABI that does not have
these limitations: introduce the 'perf_kprobe' and 'perf_uprobe'
PMUs and extend the perf_event_open() syscall to create events with
a kprobe/uprobe attached to them. These [k,u]probe are associated
with this file descriptor, so they are not available in tracefs.
(Song Liu)
- Intel Cannon Lake CPU support (Harry Pan)
- Intel PT cleanups (Alexander Shishkin)
- Improve the performance of pinned/flexible event groups by using RB
trees (Alexey Budankov)
- Add PERF_EVENT_IOC_MODIFY_ATTRIBUTES which allows the modification
of hardware breakpoints, which new ABI variant massively speeds up
existing tooling that uses hardware breakpoints to instrument (and
debug) memory usage.
(Milind Chabbi, Jiri Olsa)
- Various Intel PEBS handling fixes and improvements, and other Intel
PMU improvements (Kan Liang)
- Various perf core improvements and optimizations (Peter Zijlstra)
- ... misc cleanups, fixes and updates.
There's over 200 tooling commits, here's an (imperfect) list of
highlights:
- 'perf annotate' improvements:
* Recognize and handle jumps to other functions as calls, which
improves the navigation along jumps and back. (Arnaldo Carvalho
de Melo)
* Add the 'P' hotkey in TUI annotation to dump annotation output
into a file, to ease e-mail reporting of annotation details.
(Arnaldo Carvalho de Melo)
* Add an IPC/cycles column to the TUI (Jin Yao)
* Improve s390 assembly annotation (Thomas Richter)
* Refactor the output formatting logic to better separate it into
interactive and non-interactive features and add the --stdio2
output variant to demonstrate this. (Arnaldo Carvalho de Melo)
- 'perf script' improvements:
* Add Python 3 support (Jaroslav Škarvada)
* Add --show-round-event (Jiri Olsa)
- 'perf c2c' improvements:
* Add NUMA analysis support (Jiri Olsa)
- 'perf trace' improvements:
* Improve PowerPC support (Ravi Bangoria)
- 'perf inject' improvements:
* Integrate ARM CoreSight traces (Robert Walker)
- 'perf stat' improvements:
* Add the --interval-count option (yuzhoujian)
* Add the --timeout option (yuzhoujian)
- 'perf sched' improvements (Changbin Du)
- Vendor events improvements :
* Add IBM s390 vendor events (Thomas Richter)
* Add and improve arm64 vendor events (John Garry, Ganapatrao
Kulkarni)
* Update POWER9 vendor events (Sukadev Bhattiprolu)
- Intel PT tooling improvements (Adrian Hunter)
- PMU handling improvements (Agustin Vega-Frias)
- Record machine topology in perf.data (Jiri Olsa)
- Various overwrite related cleanups (Kan Liang)
- Add arm64 dwarf post unwind support (Kim Phillips, Jean Pihet)
- ... and lots of other changes, cleanups and fixes, see the shortlog
and Git history for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
perf/x86/intel: Enable C-state residency events for Cannon Lake
perf/x86/intel: Add Cannon Lake support for RAPL profiling
perf/x86/pt, coresight: Clean up address filter structure
perf vendor events s390: Add JSON files for IBM z14
perf vendor events s390: Add JSON files for IBM z13
perf vendor events s390: Add JSON files for IBM zEC12 zBC12
perf vendor events s390: Add JSON files for IBM z196
perf vendor events s390: Add JSON files for IBM z10EC z10BC
perf mmap: Be consistent when checking for an unmaped ring buffer
perf mmap: Fix accessing unmapped mmap in perf_mmap__read_done()
perf build: Fix check-headers.sh opts assignment
perf/x86: Update rdpmc_always_available static key to the modern API
perf annotate: Use absolute addresses to calculate jump target offsets
perf annotate: Defer searching for comma in raw line till it is needed
perf annotate: Support jumping from one function to another
perf annotate: Add "_local" to jump/offset validation routines
perf python: Reference Py_None before returning it
perf annotate: Mark jumps to outher functions with the call arrow
perf annotate: Pass function descriptor to its instruction parsing routines
perf annotate: No need to calculate notes->start twice
...
Diffstat (limited to 'tools/perf/builtin-sched.c')
| -rw-r--r-- | tools/perf/builtin-sched.c | 133 |
1 files changed, 91 insertions, 42 deletions
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c index 83283fedb00f..4dfdee668b0c 100644 --- a/tools/perf/builtin-sched.c +++ b/tools/perf/builtin-sched.c @@ -254,6 +254,10 @@ struct thread_runtime { u64 total_delay_time; int last_state; + + char shortname[3]; + bool comm_changed; + u64 migrations; }; @@ -897,6 +901,37 @@ struct sort_dimension { struct list_head list; }; +/* + * handle runtime stats saved per thread + */ +static struct thread_runtime *thread__init_runtime(struct thread *thread) +{ + struct thread_runtime *r; + + r = zalloc(sizeof(struct thread_runtime)); + if (!r) + return NULL; + + init_stats(&r->run_stats); + thread__set_priv(thread, r); + + return r; +} + +static struct thread_runtime *thread__get_runtime(struct thread *thread) +{ + struct thread_runtime *tr; + + tr = thread__priv(thread); + if (tr == NULL) { + tr = thread__init_runtime(thread); + if (tr == NULL) + pr_debug("Failed to malloc memory for runtime data.\n"); + } + + return tr; +} + static int thread_lat_cmp(struct list_head *list, struct work_atoms *l, struct work_atoms *r) { @@ -1480,6 +1515,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, { const u32 next_pid = perf_evsel__intval(evsel, sample, "next_pid"); struct thread *sched_in; + struct thread_runtime *tr; int new_shortname; u64 timestamp0, timestamp = sample->time; s64 delta; @@ -1519,22 +1555,28 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, if (sched_in == NULL) return -1; + tr = thread__get_runtime(sched_in); + if (tr == NULL) { + thread__put(sched_in); + return -1; + } + sched->curr_thread[this_cpu] = thread__get(sched_in); printf(" "); new_shortname = 0; - if (!sched_in->shortname[0]) { + if (!tr->shortname[0]) { if (!strcmp(thread__comm_str(sched_in), "swapper")) { /* * Don't allocate a letter-number for swapper:0 * as a shortname. Instead, we use '.' for it. */ - sched_in->shortname[0] = '.'; - sched_in->shortname[1] = ' '; + tr->shortname[0] = '.'; + tr->shortname[1] = ' '; } else { - sched_in->shortname[0] = sched->next_shortname1; - sched_in->shortname[1] = sched->next_shortname2; + tr->shortname[0] = sched->next_shortname1; + tr->shortname[1] = sched->next_shortname2; if (sched->next_shortname1 < 'Z') { sched->next_shortname1++; @@ -1552,6 +1594,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, for (i = 0; i < cpus_nr; i++) { int cpu = sched->map.comp ? sched->map.comp_cpus[i] : i; struct thread *curr_thread = sched->curr_thread[cpu]; + struct thread_runtime *curr_tr; const char *pid_color = color; const char *cpu_color = color; @@ -1569,9 +1612,14 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, else color_fprintf(stdout, cpu_color, "*"); - if (sched->curr_thread[cpu]) - color_fprintf(stdout, pid_color, "%2s ", sched->curr_thread[cpu]->shortname); - else + if (sched->curr_thread[cpu]) { + curr_tr = thread__get_runtime(sched->curr_thread[cpu]); + if (curr_tr == NULL) { + thread__put(sched_in); + return -1; + } + color_fprintf(stdout, pid_color, "%2s ", curr_tr->shortname); + } else color_fprintf(stdout, color, " "); } @@ -1580,14 +1628,15 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel, timestamp__scnprintf_usec(timestamp, stimestamp, sizeof(stimestamp)); color_fprintf(stdout, color, " %12s secs ", stimestamp); - if (new_shortname || (verbose > 0 && sched_in->tid)) { + if (new_shortname || tr->comm_changed || (verbose > 0 && sched_in->tid)) { const char *pid_color = color; if (thread__has_color(sched_in)) pid_color = COLOR_PIDS; color_fprintf(stdout, pid_color, "%s => %s:%d", - sched_in->shortname, thread__comm_str(sched_in), sched_in->tid); + tr->shortname, thread__comm_str(sched_in), sched_in->tid); + tr->comm_changed = false; } if (sched->map.comp && new_cpu) @@ -1691,6 +1740,37 @@ static int perf_sched__process_tracepoint_sample(struct perf_tool *tool __maybe_ return err; } +static int perf_sched__process_comm(struct perf_tool *tool __maybe_unused, + union perf_event *event, + struct perf_sample *sample, + struct machine *machine) +{ + struct thread *thread; + struct thread_runtime *tr; + int err; + + err = perf_event__process_comm(tool, event, sample, machine); + if (err) + return err; + + thread = machine__find_thread(machine, sample->pid, sample->tid); + if (!thread) { + pr_err("Internal error: can't find thread\n"); + return -1; + } + + tr = thread__get_runtime(thread); + if (tr == NULL) { + thread__put(thread); + return -1; + } + + tr->comm_changed = true; + thread__put(thread); + + return 0; +} + static int perf_sched__read_events(struct perf_sched *sched) { const struct perf_evsel_str_handler handlers[] = { @@ -2200,37 +2280,6 @@ static void save_idle_callchain(struct idle_thread_runtime *itr, callchain_cursor__copy(&itr->cursor, &callchain_cursor); } -/* - * handle runtime stats saved per thread - */ -static struct thread_runtime *thread__init_runtime(struct thread *thread) -{ - struct thread_runtime *r; - - r = zalloc(sizeof(struct thread_runtime)); - if (!r) - return NULL; - - init_stats(&r->run_stats); - thread__set_priv(thread, r); - - return r; -} - -static struct thread_runtime *thread__get_runtime(struct thread *thread) -{ - struct thread_runtime *tr; - - tr = thread__priv(thread); - if (tr == NULL) { - tr = thread__init_runtime(thread); - if (tr == NULL) - pr_debug("Failed to malloc memory for runtime data.\n"); - } - - return tr; -} - static struct thread *timehist_get_thread(struct perf_sched *sched, struct perf_sample *sample, struct machine *machine, @@ -3291,7 +3340,7 @@ int cmd_sched(int argc, const char **argv) struct perf_sched sched = { .tool = { .sample = perf_sched__process_tracepoint_sample, - .comm = perf_event__process_comm, + .comm = perf_sched__process_comm, .namespaces = perf_event__process_namespaces, .lost = perf_event__process_lost, .fork = perf_sched__process_fork_event, |