Age | Commit message (Collapse) | Author | Files | Lines |
|
Like in 'perf report' and 'perf top', Add this option to limit the
number of functions it displays based on the overhead value in percent.
This affects only stdio and stdio2 output modes. Without this, it
shows very long disassembly lines for every function in the data
file. If users don't want this behavior, they can set a value in
percent to suppress that.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220502232015.697243-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Return the result from hist_entry_iter__add() directly instead of taking
this in another redundant variable.
Signed-off-by: tangmeng <tangmeng@uniontech.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220216030425.27779-1-tangmeng@uniontech.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make the --tui command line flags dependent HAVE_SLANG_SUPPORT. This was
reported as confusing in:
https://lore.kernel.org/linux-perf-users/YevaTkzdXmFKdGpc@zx-spectrum.none/
Reported-by: xaizek <xaizek@posteo.net>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: xaizek <xaizek@posteo.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220123191849.3655855-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Only perf report checked the validity of these arguments so apply the
same check to all tools that read them for consistency.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20211018134844.2627174-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The repipe argument is only used by perf inject and the all others
passes 'false'. Let's remove it from the function signature and add
__perf_session__new() to be called from perf inject directly.
This is a preparation of the change the pipe input/output.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-2-namhyung@kernel.org
[ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The "auxtrace_info" and "auxtrace" functions are not set in "tool" member of
"annotate". As a result, perf annotate does not support parsing itrace data.
Before:
# perf record -e arm_spe_0/branch_filter=1/ -a sleep 1
[ perf record: Woken up 9 times to write data ]
[ perf record: Captured and wrote 20.874 MB perf.data ]
# perf annotate --stdio
Error:
The perf.data data has no samples!
Solution:
1. Add itrace options in help,
2. Set hook functions of "id_index", "auxtrace_info" and "auxtrace" in perf_tool.
After:
# perf record --all-user -e arm_spe_0/branch_filter=1/ ls
Couldn't synthesize bpf events.
perf.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.010 MB perf.data ]
# perf annotate --stdio
Percent | Source code & Disassembly of libc-2.28.so for branch-miss (1 samples, percent: local period)
------------------------------------------------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 0000000000066180 <__getdelim@@GLIBC_2.17>:
0.00 : 66180: stp x29, x30, [sp, #-96]!
0.00 : 66184: cmp x0, #0x0
0.00 : 66188: ccmp x1, #0x0, #0x4, ne // ne = any
0.00 : 6618c: mov x29, sp
0.00 : 66190: stp x24, x25, [sp, #56]
0.00 : 66194: stp x26, x27, [sp, #72]
0.00 : 66198: str x28, [sp, #88]
0.00 : 6619c: b.eq 66450 <__getdelim@@GLIBC_2.17+0x2d0> // b.none
0.00 : 661a0: stp x22, x23, [x29, #40]
0.00 : 661a4: mov x22, x1
0.00 : 661a8: ldr w1, [x3]
0.00 : 661ac: mov w23, w2
0.00 : 661b0: stp x20, x21, [x29, #24]
0.00 : 661b4: mov x20, x3
0.00 : 661b8: mov x21, x0
0.00 : 661bc: tbnz w1, #15, 66360 <__getdelim@@GLIBC_2.17+0x1e0>
0.00 : 661c0: ldr x0, [x3, #136]
0.00 : 661c4: ldr x2, [x0, #8]
0.00 : 661c8: str x19, [x29, #16]
0.00 : 661cc: mrs x19, tpidr_el0
0.00 : 661d0: sub x19, x19, #0x700
0.00 : 661d4: cmp x2, x19
0.00 : 661d8: b.eq 663f0 <__getdelim@@GLIBC_2.17+0x270> // b.none
0.00 : 661dc: mov w1, #0x1 // #1
0.00 : 661e0: ldaxr w2, [x0]
0.00 : 661e4: cmp w2, #0x0
0.00 : 661e8: b.ne 661f4 <__getdelim@@GLIBC_2.17+0x74> // b.any
0.00 : 661ec: stxr w3, w1, [x0]
0.00 : 661f0: cbnz w3, 661e0 <__getdelim@@GLIBC_2.17+0x60>
0.00 : 661f4: b.ne 66448 <__getdelim@@GLIBC_2.17+0x2c8> // b.any
0.00 : 661f8: ldr x0, [x20, #136]
0.00 : 661fc: ldr w1, [x20]
0.00 : 66200: ldr w2, [x0, #4]
0.00 : 66204: str x19, [x0, #8]
0.00 : 66208: add w2, w2, #0x1
0.00 : 6620c: str w2, [x0, #4]
0.00 : 66210: tbnz w1, #5, 66388 <__getdelim@@GLIBC_2.17+0x208>
0.00 : 66214: ldr x19, [x29, #16]
0.00 : 66218: ldr x0, [x21]
0.00 : 6621c: cbz x0, 66228 <__getdelim@@GLIBC_2.17+0xa8>
0.00 : 66220: ldr x0, [x22]
0.00 : 66224: cbnz x0, 6623c <__getdelim@@GLIBC_2.17+0xbc>
0.00 : 66228: mov x0, #0x78 // #120
0.00 : 6622c: str x0, [x22]
0.00 : 66230: bl 20710 <malloc@plt>
0.00 : 66234: str x0, [x21]
0.00 : 66238: cbz x0, 66428 <__getdelim@@GLIBC_2.17+0x2a8>
0.00 : 6623c: ldr x27, [x20, #8]
0.00 : 66240: str x19, [x29, #16]
0.00 : 66244: ldr x19, [x20, #16]
0.00 : 66248: sub x19, x19, x27
0.00 : 6624c: cmp x19, #0x0
0.00 : 66250: b.le 66398 <__getdelim@@GLIBC_2.17+0x218>
0.00 : 66254: mov x25, #0x0 // #0
0.00 : 66258: b 662d8 <__getdelim@@GLIBC_2.17+0x158>
0.00 : 6625c: nop
0.00 : 66260: add x24, x19, x25
0.00 : 66264: ldr x3, [x22]
0.00 : 66268: add x26, x24, #0x1
0.00 : 6626c: ldr x0, [x21]
0.00 : 66270: cmp x3, x26
0.00 : 66274: b.cs 6629c <__getdelim@@GLIBC_2.17+0x11c> // b.hs, b.nlast
0.00 : 66278: lsl x3, x3, #1
0.00 : 6627c: cmp x3, x26
0.00 : 66280: csel x26, x3, x26, cs // cs = hs, nlast
0.00 : 66284: mov x1, x26
0.00 : 66288: bl 206f0 <realloc@plt>
0.00 : 6628c: cbz x0, 66438 <__getdelim@@GLIBC_2.17+0x2b8>
0.00 : 66290: str x0, [x21]
0.00 : 66294: ldr x27, [x20, #8]
0.00 : 66298: str x26, [x22]
0.00 : 6629c: mov x2, x19
0.00 : 662a0: mov x1, x27
0.00 : 662a4: add x0, x0, x25
0.00 : 662a8: bl 87390 <explicit_bzero@@GLIBC_2.25+0x50>
0.00 : 662ac: ldr x0, [x20, #8]
0.00 : 662b0: add x19, x0, x19
0.00 : 662b4: str x19, [x20, #8]
0.00 : 662b8: cbnz x28, 66410 <__getdelim@@GLIBC_2.17+0x290>
0.00 : 662bc: mov x0, x20
0.00 : 662c0: bl 73b80 <__underflow@@GLIBC_2.17>
0.00 : 662c4: cmn w0, #0x1
0.00 : 662c8: b.eq 66410 <__getdelim@@GLIBC_2.17+0x290> // b.none
0.00 : 662cc: ldp x27, x19, [x20, #8]
0.00 : 662d0: mov x25, x24
0.00 : 662d4: sub x19, x19, x27
0.00 : 662d8: mov x2, x19
0.00 : 662dc: mov w1, w23
0.00 : 662e0: mov x0, x27
0.00 : 662e4: bl 807b0 <memchr@@GLIBC_2.17>
0.00 : 662e8: cmp x0, #0x0
0.00 : 662ec: mov x28, x0
0.00 : 662f0: sub x0, x0, x27
0.00 : 662f4: csinc x19, x19, x0, eq // eq = none
0.00 : 662f8: mov x0, #0x7fffffffffffffff // #9223372036854775807
0.00 : 662fc: sub x0, x0, x25
0.00 : 66300: cmp x19, x0
0.00 : 66304: b.lt 66260 <__getdelim@@GLIBC_2.17+0xe0> // b.tstop
0.00 : 66308: adrp x0, 17f000 <sys_sigabbrev@@GLIBC_2.17+0x320>
0.00 : 6630c: ldr x0, [x0, #3624]
0.00 : 66310: mrs x2, tpidr_el0
0.00 : 66314: ldr x19, [x29, #16]
0.00 : 66318: mov w3, #0x4b // #75
0.00 : 6631c: ldr w1, [x20]
0.00 : 66320: mov x24, #0xffffffffffffffff // #-1
0.00 : 66324: str w3, [x2, x0]
0.00 : 66328: tbnz w1, #15, 66340 <__getdelim@@GLIBC_2.17+0x1c0>
0.00 : 6632c: ldr x0, [x20, #136]
0.00 : 66330: ldr w1, [x0, #4]
0.00 : 66334: sub w1, w1, #0x1
0.00 : 66338: str w1, [x0, #4]
0.00 : 6633c: cbz w1, 663b8 <__getdelim@@GLIBC_2.17+0x238>
0.00 : 66340: mov x0, x24
0.00 : 66344: ldr x28, [sp, #88]
0.00 : 66348: ldp x20, x21, [x29, #24]
0.00 : 6634c: ldp x22, x23, [x29, #40]
0.00 : 66350: ldp x24, x25, [sp, #56]
0.00 : 66354: ldp x26, x27, [sp, #72]
0.00 : 66358: ldp x29, x30, [sp], #96
0.00 : 6635c: ret
100.00 : 66360: tbz w1, #5, 66218 <__getdelim@@GLIBC_2.17+0x98>
0.00 : 66364: ldp x20, x21, [x29, #24]
0.00 : 66368: mov x24, #0xffffffffffffffff // #-1
0.00 : 6636c: ldp x22, x23, [x29, #40]
0.00 : 66370: mov x0, x24
0.00 : 66374: ldp x24, x25, [sp, #56]
0.00 : 66378: ldp x26, x27, [sp, #72]
0.00 : 6637c: ldr x28, [sp, #88]
0.00 : 66380: ldp x29, x30, [sp], #96
0.00 : 66384: ret
0.00 : 66388: mov x24, #0xffffffffffffffff // #-1
0.00 : 6638c: ldr x19, [x29, #16]
0.00 : 66390: b 66328 <__getdelim@@GLIBC_2.17+0x1a8>
0.00 : 66394: nop
0.00 : 66398: mov x0, x20
0.00 : 6639c: bl 73b80 <__underflow@@GLIBC_2.17>
0.00 : 663a0: cmn w0, #0x1
0.00 : 663a4: b.eq 66438 <__getdelim@@GLIBC_2.17+0x2b8> // b.none
0.00 : 663a8: ldp x27, x19, [x20, #8]
0.00 : 663ac: sub x19, x19, x27
0.00 : 663b0: b 66254 <__getdelim@@GLIBC_2.17+0xd4>
0.00 : 663b4: nop
0.00 : 663b8: str xzr, [x0, #8]
0.00 : 663bc: ldxr w2, [x0]
0.00 : 663c0: stlxr w3, w1, [x0]
0.00 : 663c4: cbnz w3, 663bc <__getdelim@@GLIBC_2.17+0x23c>
0.00 : 663c8: cmp w2, #0x1
0.00 : 663cc: b.le 66340 <__getdelim@@GLIBC_2.17+0x1c0>
0.00 : 663d0: mov x1, #0x81 // #129
0.00 : 663d4: mov x2, #0x1 // #1
0.00 : 663d8: mov x3, #0x0 // #0
0.00 : 663dc: mov x8, #0x62 // #98
0.00 : 663e0: svc #0x0
0.00 : 663e4: ldp x20, x21, [x29, #24]
0.00 : 663e8: ldp x22, x23, [x29, #40]
0.00 : 663ec: b 66370 <__getdelim@@GLIBC_2.17+0x1f0>
0.00 : 663f0: ldr w2, [x0, #4]
0.00 : 663f4: add w2, w2, #0x1
0.00 : 663f8: str w2, [x0, #4]
0.00 : 663fc: tbz w1, #5, 66214 <__getdelim@@GLIBC_2.17+0x94>
0.00 : 66400: mov x24, #0xffffffffffffffff // #-1
0.00 : 66404: ldr x19, [x29, #16]
0.00 : 66408: b 66330 <__getdelim@@GLIBC_2.17+0x1b0>
0.00 : 6640c: nop
0.00 : 66410: ldr x0, [x21]
0.00 : 66414: strb wzr, [x0, x24]
0.00 : 66418: ldr w1, [x20]
0.00 : 6641c: ldr x19, [x29, #16]
0.00 : 66420: b 66328 <__getdelim@@GLIBC_2.17+0x1a8>
0.00 : 66424: nop
0.00 : 66428: mov x24, #0xffffffffffffffff // #-1
0.00 : 6642c: ldr w1, [x20]
0.00 : 66430: b 66328 <__getdelim@@GLIBC_2.17+0x1a8>
0.00 : 66434: nop
0.00 : 66438: mov x24, #0xffffffffffffffff // #-1
0.00 : 6643c: ldr w1, [x20]
0.00 : 66440: ldr x19, [x29, #16]
0.00 : 66444: b 66328 <__getdelim@@GLIBC_2.17+0x1a8>
0.00 : 66448: bl e3ba0 <pthread_setcanceltype@@GLIBC_2.17+0x30>
0.00 : 6644c: b 661f8 <__getdelim@@GLIBC_2.17+0x78>
0.00 : 66450: adrp x0, 17f000 <sys_sigabbrev@@GLIBC_2.17+0x320>
0.00 : 66454: ldr x0, [x0, #3624]
0.00 : 66458: mrs x1, tpidr_el0
0.00 : 6645c: mov w2, #0x16 // #22
0.00 : 66460: mov x24, #0xffffffffffffffff // #-1
0.00 : 66464: str w2, [x1, x0]
0.00 : 66468: b 66370 <__getdelim@@GLIBC_2.17+0x1f0>
0.00 : 6646c: ldr w1, [x20]
0.00 : 66470: mov x4, x0
0.00 : 66474: tbnz w1, #15, 6648c <__getdelim@@GLIBC_2.17+0x30c>
0.00 : 66478: ldr x0, [x20, #136]
0.00 : 6647c: ldr w1, [x0, #4]
0.00 : 66480: sub w1, w1, #0x1
0.00 : 66484: str w1, [x0, #4]
0.00 : 66488: cbz w1, 66494 <__getdelim@@GLIBC_2.17+0x314>
0.00 : 6648c: mov x0, x4
0.00 : 66490: bl 20e40 <gnu_get_libc_version@@GLIBC_2.17+0x130>
0.00 : 66494: str xzr, [x0, #8]
0.00 : 66498: ldxr w2, [x0]
0.00 : 6649c: stlxr w3, w1, [x0]
0.00 : 664a0: cbnz w3, 66498 <__getdelim@@GLIBC_2.17+0x318>
0.00 : 664a4: cmp w2, #0x1
0.00 : 664a8: b.le 6648c <__getdelim@@GLIBC_2.17+0x30c>
0.00 : 664ac: mov x1, #0x81 // #129
0.00 : 664b0: mov x2, #0x1 // #1
0.00 : 664b4: mov x3, #0x0 // #0
0.00 : 664b8: mov x8, #0x62 // #98
0.00 : 664bc: svc #0x0
0.00 : 664c0: b 6648c <__getdelim@@GLIBC_2.17+0x30c>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210615091704.259202-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To make the output more readable, I think it's better to remove 0's in
the output. Also the dummy event has no event stats so it just wasts
the space. Let's use the --skip-empty option to suppress it.
$ perf report --stat --skip-empty
Aggregated stats:
TOTAL events: 16530
MMAP events: 226
COMM events: 1596
EXIT events: 2
THROTTLE events: 121
UNTHROTTLE events: 117
FORK events: 1595
SAMPLE events: 719
MMAP2 events: 12147
CGROUP events: 2
FINISHED_ROUND events: 2
THREAD_MAP events: 1
CPU_MAP events: 1
TIME_CONV events: 1
cycles stats:
SAMPLE events: 719
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Each struct hists have events_stats but most of the fields were not
used. It's to count number of samples and periods whether filtered or
not. And other fields are used only by evlist.
So it'd be better to split hists_stats and events_stats to reduce
wasted memory in the struct hists. This makes the output of event
statistics in the perf report compact by skipping 0 events in each
evsel/hists.
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In hist__find_annotations(), since different 'struct hist_entry' entries
may point to same symbol, we free notes->src to signal already processed
this symbol in stdio mode; when annotate, entry will skipped if
notes->src is NULL to avoid repeated output.
However, there is a problem, for example, run the following command:
# perf record -e branch-misses -e branch-instructions -a sleep 1
perf.data file contains different types of sample event.
If the same IP sample event exists in branch-misses and branch-instructions,
this event uses the same symbol. When annotate branch-misses events, notes->src
corresponding to this event is set to null, as a result, when annotate
branch-instructions events, this event is skipped and no annotate is output.
Solution of this patch is to remove zfree in hists__find_annotations and
change sort order to "dso,symbol" to avoid duplicate output when different
processes correspond to the same symbol.
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: zhangjinhao2@huawei.com
Link: http://lore.kernel.org/lkml/20210319123527.173883-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf annotate' supports --symbol but it's impossible to filter a C++
symbol. With --no-demangle one can filter easily by mangled function
name.
Signed-off-by: Martin Liška <mliska@suse.cz>
Link: http://lore.kernel.org/lkml/c3c7e959-9f7f-18e2-e795-f604275cbac3@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fix ~124 single-word typos and a few spelling errors in the perf tooling code,
accumulated over the years.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210321113734.GA248990@gmail.com
Link: http://lore.kernel.org/lkml/20210323160915.GA61903@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
As it is a 'struct evsel' method, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
evsel__*()
As those is a 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
As those are 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
As they are not 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For all the perf-config options that can also be set from command line
option, the preference is given to command line version in case of any
conflict. But that's opposite in case of perf annotate. i.e. the more
preference is given to default option rather than command line option.
Fix it.
Before:
$ ./perf config
annotate.show_nr_samples=false
$ ./perf annotate shash --show-nr-samples
Percent│
│24: mov -0xc(%rbp),%eax
49.19 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%rax
After:
Samples│
│24: mov -0xc(%rbp),%eax
1 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%rax
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Link: http://lore.kernel.org/lkml/20200213064306.160480-7-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf default config set by user in [annotate] section is totally ignored
by annotate code. Fix it.
Before:
$ ./perf config
annotate.hide_src_code=true
annotate.show_nr_jumps=true
annotate.show_nr_samples=true
$ ./perf annotate shash
│ unsigned h = 0;
│ movl $0x0,-0xc(%rbp)
│ while (*s)
│ ↓ jmp 44
│ h = 65599 * h + *s++;
11.33 │24: mov -0xc(%rbp),%eax
43.50 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%rax
After:
│ movl $0x0,-0xc(%rbp)
│ ↓ jmp 44
1 │1 24: mov -0xc(%rbp),%eax
4 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%rax
Note that we have removed show_nr_samples and show_total_period from
annotation_options because they are not used. Instead of them we use
symbol_conf.show_nr_samples and symbol_conf.show_total_period.
Committer testing:
Using 'perf annotate --stdio2' to use the TUI rendering but emitting the output to stdio:
# perf config
#
# perf config annotate.hide_src_code=true
# perf config
annotate.hide_src_code=true
#
# perf config annotate.show_nr_jumps=true
# perf config annotate.show_nr_samples=true
# perf config
annotate.hide_src_code=true
annotate.show_nr_jumps=true
annotate.show_nr_samples=true
#
#
Before:
# perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Percent
00000000000609f0 <ObjectInstance::weak_pointer_was_finalized()@@Base>:
endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
100.00 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1b: xor %eax,%eax
pop %rbp
← retq
nop
20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
cmpq $0x0,0x18(%rbp)
↑ jne 1b
mov %rbp,%rdi
→ callq ObjectBase::jsobj_addr() const@plt
mov $0x1,%eax
pop %rbp
← retq
#
After:
# perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized 2> /dev/null
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Samples endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
1 1 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1 1b: xor %eax,%eax
pop %rbp
← retq
nop
1 20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
cmpq $0x0,0x18(%rbp)
↑ jne 1b
mov %rbp,%rdi
→ callq ObjectBase::jsobj_addr() const@plt
mov $0x1,%eax
pop %rbp
← retq
#
# perf config annotate.show_nr_jumps
annotate.show_nr_jumps=true
# perf config annotate.show_nr_jumps=false
# perf config annotate.show_nr_jumps
annotate.show_nr_jumps=false
#
# perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized 2> /dev/null
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Samples endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
1 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1b: xor %eax,%eax
pop %rbp
← retq
nop
20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
cmpq $0x0,0x18(%rbp)
↑ jne 1b
mov %rbp,%rdi
→ callq ObjectBase::jsobj_addr() const@plt
mov $0x1,%eax
pop %rbp
← retq
#
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Link: http://lore.kernel.org/lkml/20200213064306.160480-6-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The objdump utility has useful --prefix / --prefix-strip options to
allow changing source code file names hardcoded into executables' debug
info. Add options to 'perf report', 'perf top' and 'perf annotate',
which are then passed to objdump.
$ mkdir foo
$ echo 'main() { for (;;); }' > foo/foo.c
$ gcc -g foo/foo.c
foo/foo.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
1 | main() { for (;;); }
| ^~~~
$ perf record ./a.out
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.230 MB perf.data (5721 samples) ]
$ mv foo bar
$ perf annotate
<does not show source code>
$ perf annotate --prefix=/home/ak/lsrc/git/bar --prefix-strip=5
<does show source code>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
LPU-Reference: 20200107210444.214071-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that we pass that substructure around and with it consolidate lots of
functions that receive a (map, symbol) pair and now can receive just a
'struct map_symbol' pointer.
This further paves the way to add 'struct map_groups' to 'struct
map_symbol' so that we can have all we need for annotation so that we
can ditch 'struct map'->groups, i.e. have the map_groups pointer in a
more central place, avoiding the pointer in the 'struct map' that have
tons of instances.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-fs90ttd9q12l7989fo7pw81q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'symbol' pointers
We are already passing things like:
symbol__annotate(ms->sym, ms->map, ...)
So shorten the signature of such functions to receive the 'map_symbol'
pointer.
This also paves the way to having the 'struct map_groups' pointer in the
'struct map_symbol' so that we can get rid of 'struct map'->groups.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-23yx8v1t41nzpkpi7rdrozww@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We can get the per sample cycles by hist__account_cycles(). It's also
useful to know the total cycles of all samples in order to get the
cycles coverage for a single program block in further. For example:
coverage = per block sampled cycles / total sampled cycles
This patch creates a new argument 'total_cycles' in hist__account_cycles(),
which will be added with the cycles of each sample.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch is to return error code of perf_new_session function on
failure instead of NULL.
Test Results:
Before Fix:
$ perf c2c report -input
failed to open nput: No such file or directory
$ echo $?
0
$
After Fix:
$ perf c2c report -input
failed to open nput: No such file or directory
$ echo $?
254
$
Committer notes:
Fix 'perf tests topology' case, where we use that TEST_ASSERT_VAL(...,
session), i.e. we need to pass zero in case of failure, which was the
case before when NULL was returned by perf_session__new() for failure,
but now we need to negate the result of IS_ERR(session) to respect that
TEST_ASSERT_VAL) expectation of zero meaning failure.
Reported-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Signed-off-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shawn Landden <shawn@git.icu>
Cc: Song Liu <songliubraving@fb.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Link: http://lore.kernel.org/lkml/20190822071223.17892.45782.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We use what is defined there, were getting it by luck, indirectly, fix
it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-56g4jshmktniundmiw7h845k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The mem_info struct goes to mem-events.h and branch_info goes to
branch.h, where they belong, this way we can remove several headers from
symbols.h and trim the include dependency tree more.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-aupw71xnravcsu2xoabfmhpc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now that thread.h isn't included by any other header, we can check where
it is really needed, i.e. we can remove it and be sure that it isn't
being obtained indirectly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-kh333ivjbw05wsggckpziu86@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that we can reduce the header dependency tree further, in the process
noticed that lots of places were getting even things like build-id
routines and 'struct perf_tool' definition indirectly, so fix all those
too.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ti0btma9ow5ndrytyoqdk62j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.
Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Eroding a bit more the tools/perf/util/util.h hodpodge header.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The hist__account_cycles() function is executed when the
hist_iter__branch_callback() is called.
But it looks it's not necessary. In hist__account_cycles, it already
walks on all branch entries.
This patch moves the hist__account_cycles out of callback, now the data
processing is much faster than before.
Previous code has an issue that the ch[offset].num++ (in
__symbol__account_cycles) is executed repeatedly since
hist__account_cycles is called in each hist_iter__branch_callback, so
the counting of ch[offset].num is not correct (too big).
With this patch, the issue is fixed. And we don't need the code of
"ch->reset >= ch->num / 2" to check if there are too many overlaps (in
annotation__count_and_fill), otherwise some data would be hidden.
Now, we can try, for example:
perf record -b ...
perf annotate or perf report -s symbol
The before/after output should be no change.
v3:
---
Fix the crash in stdio mode.
Like previous code, it needs the checking of ui__has_annotation()
before hist__account_cycles()
v2:
---
1. Cover the similar perf report
2. Remove the checking code "ch->reset >= ch->num / 2"
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1552684577-29041-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a 'path' member to 'struct perf_data'. It will keep the configured
path for the data (const char *). The path in struct perf_data_file is
now dynamically allocated (duped) from it.
This scheme is useful/used in following patches where struct
perf_data::path holds the 'configure' directory path and struct
perf_data_file::path holds the allocated path for specific files.
Also it actually makes the code little simpler.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
[ Fixup data-convert-bt.c missing conversion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Lots of places get the map.h file indirectly, and since we're going to
remove it from machine.h, then those need to include it directly, do it
now, before we remove that dep.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ob8jehdjda8h5jsrv9dqj9tf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something heavily required for histograms. Specifically, the following
are converted to rb_root_cached, and users accordingly:
hist::entries_in_array
hist::entries_in
hist::entries
hist::entries_collapsed
hist_entry::hroot_in
hist_entry::hroot_out
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-7-dave@stgolabs.net
[ Added some missing conversions to rb_first_cached() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node).
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-6-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now that we keep a perf_tool pointer inside perf_session, there's no
need to have a perf_tool argument in the event_op2 callback. Remove it.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180913125450.21342-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add --percent-type option to set annotation percent type from following
choices:
global-period, local-period, global-hits, local-hits
Examples:
$ perf annotate --percent-type period-local --stdio | head -1
Percent | Source code ... es, percent: local period)
$ perf annotate --percent-type hits-local --stdio | head -1
Percent | Source code ... es, percent: local hits)
$ perf annotate --percent-type hits-global --stdio | head -1
Percent | Source code ... es, percent: global hits)
$ perf annotate --percent-type period-global --stdio | head -1
Percent | Source code ... es, percent: global period)
The local/global keywords set if the percentage is computed in the scope
of the function (local) or the whole data (global).
The period/hits keywords set the base the percentage is computed on -
the samples period or the number of samples (hits).
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20180804130521.11408-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_event__process_feature() accesses feat_ops[HEADER_LAST_FEATURE]
which is not defined and thus perf is crashing. HEADER_LAST_FEATURE is
used as an end marker for the perf report but it's unused for perf
script/annotate. Ignore HEADER_LAST_FEATURE for perf script/annotate,
just like it is done in 'perf report'.
Before:
# perf record -o - ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
#
After:
# perf record -o - ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
ls 7031 4392.099856: 250000 cpu-clock:uhH: 7f5e0ce7cd60
ls 7031 4392.100355: 250000 cpu-clock:uhH: 7f5e0c706ef7
#
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 57b5de463925 ("perf report: Support forced leader feature in pipe mode")
Link: http://lkml.kernel.org/r/20180625124220.6434-4-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
One more step in grouping annotation options.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-sogzdhugoavm6fyw60jnb0vs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that things changed in the command line may percolate to the browser
code without using globals.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-5daawc40zhl6gcs600com1ua@git.kernel.org
[ Merged fix for NO_SLANG=1 build provided by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Continuing to group annotation specific stuff into a struct.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-p3cdhltj58jt0byjzg3g7obx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Continuing to group annotation options in an annotation specific struct.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-astei92tzxp4yccag5pxb2h7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Accross all the routines, this way we can have eventually have a
consistent set of defaults for all UIs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6qgtixurjgdk5u0n3rw78ges@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The code gets shorter and we'll be able to use evsel->evlist in a
followup patch.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-t0s7vy19wq5kak74kavm8swf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
With the '--group' option, even for non-explicit group, 'perf annotate'
will enable the group output.
For example,
$ perf record -e cycles,branches ./div
$ perf annotate main --stdio --group
: Disassembly of section .text:
:
: 00000000004004b0 <main>:
: main():
:
: return i;
: }
:
: int main(void)
: {
0.00 0.00 : 4004b0: push %rbx
: int i;
: int flag;
: volatile double x = 1212121212, y = 121212;
:
: s_randseed = time(0);
0.00 0.00 : 4004b1: xor %edi,%edi
: srand(s_randseed);
0.00 0.00 : 4004b3: mov $0x77359400,%ebx
:
: return i;
: }
:
But if without --group, there is only one event reported.
$ perf annotate main --stdio
: Disassembly of section .text:
:
: 00000000004004b0 <main>:
: main():
:
: return i;
: }
:
: int main(void)
: {
0.00 : 4004b0: push %rbx
: int i;
: int flag;
: volatile double x = 1212121212, y = 121212;
:
: s_randseed = time(0);
0.00 : 4004b1: xor %edi,%edi
: srand(s_randseed);
0.00 : 4004b3: mov $0x77359400,%ebx
:
: return i;
: }
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526914666-31839-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove the split of symbol tables for data (MAP__VARIABLE) and for
functions (MAP__FUNCTION), its unneeded and there were various places
doing two lookups to find a symbol, so simplify this.
We still will consider only the symbols that matched the filters in
place, i.e. see the (elf_(sec,sym)|symbol_type)__filter() routines in
the patch, just so that we consider only the same symbols as before,
to reduce the possibility of regressions.
All the tests on 50-something build environments, in varios versions
of lots of distros and cross build environments were performed without
build regressions, as usual with all pull requests the other tests were
also performed: 'perf test' and 'make -C tools/perf build-test'.
Also this was done at a great granularity so that regressions can be
bisected more easily.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-hiq0fy2rsleupnqqwuojo1ne@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is already present in 'perf top', albeit undocumented (will fix),
and is useful to use /proc/kcore instead of vmlinux and then get what is
really in place, not what the kernel starts with, before alternatives,
ftrace .text patching, etc, see the differences:
# perf annotate --stdio2 _raw_spin_lock_irqsave
_raw_spin_lock_irqsave() /lib/modules/4.16.0-rc4/build/vmlinux
Event: anon group { cycles, instructions }
0.00 3.17 → callq __fentry__
0.00 7.94 push %rbx
7.69 36.51 → callq __page_file_index
mov %rax,%rbx
7.69 3.17 → callq *ffffffff82225cd0
xor %eax,%eax
mov $0x1,%edx
80.77 49.21 lock cmpxchg %edx,(%rdi)
test %eax,%eax
↓ jne 2b
3.85 0.00 mov %rbx,%rax
pop %rbx
← retq
2b: mov %eax,%esi
→ callq queued_spin_lock_slowpath
mov %rbx,%rax
pop %rbx
← retq
[root@jouet ~]# perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
_raw_spin_lock_irqsave() /proc/kcore
Event: anon group { cycles, instructions }
0.00 3.17 nop
0.00 7.94 push %rbx
0.00 23.81 pushfq
7.69 12.70 pop %rax
nop
mov %rax,%rbx
7.69 3.17 cli
nop
xor %eax,%eax
mov $0x1,%edx
80.77 49.21 lock cmpxchg %edx,(%rdi)
test %eax,%eax
↓ jne 2b
3.85 0.00 mov %rbx,%rax
pop %rbx
← retq
2b: mov %eax,%esi
→ callq *ffffffff820e96b0
mov %rbx,%rax
pop %rbx
← retq
#
Diff of the output of those commands:
# perf annotate --stdio2 _raw_spin_lock_irqsave > /tmp/vmlinux
# perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave > /tmp/kcore
# diff -y /tmp/vmlinux /tmp/kcore
_raw_spin_lock_irqsave() vmlinux | _raw_spin_lock_irqsave() /proc/kcore
Event: anon group { cycles, instructions } Event: anon group { cycles, instructions }
0.00 3.17 → callq __fentry__ | 0.00 3.17 nop
0.00 7.94 push %rbx 0.00 7.94 push %rbx
7.69 36.51 → callq __page_file_index | 0.00 23.81 pushfq
> 7.69 12.70 pop %rax
> nop
mov %rax,%rbx mov %rax,%rbx
7.69 3.17 → callq *ffffffff82225cd0 | 7.69 3.17 cli
> nop
xor %eax,%eax xor %eax,%eax
mov $0x1,%edx mov $0x1,%edx
80.77 49.21 lock cmpxchg %edx,(%rdi) 80.77 49.21 lock cmpxchg %edx,(%rdi)
test %eax,%eax test %eax,%eax
↓ jne 2b ↓ jne 2b
3.85 0.00 mov %rbx,%rax 3.85 0.00 mov %rbx,%rax
pop %rbx pop %rbx
← retq ← retq
2b: mov %eax,%esi 2b: mov %eax,%esi
→ callq queued_spin_lock_slowpath| → callq *ffffffff820e96b0
mov %rbx,%rax mov %rbx,%rax
pop %rbx pop %rbx
← retq ← retq
#
This should be further streamlined by doing both annotations and
allowing the TUI to toggle initial/current, and show the patched
instructions in a slightly different color.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-wz8d269hxkcwaczr0r4rhyjg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
One more thing that goes from the TUI code to be used more widely,
for instance it'll affect the default options used by:
perf annotate --stdio2
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-0nsz0dm0akdbo30vgja2a10e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This uses the TUI augmented formatting routines, modulo interactivity.
# perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
_raw_spin_lock_irqsave() /proc/kcore
Event: cycles:ppp
Percent
Disassembly of section load0:
ffffffff9a8734b0 <load0>:
nop
push %rbx
50.00 pushfq
pop %rax
nop
mov %rax,%rbx
cli
nop
xor %eax,%eax
mov $0x1,%edx
50.00 lock cmpxchg %edx,(%rdi)
test %eax,%eax
↓ jne 2b
mov %rbx,%rax
pop %rbx
← retq
2b: mov %eax,%esi
→ callq queued_spin_lock_slowpath
mov %rbx,%rax
pop %rbx
← retq
Tested-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6cte5o8z84mbivbvqlg14uh1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the IPC/Cycle even if branch info is recorded in perf
data file.
perf record -b ...
perf annotate function
It should show IPC/cycle, but it doesn't.
This patch lets perf annotate support the displaying of IPC/Cycle if
branch info is in perf data.
For example,
perf annotate compute_flag
Percent│ IPC Cycle
│
│
│ Disassembly of section .text:
│
│ 0000000000400640 <compute_flag>:
│ compute_flag():
│ volatile int count;
│ static unsigned int s_randseed;
│
│ __attribute__((noinline))
│ int compute_flag()
│ {
22.96 │1.18 584 sub $0x8,%rsp
│ int i;
│
│ i = rand() % 2;
23.02 │1.18 1 → callq rand@plt
│
│ return i;
27.05 │3.37 mov %eax,%edx
│ }
│3.37 add $0x8,%rsp
│ {
│ int i;
│
│ i = rand() % 2;
│
│ return i;
│3.37 shr $0x1f,%edx
│3.37 add %edx,%eax
│3.37 and $0x1,%eax
│3.37 sub %edx,%eax
│ }
26.97 │3.37 2 ← retq
Note that, this patch only supports TUI mode. For stdio, now it just keeps
original behavior. Will support it in a follow-up patch.
$ perf annotate compute_flag --stdio
Percent | Source code & Disassembly of div for cycles:ppp (7993 samples)
------------------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 0000000000400640 <compute_flag>:
: compute_flag():
: volatile int count;
: static unsigned int s_randseed;
:
: __attribute__((noinline))
: int compute_flag()
: {
0.29 : 400640: sub $0x8,%rsp # +100.00%
: int i;
:
: i = rand() % 2;
42.93 : 400644: callq 400490 <rand@plt> # -100.00% (p:100.00%)
:
: return i;
0.10 : 400649: mov %eax,%edx # +100.00%
: }
0.94 : 40064b: add $0x8,%rsp
: {
: int i;
:
: i = rand() % 2;
:
: return i;
27.02 : 40064f: shr $0x1f,%edx
0.15 : 400652: add %edx,%eax
1.24 : 400654: and $0x1,%eax
2.08 : 400657: sub %edx,%eax
: }
25.26 : 400659: retq # -100.00% (p:100.00%)
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20180223170210.GC7045@tassilo.jf.intel.com
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1519724327-7773-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|