summaryrefslogtreecommitdiffstats
path: root/tools/perf/util/annotate.c
AgeCommit message (Collapse)AuthorFilesLines
2018-04-05perf annotate: Show group details on the title lineArnaldo Carvalho de Melo1-2/+5
To match what is shown in the main 'perf report/top' title lines, i.e. if a group is being shown, either a real group (recorded with "-e '{a,b,c}') or a forced group (using 'perf report --group' for a perf.data file recorded without {}) we will show multiple columns, one per event, but we were failing to show the group details, so, for: # perf report --header-only | grep cmdline # cmdline : /home/acme/bin/perf record -e {cycles,instructions,cache-misses} # perf report --group The first line was showing just "cycles", now it shows the correct line, which is: Samples: 578 of events 'anon group { cycles, instructions, cache-misses }', 4000 Hz, Event count (approx.): 487421794 syscall_return_via_sysret /lib/modules/4.16.0-rc7/build/vmlinux 0.22 2.97 0.00 │ ↓ jmp 6c │ mov %cr3,%rdi 1.33 10.89 4.00 │ ↓ jmp 62 │ mov %rdi,%rax <SNIP> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 6920e2854e9a ("perf annotate browser: Show extra title line with event information") Link: https://lkml.kernel.org/n/tip-i41tqh17c2dabnyzjh99r1oz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf annotate stdio2: Print more descriptive event information headerArnaldo Carvalho de Melo1-7/+3
To match the recently added event header information to --tui, e.g.: # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave Samples: 128 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 48617682 _raw_spin_lock_irqsave() /proc/kcore 0.78 nop 7.03 push %rbx 3.12 pushfq 6.25 pop %rax nop mov %rax,%rbx 3.12 cli nop xor %eax,%eax mov $0x1,%edx 79.69 lock cmpxchg %edx,(%rdi) test %eax,%eax ↓ jne 2b mov %rbx,%rax pop %rbx ← retq 2b: mov %eax,%esi → callq *ffffffffb30eaed0 mov %rbx,%rax pop %rbx ← retq # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ujy46x7cldyhyxelyf2b9quy@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf annotate: Introduce annotation__scnprintf_samples_period() methodArnaldo Carvalho de Melo1-0/+38
To print a string using the total period (nr_events) and the number of samples for a given annotation, i.e. for a given symbol, the counterpart to hists__scnprintf_samples_period(), that is for all the samples in a session (be it a live session, think 'perf top' or a perf.data file, think 'perf report'). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-goj2wu4fxutc8vd46mw3yg14@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23perf annotate: Use absolute addresses to calculate jump target offsetsArnaldo Carvalho de Melo1-3/+2
These types of jumps were confusing the annotate browser: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux Percent│ffffffff81a00020: swapgs <SNIP> │ffffffff81a00128: ↓ jae ffffffff81a00139 <syscall_return_via_sysret+0x53> <SNIP> │ffffffff81a00155: → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> I.e. the syscall_return_via_sysret function is actually "inside" the entry_SYSCALL_64 function, and the offsets in jumps like these (+0x53) are relative to syscall_return_via_sysret, not to syscall_return_via_sysret. Or this may be some artifact in how the assembler marks the start and end of a function and how this ends up in the ELF symtab for vmlinux, i.e. syscall_return_via_sysret() isn't "inside" entry_SYSCALL_64, but just right after it. From readelf -sw vmlinux: 80267: ffffffff81a00020 315 NOTYPE GLOBAL DEFAULT 1 entry_SYSCALL_64 316: ffffffff81a000e6 0 NOTYPE LOCAL DEFAULT 1 syscall_return_via_sysret 0xffffffff81a00020 + 315 > 0xffffffff81a000e6 So instead of looking for offsets after that last '+' sign, calculate offsets for jump target addresses that are inside the function being disassembled from the absolute address, 0xffffffff81a00139 in this case, subtracting from it the objdump address for the start of the function being disassembled, entry_SYSCALL_64() in this case. So, before this patch: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux Percent│ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↑ jmp 6c │ mov %cr3,%rdi │ ↑ jmp 62 │ mov %rdi,%rax │ and $0x7ff,%rdi │ bt %rdi,%gs:0x2219a │ ↑ jae 53 │ btr %rdi,%gs:0x2219a │ mov %rax,%rdi │ ↑ jmp 5b After: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux 0.65 │ → jne swapgs_restore_regs_and_return_to_usermode │ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↓ jmp 132 │ mov %cr3,%rdi │ ┌──jmp 128 │ │ mov %rdi,%rax │ │ and $0x7ff,%rdi │ │ bt %rdi,%gs:0x2219a │ │↓ jae 119 │ │ btr %rdi,%gs:0x2219a │ │ mov %rax,%rdi │ │↓ jmp 121 │119:│ mov %rax,%rdi │ │ bts $0x3f,%rdi │121:│ or $0x800,%rdi │128:└─→or $0x1000,%rdi │ mov %rdi,%cr3 │132: pop %rax │ pop %rdi │ pop %rsp │ → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> With those at least navigating to the right destination, an improvement for these cases seems to be to be to somehow mark those inner functions, which in this case could be: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux │syscall_return_via_sysret: │ pop %r15 │ pop %r14 │ pop %r13 │ pop %r12 │ pop %rbp │ pop %rbx │ pop %rsi │ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↓ jmp 132 │ mov %cr3,%rdi │ ┌──jmp 128 │ │ mov %rdi,%rax │ │ and $0x7ff,%rdi │ │ bt %rdi,%gs:0x2219a │ │↓ jae 119 │ │ btr %rdi,%gs:0x2219a │ │ mov %rax,%rdi │ │↓ jmp 121 │119:│ mov %rax,%rdi │ │ bts $0x3f,%rdi │121:│ or $0x800,%rdi │128:└─→or $0x1000,%rdi │ mov %rdi,%cr3 │132: pop %rax │ pop %rdi │ pop %rsp │ → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> This all gets much better viewed if one uses 'perf report --ignore-vmlinux' forcing the usage of /proc/kcore + /proc/kallsyms, when the above actually gets down to: # perf report --ignore-vmlinux ## do '/64', will show the function names containing '64', ## navigate to /entry_SYSCALL_64_after_hwframe.annotation, ## press 'A' to annotate, then 'P' to print that annotation ## to a file ## From another xterm (or see on screen, this 'P' thing is for ## getting rid of those right side scroll bars/spaces): # cat /entry_SYSCALL_64_after_hwframe.annotation entry_SYSCALL_64_after_hwframe() /proc/kcore Event: cycles:ppp Percent Disassembly of section load0: ffffffff9aa00044 <load0>: 11.97 push %rax 4.85 push %rdi push %rsi 2.59 push %rdx 2.27 push %rcx 0.32 pushq $0xffffffffffffffda 1.29 push %r8 xor %r8d,%r8d 1.62 push %r9 0.65 xor %r9d,%r9d 1.62 push %r10 xor %r10d,%r10d 5.50 push %r11 xor %r11d,%r11d 3.56 push %rbx xor %ebx,%ebx 4.21 push %rbp xor %ebp,%ebp 2.59 push %r12 0.97 xor %r12d,%r12d 3.24 push %r13 xor %r13d,%r13d 2.27 push %r14 xor %r14d,%r14d 4.21 push %r15 xor %r15d,%r15d 0.97 mov %rsp,%rdi 5.50 → callq do_syscall_64 14.56 mov 0x58(%rsp),%rcx 7.44 mov 0x80(%rsp),%r11 0.32 cmp %rcx,%r11 → jne swapgs_restore_regs_and_return_to_usermode 0.32 shl $0x10,%rcx 0.32 sar $0x10,%rcx 3.24 cmp %rcx,%r11 → jne swapgs_restore_regs_and_return_to_usermode 2.27 cmpq $0x33,0x88(%rsp) 1.29 → jne swapgs_restore_regs_and_return_to_usermode mov 0x30(%rsp),%r11 8.74 cmp %r11,0x90(%rsp) → jne swapgs_restore_regs_and_return_to_usermode 0.32 test $0x10100,%r11 → jne swapgs_restore_regs_and_return_to_usermode 0.32 cmpq $0x2b,0xa0(%rsp) 0.65 → jne swapgs_restore_regs_and_return_to_usermode I.e. using kallsyms makes the function start/end be done differently than using what is in the vmlinux ELF symtab and actually the hits goes to entry_SYSCALL_64_after_hwframe, which is a GLOBAL() after the start of entry_SYSCALL_64: ENTRY(entry_SYSCALL_64) UNWIND_HINT_EMPTY <SNIP> pushq $__USER_CS /* pt_regs->cs */ pushq %rcx /* pt_regs->ip */ GLOBAL(entry_SYSCALL_64_after_hwframe) pushq %rax /* pt_regs->orig_ax */ PUSH_AND_CLEAR_REGS rax=$-ENOSYS And it goes and ends at: cmpq $__USER_DS, SS(%rsp) /* SS must match SYSRET */ jne swapgs_restore_regs_and_return_to_usermode /* * We win! This label is here just for ease of understanding * perf profiles. Nothing jumps here. */ syscall_return_via_sysret: /* rcx and r11 are already restored (see code above) */ UNWIND_HINT_EMPTY POP_REGS pop_rdi=0 skip_r11rcx=1 So perhaps some people should really just play with '--ignore-vmlinux' to force /proc/kcore + kallsyms. One idea is to do both, i.e. have a vmlinux annotation and a kcore+kallsyms one, when possible, and even show the patched location, etc. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-r11knxv8voesav31xokjiuo6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23perf annotate: Defer searching for comma in raw line till it is neededArnaldo Carvalho de Melo1-1/+2
That strchr() in jump__scnprintf() needs to be nuked somehow, as it, IIRC is already done in jump__parse() and if needed at scnprintf() time, should be stashed in the struct filled in parse() time. For now jus defer it to just before where it is used. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-j0t5hagnphoz9xw07bh3ha3g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23perf annotate: Support jumping from one function to anotherArnaldo Carvalho de Melo1-2/+4
For instance: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux 5.50 │ → callq do_syscall_64 14.56 │ mov 0x58(%rsp),%rcx 7.44 │ mov 0x80(%rsp),%r11 0.32 │ cmp %rcx,%r11 │ → jne swapgs_restore_regs_and_return_to_usermode 0.32 │ shl $0x10,%rcx 0.32 │ sar $0x10,%rcx 3.24 │ cmp %rcx,%r11 │ → jne swapgs_restore_regs_and_return_to_usermode 2.27 │ cmpq $0x33,0x88(%rsp) 1.29 │ → jne swapgs_restore_regs_and_return_to_usermode │ mov 0x30(%rsp),%r11 8.74 │ cmp %r11,0x90(%rsp) │ → jne swapgs_restore_regs_and_return_to_usermode 0.32 │ test $0x10100,%r11 │ → jne swapgs_restore_regs_and_return_to_usermode 0.32 │ cmpq $0x2b,0xa0(%rsp) 0.65 │ → jne swapgs_restore_regs_and_return_to_usermode It'll behave just like a "call" instruction, i.e. press enter or right arrow over one such line and the browser will navigate to the annotated disassembly of that function, which when exited, via left arrow or esc, will come back to the calling function. Now to support jump to an offset on a different function... Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-78o508mqvr8inhj63ddtw7mo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23perf annotate: Add "_local" to jump/offset validation routinesArnaldo Carvalho de Melo1-5/+4
Because they all really check if we can access data structures/visual constructs where a "jump" instruction targets code in the same function, i.e. things like: __pthread_mutex_lock /usr/lib64/libpthread-2.26.so 1.95 │ mov __pthread_force_elision,%ecx │ ┌──test %ecx,%ecx 0.07 │ ├──je 60 │ │ test $0x300,%esi │ │↓ jne 60 │ │ or $0x100,%esi │ │ mov %esi,0x10(%rdi) │ 42:│ mov %esi,%edx │ │ lea 0x16(%r8),%rsi │ │ mov %r8,%rdi │ │ and $0x80,%edx │ │ add $0x8,%rsp │ │→ jmpq __lll_lock_elision │ │ nop 0.29 │ 60:└─→and $0x80,%esi 0.07 │ mov $0x1,%edi 0.29 │ xor %eax,%eax 2.53 │ lock cmpxchg %edi,(%r8) And not things like that "jmpq __lll_lock_elision", that instead should behave like a "call" instruction and "jump" to the disassembly of "___lll_lock_elision". Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-3cwx39u3h66dfw9xjrlt7ca2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-21perf annotate: Mark jumps to outher functions with the call arrowArnaldo Carvalho de Melo1-3/+52
Things like this in _cpp_lex_token (gcc's cc1 program): cpp_named_operator2name@@Base+0xa72 Point to a place that is after the cpp_named_operator2name boundaries, i.e. in the ELF symbol table for cc1 cpp_named_operator2name is marked as being 32-bytes long, but it in fact is much larger than that, so we seem to need a symbols__find() routine that looks for >= current->start and < next_symbol->start, possibly just for C++ objects? For now lets just make some progress by marking jumps to outside the current function as call like. Actual navigation will come next, with further understanding of how the symbol searching and disassembly should be done. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-aiys0a0bsgm3e00hbi6fg7yy@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-21perf annotate: Pass function descriptor to its instruction parsing routinesArnaldo Carvalho de Melo1-13/+17
We need that to figure out if jumps have targets in a different function. E.g. _cpp_lex_token(), in /usr/libexec/gcc/x86_64-redhat-linux/5.3.1/cc1 has a line like this: jne c469be <cpp_named_operator2name@@Base+0xa72> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ris0ioziyp469pofpzix2atb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-21perf annotate: No need to calculate notes->start twiceArnaldo Carvalho de Melo1-5/+4
Since we already set notes->start to map__rip_2objdump(map, sym->start) in symbol__annotate2(), no need to calculate that address again in symbol__calc_lines(), just use notes->start. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ycxlg8mm5ueuj21w6gi62l7g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-21perf annotate browser: Add 'P' hotkey to dump annotation to fileArnaldo Carvalho de Melo1-0/+31
Just like we have in the histograms browser used as the main screen for 'perf top --tui' and 'perf report --tui', to print the current annotation to a file with a named composed by the symbol name and the ".annotation" suffix. Here is one example of pressing 'A' on 'perf top' to live annotate a kernel function and then press 'P' to dump that annotation, the resulting file: # cat _raw_spin_lock_irqsave.annotation _raw_spin_lock_irqsave() /proc/kcore Event: cycles:ppp 7.14 nop 21.43 push %rbx 7.14 pushfq pop %rax nop mov %rax,%rbx cli nop xor %eax,%eax mov $0x1,%edx 64.29 lock cmpxchg %edx,(%rdi) test %eax,%eax ↓ jne 2b mov %rbx,%rax pop %rbx ← retq 2b: mov %eax,%esi → callq queued_spin_lock_slowpath mov %rbx,%rax pop %rbx ← retq # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-zzmnrwugb5vtk7bvg0rbx150@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-21perf annotate: Add function header to --stdio2Arnaldo Carvalho de Melo1-0/+8
# perf annotate --stdio2 _raw_spin_lock_irqsave _raw_spin_lock_irqsave() /lib/modules/4.16.0-rc4/build/vmlinux Event: anon group { cycles, instructions } 0.00 3.17 → callq __fentry__ 0.00 7.94 push %rbx 7.69 36.51 → callq __page_file_index mov %rax,%rbx 7.69 3.17 → callq *ffffffff82225cd0 xor %eax,%eax mov $0x1,%edx 80.77 49.21 lock cmpxchg %edx,(%rdi) test %eax,%eax ↓ jne 2b 3.85 0.00 mov %rbx,%rax pop %rbx ← retq 2b: mov %eax,%esi → callq queued_spin_lock_slowpath mov %rbx,%rax pop %rbx ← retq # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-i86yfyzl8m194ioxgj1jo32f@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-21perf annotate: Use the default annotation options for --stdio2Arnaldo Carvalho de Melo1-3/+1
With an empty '[annotate]' section in ~/.perfconfig: # perf record -a --all-kernel -e '{cycles,instructions}:P' sleep 5 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.243 MB perf.data (5513 samples) ] # perf annotate --stdio2 _raw_spin_lock | head -20 Disassembly of section .text: ffffffff81868790 <_raw_spin_lock>: _raw_spin_lock(): EXPORT_SYMBOL(_raw_spin_trylock_bh); #endif #ifndef CONFIG_INLINE_SPIN_LOCK void __lockfunc _raw_spin_lock(raw_spinlock_t *lock) { → callq __fentry__ atomic_cmpxchg(): return xadd(&v->counter, -i); } static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new) { # perf annotate --stdio2 _raw_spin_lock | head -20 → callq __fentry__ xor %eax,%eax mov $0x1,%edx 87.50 100.00 lock cmpxchg %edx,(%rdi) 6.25 0.00 test %eax,%eax ↓ jne 16 6.25 0.00 repz retq 16: mov %eax,%esi ↑ jmpq ffffffff810e96b0 <queued_spin_lock_slowpath> # # cat ~/.perfconfig [annotate] hide_src_code = false show_linenr = true # perf annotate --stdio2 _raw_spin_lock | head -20 3 Disassembly of section .text: 5 ffffffff81868790 <_raw_spin_lock>: 6 _raw_spin_lock(): 143 EXPORT_SYMBOL(_raw_spin_trylock_bh); 144 #endif 146 #ifndef CONFIG_INLINE_SPIN_LOCK 147 void __lockfunc _raw_spin_lock(raw_spinlock_t *lock) 148 { → callq __fentry__ 150 atomic_cmpxchg(): 187 return xadd(&v->counter, -i); 188 } 190 static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new) 191 { # # cat ~/.perfconfig [annotate] hide_src_code = true show_total_period = true # perf annotate --stdio2 _raw_spin_lock | head -20 → callq __fentry__ xor %eax,%eax mov $0x1,%edx 1411316 152339 lock cmpxchg %edx,(%rdi) 344694 0 test %eax,%eax ↓ jne 16 80806 0 repz retq 16: mov %eax,%esi ↑ jmpq ffffffff810e96b0 <queued_spin_lock_slowpath> # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-nu4rxg5zkdtgs1b2gc40p7v7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-21perf annotate: Move the default annotate options to the libraryArnaldo Carvalho de Melo1-0/+62
One more thing that goes from the TUI code to be used more widely, for instance it'll affect the default options used by: perf annotate --stdio2 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-0nsz0dm0akdbo30vgja2a10e@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-21perf annotate: Introduce the --stdio2 output modeArnaldo Carvalho de Melo1-0/+92
This uses the TUI augmented formatting routines, modulo interactivity. # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave _raw_spin_lock_irqsave() /proc/kcore Event: cycles:ppp Percent Disassembly of section load0: ffffffff9a8734b0 <load0>: nop push %rbx 50.00 pushfq pop %rax nop mov %rax,%rbx cli nop xor %eax,%eax mov $0x1,%edx 50.00 lock cmpxchg %edx,(%rdi) test %eax,%eax ↓ jne 2b mov %rbx,%rax pop %rbx ← retq 2b: mov %eax,%esi → callq queued_spin_lock_slowpath mov %rbx,%rax pop %rbx ← retq Tested-by: Jin Yao <yao.jin@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-6cte5o8z84mbivbvqlg14uh1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-20perf annotate: Use a ops table for annotation_line__write()Arnaldo Carvalho de Melo1-21/+23
To simplify the passing of arguments, the --stdio2 code will have to set all the fields with operations printing to stdout. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-pcs3c7vdy9ucygxflo4nl1o7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-20perf annotate: Finish the generalization of annotate_browser__write()Arnaldo Carvalho de Melo1-9/+104
We pass some more callbacks and all of annotate_browser__write() seems to be free of TUI code (except for some arrow constants, will fix). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-5uo6yvwnxtsbe8y6v0ysaakf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-20perf annotate: Introduce annotation_line__print_start() out of TUI codeArnaldo Carvalho de Melo1-0/+75
For the --tui and --stdio2 cases using callbacks for print() and set_percent_color() end up being the easiest path, real GUI remains as an exercise. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-1o7az1ng55g2g6ppr2jpeuct@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-20perf annotate: Introduce annotation_line__max_percent()Arnaldo Carvalho de Melo1-0/+14
Out of the annotate_browser__write() routine, to be used in the --stdio2 mode. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-0he0wyy4haswqi1qb35x37do@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-20perf annotate: Introduce symbol__annotate2 methodArnaldo Carvalho de Melo1-0/+39
That does all the extended boilerplate the TUI browser did, leaving the symbol__annotate() function to be used by the old --stdio output mode. Now the upcoming --stdio2 output mode should just use this one to set things up. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-e2x8wuf6gvdhzdryo229vj4i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-20perf annotate: Introduce init_column_widths() method out of TUI codeArnaldo Carvalho de Melo1-0/+17
More non-TUI stuff goes to the UI-agnostic library Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-hngv7rpqvtta69ouj7ne770q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-20perf annotate: Move update_column_widths() to the generic libArnaldo Carvalho de Melo1-0/+13
Previous patch left it where it was to ease review, move it to its right place. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ikdjr014p7k5kachgyjrgiey@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-20perf annotate: Introduce set_offsets() method out of TUI codeArnaldo Carvalho de Melo1-0/+28
More non-strictly TUI code being moved to the UI neutral annotation library, to be used in the upcoming --stdio2 output mode. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ek20dnd8z2y5v54pcepihybz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-20perf annotate: Move mark_jump_targets from the TUI to the annotation libraryArnaldo Carvalho de Melo1-0/+44
This also is not TUI specific, should be used in the upcoming --stdio2 mode. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-v827xec8z3hxrmgp7bwa6ohs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-20perf annotate: Move compute_ipc() to annotation libraryArnaldo Carvalho de Melo1-0/+60
Out of the TUI code, as it has nothing specific to that UI and should be used in the other output modes as well. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-0jahghvqdodb8vu2591pkv3d@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-19perf annotate: Use ops->target.name when available for unresolved call targetsArnaldo Carvalho de Melo1-0/+3
There is a bug where when using 'perf annotate timerqueue_add' the target for its only routine called with the 'callq' instruction, 'rb_insert_color', doesn't get resolved from its address when parsing that 'callq' instruction. That symbol resolution works when using 'perf report --tui' and then doing annotation for 'timerqueue_add' from there, the vmlinux dso->symbols rb_tree somehow gets in a state that we can't find that address, that is a bug that has to be further investigated. But since the objdump output has the function name, i.e. the raw objdump disassembled line looks like: So, before: # perf annotate timerqueue_add │ mov %rbx,%rdi │ mov %rbx,(%rdx) │ → callq *ffffffff8184dc80 │ mov 0x8(%rbp),%rdx │ test %rdx,%rdx │ ↓ je 67 # perf report │ mov %rbx,%rdi │ mov %rbx,(%rdx) │ → callq rb_insert_color │ mov 0x8(%rbp),%rdx │ test %rdx,%rdx │ ↓ je 67 And after both look the same: # perf annotate timerqueue_add │ mov %rbx,%rdi │ mov %rbx,(%rdx) │ → callq rb_insert_color │ mov 0x8(%rbp),%rdx │ test %rdx,%rdx │ ↓ je 67 From 'perf report' one can annotate and navigate to that 'rb_insert_color' function, but not directly from 'perf annotate timerqueue_add', that remains to be investigated and fixed. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-nkktz6355rhqtq7o8atr8f8r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-16perf annotate: Use asprintf when formatting objdump command lineArnaldo Carvalho de Melo1-5/+12
We were using a local buffer with an arbitrary size, that would have to get increased to avoid truncation as warned by gcc 8: util/annotate.c: In function 'symbol__disassemble': util/annotate.c:1488:4: error: '%s' directive output may be truncated writing up to 4095 bytes into a region of size between 3966 and 8086 [-Werror=format-truncation=] "%s %s%s --start-address=0x%016" PRIx64 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ util/annotate.c:1498:20: symfs_filename, symfs_filename); ~~~~~~~~~~~~~~ util/annotate.c:1490:50: note: format string is defined here " -l -d %s %s -C \"%s\" 2>/dev/null|grep -v \"%s:\"|expand", ^~ In file included from /usr/include/stdio.h:861, from util/color.h:5, from util/sort.h:8, from util/annotate.c:14: /usr/include/bits/stdio2.h:67:10: note: '__builtin___snprintf_chk' output 116 or more bytes (assuming 8331) into a destination of size 8192 return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ __bos (__s), __fmt, __va_arg_pack ()); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ So switch to asprintf, that will make sure enough space is available. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-qagoy2dmbjpc9gdnaj0r3mml@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf annotate: Fix s390 target function disassemblyThomas Richter1-1/+1
'perf annotate' displays function call assembler instructions with a right arrow. Hitting enter on this line/instruction causes the browser to disassemble this target function and show it on the screen. On s390 this results in an error message 'The called function was not found.' The function call assembly line parsing does not handle the s390 bras and brasl instructions. Function call__parse expects the target as first operand: callq e9140 <__fxstat> S390 has a register number as first operand: brasl %r14,41d60 <abort> Therefore the target addresses on s390 are always zero which is an invalid address. Introduce a s390 specific call parsing function which skips the first operand on s390. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180307134325.96106-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-05perf annotate: Find 'call' instruction target symbol at parsing timeArnaldo Carvalho de Melo1-17/+21
So that we do it just once, not everytime we press enter or -> on a 'call' instruction line. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-uysyojl1e6nm94amzzzs08tf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-08perf report: Fix a wrong offset issue when using /proc/kcoreJin Yao1-1/+2
When a valid vmlinux is not found, 'perf report' falls back to look at /proc/kcore. In this case, it will report the impossible large offset. For example: # perf record -b -e cycles:k find /etc/ > /dev/null # perf report --stdio --branch-history 22.77% _vm_normal_page+18446603336221188162 | ---page_remove_rmap +18446603336221188324 page_remove_rmap +18446603336221188487 (cycles:5) unlock_page_memcg +18446603336221188096 page_remove_rmap +18446603336221188327 (cycles:1) The issue is the value which is passed to parameter 'addr' in __get_srcline() is the objdump address. It's not correct if we calculate the offset by using 'addr - sym->start'. This patch creates a new parameter 'ip' in __get_srcline(). It is not converted to objdump address. With this patch, the perf report output is: 22.77% _vm_normal_page+66 | ---page_remove_rmap +228 page_remove_rmap +391 (cycles:5) unlock_page_memcg +0 page_remove_rmap +231 (cycles:1) page_remove_rmap +236 Committer testing: Make sure you get any valid vmlinux out of the way, using '-v' on the 'perf report' case and deleting it from places where perf searches them, like your kernel build dir and the build-id cache, in ~/.debug/. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1514564812-17344-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf env: Adopt perf_env__arch() from the annotate codeArnaldo Carvalho de Melo1-16/+0
And use it in the libunwind case, with both passing a valid perf_env to extract the arch to be normalized from and passing NULL with the same semantic as in the annotate code: to get it from uname() uts.machine. Now the code to generate per arch errno translation tables (int/string) can use it to decode perf.data files recorded in a different arch than that where 'perf trace' (or any other analysis tool) runs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-p2epffgash69w38kvj3ntpc9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf annotate: Use perf_env when obtaining the arch nameArnaldo Carvalho de Melo1-9/+8
Paving the way to reuse these routines in other areas, like when generating errno tables. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-rh1qv051vb8gfdcswskrn53h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27perf annotate: Get the cpuid from evsel->evlist->env in symbol__annotate()Arnaldo Carvalho de Melo1-3/+4
To reduce its function signature, since we get this from 'evsel' which is already one of its arguments. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-070eap7t6uicg9c3w086xy2z@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05perf annotate: Fix objdump comment parsing for Intel mov dissassemblyThomas Richter1-3/+5
The command 'perf annotate' parses the output of objdump and also investigates the comments produced by objdump. For example the output of objdump produces (on x86): 23eee: 4c 8b 3d 13 01 21 00 mov 0x210113(%rip),%r15 # 234008 <stderr@@GLIBC_2.2.5+0x9a8> and the function mov__parse() is called to investigate the complete line. Mov__parse() breaks this line into several parts and finally calls function comment__symbol() to parse the data after the comment character '#'. Comment__symbol() expects a hexadecimal address followed by a symbol in '<' and '>' brackets. However the 2nd parameter given to function comment__symbol() always points to the comment character '#'. The address parsing always returns 0 because the character '#' is not a digit and strtoull() fails without being noticed. Fix this by advancing the second parameter to function comment__symbol() by one byte before invocation and add an error check after strtoull() has been called. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Fixes: 6de783b6f50f ("perf annotate: Resolve symbols using objdump comment") Link: http://lkml.kernel.org/r/20171128075632.72182-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-17perf tools: Move symbol__calc_percent() call to outside symbol__disassemble()Jiri Olsa1-6/+3
We need to call symbol__calc_percent() periodicaly for top, so it's no longer convenient to keep it in symbol__disassemble(). Let's separate the symbol__disassemble() to allocate and init the symbol annotation structs and symbol__calc_percent() to compute the lines percentages based on symbol hists data. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-gtnp8t4tb00q6lag07psn5nq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-17perf tools: Change (symbol|annotation)__calc_percent return type to voidJiri Olsa1-9/+8
There's no need for symbol__calc_percent and annotation__calc_percent functions to return any value, since it's always zero. Changing both function to return void. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-z0gs28hh24m4gia1t1ctraye@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-16perf annotate: Do not truncate instruction names at 6 charsRavi Bangoria1-9/+9
There are many instructions, esp on PowerPC, whose mnemonics are longer than 6 characters. Using precision limit causes truncation of such mnemonics. Fix this by removing precision limit. Note that, 'width' is still 6, so alignment won't get affected for length <= 6. Before: li r11,-1 xscvdp vs1,vs1 add. r10,r10,r11 After: li r11,-1 xscvdpsxds vs1,vs1 add. r10,r10,r11 Reported-by: Donald Stence <dstence@us.ibm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/r/20171114032540.4564-1-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-16perf annotate: Align source and offset linesJiri Olsa1-10/+24
Align source with offset lines, which are more advanced, because of the address column. Before: : static void *worker_thread(void *__tdata) : { 0.00 : 48a971: push %rbp 0.00 : 48a972: mov %rsp,%rbp 0.00 : 48a975: sub $0x30,%rsp 0.00 : 48a979: mov %rdi,-0x28(%rbp) 0.00 : 48a97d: mov %fs:0x28,%rax 0.00 : 48a986: mov %rax,-0x8(%rbp) 0.00 : 48a98a: xor %eax,%eax : struct thread_data *td = __tdata; 0.00 : 48a98c: mov -0x28(%rbp),%rax 0.00 : 48a990: mov %rax,-0x10(%rbp) : int m = 0, i; 0.00 : 48a994: movl $0x0,-0x1c(%rbp) : int ret; : : for (i = 0; i < loops; i++) { 0.00 : 48a99b: movl $0x0,-0x18(%rbp) After: : static void *worker_thread(void *__tdata) : { 0.00 : 48a971: push %rbp 0.00 : 48a972: mov %rsp,%rbp 0.00 : 48a975: sub $0x30,%rsp 0.00 : 48a979: mov %rdi,-0x28(%rbp) 0.00 : 48a97d: mov %fs:0x28,%rax 0.00 : 48a986: mov %rax,-0x8(%rbp) 0.00 : 48a98a: xor %eax,%eax : struct thread_data *td = __tdata; 0.00 : 48a98c: mov -0x28(%rbp),%rax 0.00 : 48a990: mov %rax,-0x10(%rbp) : int m = 0, i; 0.00 : 48a994: movl $0x0,-0x1c(%rbp) : int ret; : : for (i = 0; i < loops; i++) { 0.00 : 48a99b: movl $0x0,-0x18(%rbp) It makes bigger different when displaying script sources, where the comment lines looks oddly shifted from the lines which actually hold code. I'll send script support separately. Committer note: Do not use a fixed column width for the addresses, as kernel ones se more than 10 columns, look at the last offset and get the right width. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-36-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-16perf annotate: Factor annotation_line__print from disasm_line__printJiri Olsa1-36/+33
Move generic annotation line display code into annotation_line__print function. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-25-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-16perf annotate: Add annotation_line__print functionJiri Olsa1-6/+22
Separating struct annotation_line display function, it will hold the generic line display code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-24-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-16perf annotate: Remove disasm__calc_percent functionJiri Olsa1-44/+0
Remove disasm__calc_percent() function, because it's no longer needed. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-22-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-16perf annotate: Remove disasm__calc_percent() from disasm_line__print()Jiri Olsa1-44/+15
Remove disasm__calc_percent() from disasm_line__print(), because we already have the data calculated in struct annotation_line. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-20-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-16perf annotate: Add symbol__calc_lines functionJiri Olsa1-120/+66
Replace symbol__get_source_line() with symbol__calc_lines(), which calculates the source line tree over the struct annotation_line. This will allow us to remove redundant struct source_line in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-19-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-16perf annotate: Add symbol__calc_percent functionJiri Olsa1-1/+61
Add symbol__calc_percent function, that calculates annotation data for symbol and put the data in the struct annotation_line::samples array. Committer notes: Made symbol__calc_percent non static to be used in the next two patches, which will get some fixups from jolsa, doing it this way to keep this bisectable. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-18-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-13perf annotate: Add samples into struct annotation_lineJiri Olsa1-0/+8
Add samples array into struct annotation_line to hold the annotation data. The data is populated in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-17-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-13perf annotate: Add annotated_source__purge functionJiri Olsa1-6/+6
Mov disasm__purge() to annotated_source__purge() to make it work over a generic struct annotation_line. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-16-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-13perf annotate: Add annotation_line__(new|delete) functionsJiri Olsa1-7/+56
Changing the way the annotation lines are allocated and adding annotation_line__(new|delete) functions to deal with this. Before the allocation schema was as follows: ----------------------------------------------------------- struct disasm_line | struct annotation_line | private space ----------------------------------------------------------- Where the private space is used in TUI code to store computed annotation data for events. The stdio code computes the data on the fly. The goal is to compute and store annotation line's data directly in the struct annotation_line itself, so this patch changes the line allocation schema as follows: ------------------------------------------------------------ privsize space | struct disasm_line | struct annotation_line ------------------------------------------------------------ Moving struct annotation_line to the end, because in following changes we will move here the non-fixed length event's data. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-15-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-13perf annotate: Add annotation_line__add functionJiri Olsa1-3/+3
Rename disasm__add() into annotation_line__add() to make it work over a generic struct annotation_line. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-13-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-13perf annotate: Add annotation_line__next functionJiri Olsa1-6/+7
Rename disasm__get_next_ip_line() to annotation_line__next() to make it work over a generic struct annotation_line. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-12-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-13perf annotate: Add evsel into struct annotation_line_argsJiri Olsa1-3/+8
Add evsel into struct annotate_args to reduce the number of arguments that need to travel all the way to line allocation. This change also allow us to move the arch name initialization under symbol__annotate function. Link: http://lkml.kernel.org/n/tip-a9ok53rrgt1s5e8uglyvy6qt@git.kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171011150158.11895-11-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>