summaryrefslogtreecommitdiffstats
path: root/kernel/trace/trace_functions_graph.c
AgeCommit message (Collapse)AuthorFilesLines
2015-04-22Merge tag 'trace-v4.1-2' of ↵Linus Torvalds1-2/+6
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "This adds three fixes for the tracing code. The first is a bug when ftrace_dump_on_oops is triggered in atomic context and function graph tracer is the tracer that is being reported. The second fix is bad parsing of the trace_events from the kernel command line, where it would ignore specific events if the system name is used with defining the event(it enables all events within the system). The last one is a fix to the TRACE_DEFINE_ENUM(), where a check was missing to see if the ptr was incremented to the end of the string, but the loop increments it again and can miss the nul delimiter to stop processing" * tag 'trace-v4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix possible out of bounds memory access when parsing enums tracing: Fix incorrect enabling of trace events by boot cmdline tracing: Handle ftrace_dump() atomic context in graph_trace_open()
2015-04-16tracing: Handle ftrace_dump() atomic context in graph_trace_open()Rabin Vincent1-2/+6
graph_trace_open() can be called in atomic context from ftrace_dump(). Use GFP_ATOMIC for the memory allocations when that's the case, in order to avoid the following splat. BUG: sleeping function called from invalid context at mm/slab.c:2849 in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/0 Backtrace: .. [<8004dc94>] (__might_sleep) from [<801371f4>] (kmem_cache_alloc_trace+0x160/0x238) r7:87800040 r6:000080d0 r5:810d16e8 r4:000080d0 [<80137094>] (kmem_cache_alloc_trace) from [<800cbd60>] (graph_trace_open+0x30/0xd0) r10:00000100 r9:809171a8 r8:00008e28 r7:810d16f0 r6:00000001 r5:810d16e8 r4:810d16f0 [<800cbd30>] (graph_trace_open) from [<800c79c4>] (trace_init_global_iter+0x50/0x9c) r8:00008e28 r7:808c853c r6:00000001 r5:810d16e8 r4:810d16f0 r3:800cbd30 [<800c7974>] (trace_init_global_iter) from [<800c7aa0>] (ftrace_dump+0x90/0x2ec) r4:810d2580 r3:00000000 [<800c7a10>] (ftrace_dump) from [<80414b2c>] (sysrq_ftrace_dump+0x1c/0x20) r10:00000100 r9:809171a8 r8:808f6e7c r7:00000001 r6:00000007 r5:0000007a r4:808d5394 [<80414b10>] (sysrq_ftrace_dump) from [<800169b8>] (return_to_handler+0x0/0x18) [<80415498>] (__handle_sysrq) from [<800169b8>] (return_to_handler+0x0/0x18) r8:808c8100 r7:808c8444 r6:00000101 r5:00000010 r4:84eb3210 [<80415668>] (handle_sysrq) from [<800169b8>] (return_to_handler+0x0/0x18) [<8042a760>] (pl011_int) from [<800169b8>] (return_to_handler+0x0/0x18) r10:809171bc r9:809171a8 r8:00000001 r7:00000026 r6:808c6000 r5:84f01e60 r4:8454fe00 [<8007782c>] (handle_irq_event_percpu) from [<80077b44>] (handle_irq_event+0x4c/0x6c) r10:808c7ef0 r9:87283e00 r8:00000001 r7:00000000 r6:8454fe00 r5:84f01e60 r4:84f01e00 [<80077af8>] (handle_irq_event) from [<8007aa28>] (handle_fasteoi_irq+0xf0/0x1ac) r6:808f52a4 r5:84f01e60 r4:84f01e00 r3:00000000 [<8007a938>] (handle_fasteoi_irq) from [<80076dc0>] (generic_handle_irq+0x3c/0x4c) r6:00000026 r5:00000000 r4:00000026 r3:8007a938 [<80076d84>] (generic_handle_irq) from [<80077128>] (__handle_domain_irq+0x8c/0xfc) r4:808c1e38 r3:0000002e [<8007709c>] (__handle_domain_irq) from [<800087b8>] (gic_handle_irq+0x34/0x6c) r10:80917748 r9:00000001 r8:88802100 r7:808c7ef0 r6:808c8fb0 r5:00000015 r4:8880210c r3:808c7ef0 [<80008784>] (gic_handle_irq) from [<80014044>] (__irq_svc+0x44/0x7c) Link: http://lkml.kernel.org/r/1428953721-31349-1-git-send-email-rabin@rab.in Link: http://lkml.kernel.org/r/1428957012-2319-1-git-send-email-rabin@rab.in Cc: stable@vger.kernel.org # 3.13+ Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-02-03tracing: Convert the tracing facility over to use tracefsSteven Rostedt (Red Hat)1-4/+3
debugfs was fine for the tracing facility as a quick way to get an interface. Now that tracing has matured, it should separate itself from debugfs such that it can be mounted separately without needing to mount all of debugfs with it. That is, users resist using tracing because it requires mounting debugfs. Having tracing have its own file system lets users get the features of tracing without needing to bring in the rest of the kernel's debug infrastructure. Another reason for tracefs is that debubfs does not support mkdir. Currently, to create instances, one does a mkdir in the tracing/instance directory. This is implemented via a hack that forces debugfs to do something it is not intended on doing. By converting over to tracefs, this hack can be removed and mkdir can be properly implemented. This patch does not address this yet, but it lays the ground work for that to be done. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-01-22tracing: Use IS_ERR() check for return value of tracing_init_dentry()Steven Rostedt (Red Hat)1-1/+1
tracing_init_dentry() will soon return NULL as a valid pointer for the top level tracing directroy. NULL can not be used as an error value. Instead, switch to ERR_PTR() and check the return status with IS_ERR(). Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-10Merge tag 'trace-seq-buf-3.19' of ↵Linus Torvalds1-4/+7
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull nmi-safe seq_buf printk update from Steven Rostedt: "This code is a fork from the trace-3.19 pull as it needed the trace_seq clean ups from that branch. This code solves the issue of performing stack dumps from NMI context. The issue is that printk() is not safe from NMI context as if the NMI were to trigger when a printk() was being performed, the NMI could deadlock from the printk() internal locks. This has been seen in practice. With lots of review from Petr Mladek, this code went through several iterations, and we feel that it is now at a point of quality to be accepted into mainline. Here's what is contained in this patch set: - Creates a "seq_buf" generic buffer utility that allows a descriptor to be passed around where functions can write their own "printk()" formatted strings into it. The generic version was pulled out of the trace_seq() code that was made specifically for tracing. - The seq_buf code was change to model the seq_file code. I have a patch (not included for 3.19) that converts the seq_file.c code over to use seq_buf.c like the trace_seq.c code does. This was done to make sure that seq_buf.c is compatible with seq_file.c. I may try to get that patch in for 3.20. - The seq_buf.c file was moved to lib/ to remove it from being dependent on CONFIG_TRACING. - The printk() was updated to allow for a per_cpu "override" of the internal calls. That is, instead of writing to the console, a call to printk() may do something else. This made it easier to allow the NMI to change what printk() does in order to call dump_stack() without needing to update that code as well. - Finally, the dump_stack from all CPUs via NMI code was converted to use the seq_buf code. The caller to trigger the NMI code would wait till all the NMIs finished, and then it would print the seq_buf data to the console safely from a non NMI context One added bonus is that this code also makes the NMI dump stack work on PREEMPT_RT kernels. As printk() includes sleeping locks on PREEMPT_RT, printk() only writes to console if the console does not use any rt_mutex converted spin locks. Which a lot do" * tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: x86/nmi: Fix use of unallocated cpumask_var_t printk/percpu: Define printk_func when printk is not defined x86/nmi: Perform a safe NMI stack trace on all CPUs printk: Add per_cpu printk func to allow printk to be diverted seq_buf: Move the seq_buf code to lib/ seq-buf: Make seq_buf_bprintf() conditional on CONFIG_BINARY_PRINTF tracing: Add seq_buf_get_buf() and seq_buf_commit() helper functions tracing: Have seq_buf use full buffer seq_buf: Add seq_buf_can_fit() helper function tracing: Add paranoid size check in trace_printk_seq() tracing: Use trace_seq_used() and seq_buf_used() instead of len tracing: Clean up tracing_fill_pipe_page() seq_buf: Create seq_buf_used() to find out how much was written tracing: Add a seq_buf_clear() helper and clear len and readpos in init tracing: Convert seq_buf fields to be like seq_file fields tracing: Convert seq_buf_path() to be like seq_path() tracing: Create seq_buf layer in trace_seq
2014-12-03tracing: Add additional marks to signal very large time deltasByungchul Park1-20/+3
Currently, function graph tracer prints "!" or "+" just before function execution time to signal a function overhead, depending on the time. And some tracers tracing latency also print "!" or "+" just after time to signal overhead, depending on the interval between events. Even it is usually enough to do that, we sometimes need to signal for bigger execution time than 100 micro seconds. For example, I used function graph tracer to detect if there is any case that exit_mm() takes too much time. I did following steps in /sys/kernel/debug/tracing. It was easier to detect very large excution time with patched kernel than with original kernel. $ echo exit_mm > set_graph_function $ echo function_graph > current_tracer $ echo > trace $ cat trace_pipe > $LOGFILE ... (do something and terminate logging) $ grep "\\$" $LOGFILE 3) $ 22082032 us | } /* kernel_map_pages */ 3) $ 22082040 us | } /* free_pages_prepare */ 3) $ 22082113 us | } /* free_hot_cold_page */ 3) $ 22083455 us | } /* free_hot_cold_page_list */ 3) $ 22083895 us | } /* release_pages */ 3) $ 22177873 us | } /* free_pages_and_swap_cache */ 3) $ 22178929 us | } /* unmap_single_vma */ 3) $ 22198885 us | } /* unmap_vmas */ 3) $ 22206949 us | } /* exit_mmap */ 3) $ 22207659 us | } /* mmput */ 3) $ 22207793 us | } /* exit_mm */ And then, it was easy to find out that a schedule-out occured by sub_preempt_count() within kernel_map_pages(). To detect very large function exection time caused by either problematic function implementation or scheduling issues, this patch can be useful. Link: http://lkml.kernel.org/r/1416789259-24038-1-git-send-email-byungchul.park@lge.com Signed-off-by: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-19tracing: Use trace_seq_used() and seq_buf_used() instead of lenSteven Rostedt (Red Hat)1-1/+4
As the seq_buf->len will soon be +1 size when there's an overflow, we must use trace_seq_used() or seq_buf_used() methods to get the real length. This will prevent buffer overflow issues if just the len of the seq_buf descriptor is used to copy memory. Link: http://lkml.kernel.org/r/20141114121911.09ba3d38@gandalf.local.home Reported-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-19tracing: Create seq_buf layer in trace_seqSteven Rostedt (Red Hat)1-3/+3
Create a seq_buf layer that trace_seq sits on. The seq_buf will not be limited to page size. This will allow other usages of seq_buf instead of a hard set PAGE_SIZE one that trace_seq has. Link: http://lkml.kernel.org/r/20141104160221.864997179@goodmis.org Link: http://lkml.kernel.org/r/20141114011412.170377300@goodmis.org Tested-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-19tracing: Have function_graph use trace_seq_has_overflowed()Steven Rostedt (Red Hat)1-265/+117
Instead of doing individual checks all over the place that makes the code very messy. Just check trace_seq_has_overflowed() at the end or in strategic places. This makes the code much cleaner and also helps with getting closer to removing the return values of trace_seq_printf() and friends. Link: http://lkml.kernel.org/r/20141114011410.987913836@goodmis.org Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-14function_graph: Fix micro seconds notationsByungchul Park1-6/+6
Usually, "msecs" notation means milli-seconds, and "usecs" notation means micro-seconds. Since the unit used in the code is micro-seconds, the notation should be replaced from msecs to usecs. Link: http://lkml.kernel.org/r/1415171926-9782-2-git-send-email-byungchul.park@lge.com Signed-off-by: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-14ftrace-graph: show latency-format on print_graph_irq()Daniel Bristot de Oliveira1-0/+9
On the function_graph tracer, the print_graph_irq() function prints a trace line with the flag ==========> on an irq handler entry, and the flag <========== on an irq handler return. But when the latency-format is enable, it is not printing the latency-format flags, causing the following error in the trace output: 0) ==========> | 0) d... | smp_apic_timer_interrupt() { This patch fixes this issue by printing the latency-format flags when it is enable. Link: http://lkml.kernel.org/r/7c2e226dac20c940b6242178fab7f0e3c9b5ce58.1415233316.git.bristot@redhat.com Reviewed-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-14trace: Replace single-character seq_puts with seq_putcRasmus Villemoes1-2/+2
Printing a single character to a seqfile might as well be done with seq_putc instead of seq_puts; this avoids a strlen() call and a memory access. It also shaves another few bytes off the generated code. Link: http://lkml.kernel.org/r/1415479332-25944-4-git-send-email-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-13tracing: Replace seq_printf by simpler equivalentsRasmus Villemoes1-14/+14
Using seq_printf to print a simple string or a single character is a lot more expensive than it needs to be, since seq_puts and seq_putc exist. These patches do seq_printf(m, s) -> seq_puts(m, s) seq_printf(m, "%s", s) -> seq_puts(m, s) seq_printf(m, "%c", c) -> seq_putc(m, c) Subsequent patches will simplify further. Link: http://lkml.kernel.org/r/1415479332-25944-2-git-send-email-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18tracing: Convert local function_graph functions to staticSteven Rostedt (Red Hat)1-4/+4
Local functions should be static. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18tracing: let user specify tracing_thresh after selecting function_graphStanislav Fomichev1-0/+7
Currently, tracing_thresh works only if we specify it before selecting function_graph tracer. If we do the opposite, tracing_thresh will change it's value, but it will not be applied. To fix it, we add update_thresh callback which is called whenever tracing_thresh is updated and for function_graph tracer we register handler which reinitializes tracer depending on tracing_thresh. Link: http://lkml.kernel.org/p/20140718111727.GA3206@stfomichev-desktop.yandex.net Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18ftrace-graph: Remove usage of ftrace_stop() in ftrace_graph_stop()Steven Rostedt (Red Hat)1-5/+0
All archs now use ftrace_graph_is_dead() to stop function graph tracing. Remove the usage of ftrace_stop() as that is no longer needed. Cc: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-17ftrace-graph: Remove dependency of ftrace_stop() from ftrace_graph_stop()Steven Rostedt (Red Hat)1-0/+35
ftrace_stop() is going away as it disables parts of function tracing that affects users that should not be affected. But ftrace_graph_stop() is built on ftrace_stop(). Here's another example of killing all of function tracing because something went wrong with function graph tracing. Instead of disabling all users of function tracing on function graph error, disable only function graph tracing. A new function is created called ftrace_graph_is_dead(). This is called in strategic paths to prevent function graph from doing more harm and allowing at least a warning to be printed before the system crashes. NOTE: ftrace_stop() is still used until all the archs are converted over to use ftrace_graph_is_dead(). After that, ftrace_stop() will be removed. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-20tracing: Add funcgraph_tail option to print function name after closing bracesRobert Elliott1-3/+6
In the function-graph tracer, add a funcgraph_tail option to print the function name on all } lines, not just functions whose first line is no longer in the trace buffer. If a function calls other traced functions, its total time appears on its } line. This change allows grep to be used to determine the function for which the line corresponds. Update Documentation/trace/ftrace.txt to describe this new option. Link: http://lkml.kernel.org/p/20140520221041.8359.6782.stgit@beardog.cce.hp.com Signed-off-by: Robert Elliott <elliott@hp.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-20tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx definesRobert Elliott1-9/+0
Eliminate duplicate TRACE_GRAPH_PRINT_xx defines in trace_functions_graph.c that are already in trace.h. Add TRACE_GRAPH_PRINT_IRQS to trace.h, which is the only one that is missing. Link: http://lkml.kernel.org/p/20140520221031.8359.24733.stgit@beardog.cce.hp.com Signed-off-by: Robert Elliott <elliott@hp.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-30tracing: Remove mock up poll wait functionSteven Rostedt (Red Hat)1-1/+0
Now that the ring buffer has a built in way to wake up readers when there's data, using irq_work such that it is safe to do it in any context. But it was still using the old "poor man's" wait polling that checks every 1/10 of a second to see if it should wake up a waiter. This makes the latency for a wake up excruciatingly long. No need to do that anymore. Completely remove the different wait_poll types from the tracers and have them all use the default one now. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Pass trace_array to set_flag callbackSteven Rostedt (Red Hat)1-1/+2
As options (flags) may affect instances instead of being global the set_flag() callbacks need to receive the trace_array descriptor of the instance they will be modifying. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06tracing: Do not use signed enums with unsigned long long in fgragh outputSteven Rostedt (Red Hat)1-11/+11
The duration field of print_graph_duration() can also be used to do the space filling by passing an enum in it: DURATION_FILL_FULL DURATION_FILL_START DURATION_FILL_END The problem is that these are enums and defined as negative, but the duration field is unsigned long long. Most archs are fine with this but blackfin fails to compile because of it: kernel/built-in.o: In function `print_graph_duration': kernel/trace/trace_functions_graph.c:782: undefined reference to `__ucmpdi2' Overloading a unsigned long long with an signed enum is just bad in principle. We can accomplish the same thing by using part of the flags field instead. Cc: Mike Frysinger <vapier@gentoo.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05tracing: Update event filters for multibufferTom Zanussi1-2/+2
The trace event filters are still tied to event calls rather than event files, which means you don't get what you'd expect when using filters in the multibuffer case: Before: # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # mkdir /sys/kernel/debug/tracing/instances/test1 # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 2048 # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter bytes_alloc > 2048 Setting the filter in tracing/instances/test1/events shouldn't affect the same event in tracing/events as it does above. After: # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # mkdir /sys/kernel/debug/tracing/instances/test1 # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter bytes_alloc > 2048 We'd like to just move the filter directly from ftrace_event_call to ftrace_event_file, but there are a couple cases that don't yet have multibuffer support and therefore have to continue using the current event_call-based filters. For those cases, a new USE_CALL_FILTER bit is added to the event_call flags, whose main purpose is to keep the old behavior for those cases until they can be updated with multibuffer support; at that point, the USE_CALL_FILTER flag (and the new associated call_filter_check_discard() function) can go away. The multibuffer support also made filter_current_check_discard() redundant, so this change removes that function as well and replaces it with filter_check_discard() (or call_filter_check_discard() as appropriate). Link: http://lkml.kernel.org/r/f16e9ce4270c62f46b2e966119225e1c3cca7e60.1382620672.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-18ftrace: Add set_graph_notrace filterNamhyung Kim1-3/+53
The set_graph_notrace filter is analogous to set_ftrace_notrace and can be used for eliminating uninteresting part of function graph trace output. It also works with set_graph_function nicely. # cd /sys/kernel/debug/tracing/ # echo do_page_fault > set_graph_function # perf ftrace live true 2) | do_page_fault() { 2) | __do_page_fault() { 2) 0.381 us | down_read_trylock(); 2) 0.055 us | __might_sleep(); 2) 0.696 us | find_vma(); 2) | handle_mm_fault() { 2) | handle_pte_fault() { 2) | __do_fault() { 2) | filemap_fault() { 2) | find_get_page() { 2) 0.033 us | __rcu_read_lock(); 2) 0.035 us | __rcu_read_unlock(); 2) 1.696 us | } 2) 0.031 us | __might_sleep(); 2) 2.831 us | } 2) | _raw_spin_lock() { 2) 0.046 us | add_preempt_count(); 2) 0.841 us | } 2) 0.033 us | page_add_file_rmap(); 2) | _raw_spin_unlock() { 2) 0.057 us | sub_preempt_count(); 2) 0.568 us | } 2) | unlock_page() { 2) 0.084 us | page_waitqueue(); 2) 0.126 us | __wake_up_bit(); 2) 1.117 us | } 2) 7.729 us | } 2) 8.397 us | } 2) 8.956 us | } 2) 0.085 us | up_read(); 2) + 12.745 us | } 2) + 13.401 us | } ... # echo handle_mm_fault > set_graph_notrace # perf ftrace live true 1) | do_page_fault() { 1) | __do_page_fault() { 1) 0.205 us | down_read_trylock(); 1) 0.041 us | __might_sleep(); 1) 0.344 us | find_vma(); 1) 0.069 us | up_read(); 1) 4.692 us | } 1) 5.311 us | } ... Link: http://lkml.kernel.org/r/1381739066-7531-5-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-18tracing: Add ref_data to function and fgraph tracer structsSteven Rostedt (Red Hat)1-1/+1
The selftest for function and function graph tracers are defined as __init, as they are only executed at boot up. The "tracer" structs that are associated to those tracers are not setup as __init as they are used after boot. To stop mismatch warnings, those structures need to be annotated with __ref_data. Currently, the tracer structures are defined to __read_mostly, as they do not really change. But in the future they should be converted to consts, but that will take a little work because they have a "next" pointer that gets updated when they are registered. That will have to wait till the next major release. Link: http://lkml.kernel.org/r/1373596735.17876.84.camel@gandalf.local.home Reported-by: kbuild test robot <fengguang.wu@intel.com> Reported-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-18tracing: Use trace_seq_puts()/trace_seq_putc() where possiblezhangwei(Jovi)1-26/+26
For string without format specifiers, use trace_seq_puts() or trace_seq_putc(). Link: http://lkml.kernel.org/r/51E3B3AC.1000605@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> [ fixed a trace_seq_putc(s, " ") to trace_seq_putc(s, ' ') ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Consolidate max_tr into main trace_array structureSteven Rostedt (Red Hat)1-6/+6
Currently, the way the latency tracers and snapshot feature works is to have a separate trace_array called "max_tr" that holds the snapshot buffer. For latency tracers, this snapshot buffer is used to swap the running buffer with this buffer to save the current max latency. The only items needed for the max_tr is really just a copy of the buffer itself, the per_cpu data pointers, the time_start timestamp that states when the max latency was triggered, and the cpu that the max latency was triggered on. All other fields in trace_array are unused by the max_tr, making the max_tr mostly bloat. This change removes the max_tr completely, and adds a new structure called trace_buffer, that holds the buffer pointer, the per_cpu data pointers, the time_start timestamp, and the cpu where the latency occurred. The trace_array, now has two trace_buffers, one for the normal trace and one for the max trace or snapshot. By doing this, not only do we remove the bloat from the max_trace but the instances of traces can now use their own snapshot feature and not have just the top level global_trace have the snapshot feature and latency tracers for itself. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Replace the static global per_cpu arrays with allocated per_cpuSteven Rostedt1-2/+2
The global and max-tr currently use static per_cpu arrays for the CPU data descriptors. But in order to get new allocated trace_arrays, they need to be allocated per_cpu arrays. Instead of using the static arrays, switch the global and max-tr to use allocated data. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-29tracing/fgraph: Adjust fgraph depth before calling trace return callbackSteven Rostedt (Red Hat)1-1/+7
While debugging the virtual cputime with the function graph tracer with a max_depth of 1 (most common use of the max_depth so far), I found that I was missing kernel execution because of a race condition. The code for the return side of the function has a slight race: ftrace_pop_return_trace(&trace, &ret, frame_pointer); trace.rettime = trace_clock_local(); ftrace_graph_return(&trace); barrier(); current->curr_ret_stack--; The ftrace_pop_return_trace() initializes the trace structure for the callback. The ftrace_graph_return() uses the trace structure for its own use as that structure is on the stack and is local to this function. Then the curr_ret_stack is decremented which is what the trace.depth is set to. If an interrupt comes in after the ftrace_graph_return() but before the curr_ret_stack, then the called function will get a depth of 2. If max_depth is set to 1 this function will be ignored. The problem is that the trace has already been called, and the timestamp for that trace will not reflect the time the function was about to re-enter userspace. Calls to the interrupt will not be traced because the max_depth has prevented this. To solve this issue, the ftrace_graph_return() can safely be moved after the current->curr_ret_stack has been updated. This way the timestamp for the return callback will reflect the actual time. If an interrupt comes in after the curr_ret_stack update and ftrace_graph_return(), it will be traced. It may look a little confusing to see it within the other function, but at least it will not be lost. Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-21tracing/fgraph: Add max_graph_depth to limit function_graph depthSteven Rostedt1-2/+58
Add the file max_graph_depth to the debug tracing directory that lets the user define the depth of the function graph. A very useful operation is to set the depth to 1. Then it traces only the first function that is called when entering the kernel. This can be used to determine what system operations interrupt a process. For example, to work on NOHZ processes (single tasks running without a timer tick), if any interrupt goes off and preempts that task, this code will show it happening. # cd /sys/kernel/debug/tracing # echo 1 > max_graph_depth # echo function_graph > current_tracer # cat per_cpu/cpu/<cpu-of-process>/trace Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-10-31tracing: Cache comms only after an event occurredSteven Rostedt1-2/+2
Whenever an event is registered, the comm of tasks are saved at every task switch instead of saving them at every event. But if an event isn't executed much, the comm cache will be filled up by tasks that did not record the event and you lose out on the comms that did. Here's an example, if you enable the following events: echo 1 > /debug/tracing/events/kvm/kvm_cr/enable echo 1 > /debug/tracing/events/net/net_dev_xmit/enable Note, there's no kvm running on this machine so the first event will never be triggered, but because it is enabled, the storing of comms will continue. If we now disable the network event: echo 0 > /debug/tracing/events/net/net_dev_xmit/enable and look at the trace: cat /debug/tracing/trace sshd-2672 [001] ..s2 375.731616: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6de0 len=242 rc=0 sshd-2672 [001] ..s1 375.731617: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6de0 len=242 rc=0 sshd-2672 [001] ..s2 375.859356: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6de0 len=242 rc=0 sshd-2672 [001] ..s1 375.859357: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6de0 len=242 rc=0 sshd-2672 [001] ..s2 375.947351: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6de0 len=242 rc=0 sshd-2672 [001] ..s1 375.947352: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6de0 len=242 rc=0 sshd-2672 [001] ..s2 376.035383: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6de0 len=242 rc=0 sshd-2672 [001] ..s1 376.035383: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6de0 len=242 rc=0 sshd-2672 [001] ..s2 377.563806: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6de0 len=226 rc=0 sshd-2672 [001] ..s1 377.563807: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6de0 len=226 rc=0 sshd-2672 [001] ..s2 377.563834: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6be0 len=114 rc=0 sshd-2672 [001] ..s1 377.563842: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6be0 len=114 rc=0 We see that process 2672 which triggered the events has the comm "sshd". But if we run hackbench for a bit and look again: cat /debug/tracing/trace <...>-2672 [001] ..s2 375.731616: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6de0 len=242 rc=0 <...>-2672 [001] ..s1 375.731617: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6de0 len=242 rc=0 <...>-2672 [001] ..s2 375.859356: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6de0 len=242 rc=0 <...>-2672 [001] ..s1 375.859357: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6de0 len=242 rc=0 <...>-2672 [001] ..s2 375.947351: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6de0 len=242 rc=0 <...>-2672 [001] ..s1 375.947352: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6de0 len=242 rc=0 <...>-2672 [001] ..s2 376.035383: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6de0 len=242 rc=0 <...>-2672 [001] ..s1 376.035383: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6de0 len=242 rc=0 <...>-2672 [001] ..s2 377.563806: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6de0 len=226 rc=0 <...>-2672 [001] ..s1 377.563807: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6de0 len=226 rc=0 <...>-2672 [001] ..s2 377.563834: net_dev_xmit: dev=eth0 skbaddr=ffff88005cbb6be0 len=114 rc=0 <...>-2672 [001] ..s1 377.563842: net_dev_xmit: dev=br0 skbaddr=ffff88005cbb6be0 len=114 rc=0 The stored "sshd" comm has been flushed out and we get a useless "<...>". But by only storing comms after a trace event occurred, we can run hackbench all day and still get the same output. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-10-31tracing: Allow tracers to start at core initcallSteven Rostedt1-1/+1
There's times during debugging that it is helpful to see traces of early boot functions. But the tracers are initialized at device_initcall() which is quite late during the boot process. Setting the kernel command line parameter ftrace=function will not show anything until the function tracer is initialized. This prevents being able to trace functions before device_initcall(). There's no reason that the tracers need to be initialized so late in the boot process. Move them up to core_initcall() as they still need to come after early_initcall() which initializes the tracing buffers. Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-08-23ftrace: Do not test frame pointers if -mfentry is usedSteven Rostedt1-1/+4
The function graph has a test to check if the frame pointer is corrupted, which can happen with various options of gcc with mcount. But this is not an issue with -mfentry as -mfentry does not need nor use frame pointers for function graph tracing. Link: http://lkml.kernel.org/r/20120807194059.773895870@goodmis.org Acked-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-06-28tracing: Remove NR_CPUS array from trace_iteratorSteven Rostedt1-1/+1
Replace the NR_CPUS array of buffer_iter from the trace_iterator with an allocated array. This will just create an array of possible CPUS instead of the max number specified. The use of NR_CPUS in that array caused allocation failures for machines that were tight on memory. This did not cause any failures to the system itself (no crashes), but caused unnecessary failures for reading the trace files. Added a helper function called 'trace_buffer_iter()' that returns the buffer_iter item or NULL if it is not defined or the array was not allocated. Some routines do not require the array (tracing_open_pipe() for one). Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-07tracing: Still trace filtered irq functions when irq trace is disabledSteven Rostedt1-1/+1
If a function is set to be traced by the set_graph_function, but the option funcgraph-irqs is zero, and the traced function happens to be called from a interrupt, it will not be traced. The point of funcgraph-irqs is to not trace interrupts when we are preempted by an irq, not to not trace functions we want to trace that happen to be *in* a irq. Luckily the current->trace_recursion element is perfect to add a flag to help us be able to trace functions within an interrupt even when we are not tracing interrupts that preempt the trace. Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14tracing, function_graph: Add context-info support for function_graph tracerJiri Olsa1-22/+31
The function_graph tracer does not follow global context-info option. Adding TRACE_ITER_CONTEXT_INFO trace_flags check to enable it. With following commands: # echo function_graph > ./current_tracer # echo 0 > options/context-info # cat trace This is what it looked like before: # tracer: function_graph # # TIME CPU DURATION FUNCTION CALLS # | | | | | | | | 1) 0.079 us | } /* __vma_link_rb */ 1) 0.056 us | copy_page_range(); 1) | security_vm_enough_memory() { ... This is what it looks like now: # tracer: function_graph # } /* update_ts_time_stats */ timekeeping_max_deferment(); ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-6-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14tracing, function_graph: Remove lock-depth from latency traceJiri Olsa1-4/+3
The lock_depth was removed in commit e6e1e25 tracing: Remove lock_depth from event entry Removing the lock_depth info from function_graph latency header. With following commands: # echo function_graph > ./current_tracer # echo 1 > options/latency-format # cat trace This is what it looked like before: # tracer: function_graph # # function_graph latency trace v1.1.5 on 3.0.0-rc1-tip+ # -------------------------------------------------------------------- # latency: 0 us, #59756/311298, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: -0 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / _-=> lock-depth # |||| / # CPU||||| DURATION FUNCTION CALLS # | ||||| | | | | | | 0) .... 0.068 us | } /* __rcu_read_unlock */ ... This is what it looks like now: # tracer: function_graph # # function_graph latency trace v1.1.5 on 3.0.0-rc1-tip+ # -------------------------------------------------------------------- # latency: 0 us, #59747/1744610, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: -0 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / # CPU|||| DURATION FUNCTION CALLS # | |||| | | | | | | 0) ..s. 1.641 us | } /* __rcu_process_callbacks */ ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-5-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14tracing, function_graph: Merge overhead and duration display functionsJiri Olsa1-74/+74
Functions print_graph_overhead() and print_graph_duration() displays data for one field - DURATION. I merged them into single function print_graph_duration(), and added a way to display the empty parts of the field. This way the print_graph_irq() function can use this column to display the IRQ signs if needed and the DURATION field details stays inside the print_graph_duration() function. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-3-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14tracing, function_graph: Remove dependency of abstime and duration fields on ↵Jiri Olsa1-16/+3
latency The display of absolute time and duration fields is based on the latency field. This was added during the irqsoff/wakeup tracers graph support changes. It's causing confusion in what fields will be displayed for the function_graph tracer itself. So I'm removing this depency, and adding absolute time and duration fields to the preemptirqsoff preemptoff irqsoff wakeup tracers. With following commands: # echo function_graph > ./current_tracer # cat trace This is what it looked like before: # tracer: function_graph # # TIME CPU DURATION FUNCTION CALLS # | | | | | | | | 0) 0.068 us | } /* page_add_file_rmap */ 0) | _raw_spin_unlock() { ... This is what it looks like now: # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 0) 0.068 us | } /* add_preempt_count */ 0) 0.993 us | } /* vfsmount_lock_local_lock */ ... For preemptirqsoff preemptoff irqsoff wakeup tracers, this is what it looked like before: SNIP # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / _-=> lock-depth # |||| / # CPU TASK/PID ||||| DURATION FUNCTION CALLS # | | | ||||| | | | | | | 1) <idle>-0 | d..1 0.000 us | acpi_idle_enter_simple(); ... This is what it looks like now: SNIP # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / # TIME CPU TASK/PID |||| DURATION FUNCTION CALLS # | | | | |||| | | | | | | 19.847735 | 1) <idle>-0 | d..1 0.000 us | acpi_idle_enter_simple(); ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-2-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-31Fix common misspellingsLucas De Marchi1-1/+1
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2010-10-18tracing: Remove parent recording in latency tracer graph optionsSteven Rostedt1-1/+0
Even though the parent is recorded with the normal function tracing of the latency tracers (irqsoff and wakeup), the function graph recording is bogus. This is due to the function graph messing with the return stack. The latency tracers pass in as the parent CALLER_ADDR0, which works fine for plain function tracing. But this causes bogus output with the graph tracer: 3) <idle>-0 | d.s3. 0.000 us | return_to_handler(); 3) <idle>-0 | d.s3. 0.000 us | _raw_spin_unlock_irqrestore(); 3) <idle>-0 | d.s3. 0.000 us | return_to_handler(); 3) <idle>-0 | d.s3. 0.000 us | trace_hardirqs_on(); The "return_to_handle()" call is the trampoline of the function graph tracer, and is meaningless in this context. Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-10-18tracing: Make graph related irqs/preemptsoff functions globalJiri Olsa1-3/+60
Move trace_graph_function() and print_graph_headers_flags() functions to the trace_function_graph.c to be globaly available. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <1285243253-7372-3-git-send-email-jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-10-18tracing: Add proper check for irq_depth routinesJiri Olsa1-4/+20
The check_irq_entry and check_irq_return could be called from graph event context. In such case there's no graph private data allocated. Adding checks to handle this case. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <20100924154102.GB1818@jolsa.brq.redhat.com> [ Fixed some grammar in the comments ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-10-13tracing: Fix function-graph build warning on 32-bitBorislav Petkov1-2/+3
Fix kernel/trace/trace_functions_graph.c: In function ‘trace_print_graph_duration’: kernel/trace/trace_functions_graph.c:652: warning: comparison of distinct pointer types lacks a cast when building 36-rc6 on a 32-bit due to the strict type check failing in the min() macro. Signed-off-by: Borislav Petkov <bp@alien8.de> Cc: Chase Douglas <chase.douglas@canonical.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <20100929080823.GA13595@liondog.tnic> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-09-14tracing: Do not trace in irq when funcgraph-irq option is zeroSteven Rostedt1-1/+22
When the function graph tracer funcgraph-irq option is zero, disable tracing in IRQs. This makes the option have two effects. 1) When reading the trace file, do not display the functions that happen in interrupt context (when detected) 2) [*new*] When recording a trace, skip those that are detected to be in interrupt by the 'in_irq()' function Note, in_irq() is updated at irq_enter() and irq_exit(). There are still functions that are recorded by the function graph tracer that is in interrupt context but outside the irq_enter/exit() routines. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-09-14tracing: Add funcgraph-irq option for function graph tracer.Jiri Olsa1-1/+100
It's handy to be able to disable the irq related output and not to have to jump over each irq related code, when you have no interrest in it. The option is by default enabled, so there's no change to current behaviour. It affects only the final output, so all the irq related data stay in the ring buffer. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <20100907145344.GC1912@jolsa.brq.redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-08-19Merge branch 'tip/perf/urgent' of ↵Ingo Molnar1-1/+9
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
2010-08-19perf, tracing: add missing __percpu markupsNamhyung Kim1-1/+1
ftrace_event_call->perf_events, perf_trace_buf, fgraph_data->cpu_data and some local variables are percpu pointers missing __percpu markups. Add them. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1281498479-28551-1-git-send-email-namhyung@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-08-16Merge branch 'tip/perf/urgent-3' of ↵Steven Rostedt1-1/+9
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into trace/tip/perf/urgent-4 Conflicts: kernel/trace/trace_events.c Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-08-06tracing: Fix an unallocated memory access in function_graphShaohua Li1-1/+9
With CONFIG_DEBUG_PAGEALLOC, I observed an unallocated memory access in function_graph trace. It appears we find a small size entry in ring buffer, but we access it as a big size entry. The access overflows the page size and touches an unallocated page. Cc: <stable@kernel.org> Signed-off-by: Shaohua Li <shaohua.li@intel.com> LKML-Reference: <1280217994.32400.76.camel@sli10-desk.sh.intel.com> [ Added a comment to explain the problem - SDR ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>