From 88d380eb0668b7fe76834c1febef4165508098b6 Mon Sep 17 00:00:00 2001 From: Changbin Du Date: Tue, 1 Jan 2019 23:46:14 +0800 Subject: tracing/doc: Add latency tracer funcgraph example This add an example about how to use funcgraph with latency tracers. Link: http://lkml.kernel.org/r/20190101154614.8887-6-changbin.du@gmail.com Signed-off-by: Changbin Du Signed-off-by: Steven Rostedt (VMware) --- Documentation/trace/ftrace.rst | 51 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) (limited to 'Documentation') diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst index 0131df7f5968..6ce2763a2a3e 100644 --- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -1396,6 +1396,57 @@ enabling function tracing, we incur an added overhead. This overhead may extend the latency times. But nevertheless, this trace has provided some very helpful debugging information. +If we prefer function graph output instead of function, we can set +display-graph option:: + with echo 1 > options/display-graph + + # tracer: irqsoff + # + # irqsoff latency trace v1.1.5 on 4.20.0-rc6+ + # -------------------------------------------------------------------- + # latency: 3751 us, #274/274, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4) + # ----------------- + # | task: bash-1507 (uid:0 nice:0 policy:0 rt_prio:0) + # ----------------- + # => started at: free_debug_processing + # => ended at: return_to_handler + # + # + # _-----=> irqs-off + # / _----=> need-resched + # | / _---=> hardirq/softirq + # || / _--=> preempt-depth + # ||| / + # REL TIME CPU TASK/PID |||| DURATION FUNCTION CALLS + # | | | | |||| | | | | | | + 0 us | 0) bash-1507 | d... | 0.000 us | _raw_spin_lock_irqsave(); + 0 us | 0) bash-1507 | d..1 | 0.378 us | do_raw_spin_trylock(); + 1 us | 0) bash-1507 | d..2 | | set_track() { + 2 us | 0) bash-1507 | d..2 | | save_stack_trace() { + 2 us | 0) bash-1507 | d..2 | | __save_stack_trace() { + 3 us | 0) bash-1507 | d..2 | | __unwind_start() { + 3 us | 0) bash-1507 | d..2 | | get_stack_info() { + 3 us | 0) bash-1507 | d..2 | 0.351 us | in_task_stack(); + 4 us | 0) bash-1507 | d..2 | 1.107 us | } + [...] + 3750 us | 0) bash-1507 | d..1 | 0.516 us | do_raw_spin_unlock(); + 3750 us | 0) bash-1507 | d..1 | 0.000 us | _raw_spin_unlock_irqrestore(); + 3764 us | 0) bash-1507 | d..1 | 0.000 us | tracer_hardirqs_on(); + bash-1507 0d..1 3792us : + => free_debug_processing + => __slab_free + => kmem_cache_free + => vm_area_free + => remove_vma + => exit_mmap + => mmput + => flush_old_exec + => load_elf_binary + => search_binary_handler + => __do_execve_file.isra.32 + => __x64_sys_execve + => do_syscall_64 + => entry_SYSCALL_64_after_hwframe preemptoff ---------- -- cgit v1.2.3 From f79b3f338564e7674dbe6375bcf685c2ba483efe Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" Date: Mon, 11 Feb 2019 15:00:48 -0500 Subject: ftrace: Allow enabling of filters via index of available_filter_functions Enabling of large number of functions by echoing in a large subset of the functions in available_filter_functions can take a very long time. The process requires testing all functions registered by the function tracer (which is in the 10s of thousands), and doing a kallsyms lookup to convert the ip address into a name, then comparing that name with the string passed in. When a function causes the function tracer to crash the system, a binary bisect of the available_filter_functions can be done to find the culprit. But this requires passing in half of the functions in available_filter_functions over and over again, which makes it basically a O(n^2) operation. With 40,000 functions, that ends up bing 1,600,000,000 opertions! And enabling this can take over 20 minutes. As a quick speed up, if a number is passed into one of the filter files, instead of doing a search, it just enables the function at the corresponding line of the available_filter_functions file. That is: # echo 50 > set_ftrace_filter # cat set_ftrace_filter x86_pmu_commit_txn # head -50 available_filter_functions | tail -1 x86_pmu_commit_txn This allows setting of half the available_filter_functions to take place in less than a second! # time seq 20000 > set_ftrace_filter real 0m0.042s user 0m0.005s sys 0m0.015s # wc -l set_ftrace_filter 20000 set_ftrace_filter Signed-off-by: Steven Rostedt (VMware) --- Documentation/trace/ftrace.rst | 38 ++++++++++++++++++++++++++++++++++++++ kernel/trace/ftrace.c | 30 ++++++++++++++++++++++++++++++ kernel/trace/trace.h | 1 + kernel/trace/trace_events_filter.c | 5 +++++ 4 files changed, 74 insertions(+) (limited to 'Documentation') diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst index 6ce2763a2a3e..7c5e6d6ab5d1 100644 --- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -233,6 +233,12 @@ of ftrace. Here is a list of some of the key files: This interface also allows for commands to be used. See the "Filter commands" section for more details. + As a speed up, since processing strings can't be quite expensive + and requires a check of all functions registered to tracing, instead + an index can be written into this file. A number (starting with "1") + written will instead select the same corresponding at the line position + of the "available_filter_functions" file. + set_ftrace_notrace: This has an effect opposite to that of @@ -2835,6 +2841,38 @@ Produces:: We can see that there's no more lock or preempt tracing. +Selecting function filters via index +------------------------------------ + +Because processing of strings is expensive (the address of the function +needs to be looked up before comparing to the string being passed in), +an index can be used as well to enable functions. This is useful in the +case of setting thousands of specific functions at a time. By passing +in a list of numbers, no string processing will occur. Instead, the function +at the specific location in the internal array (which corresponds to the +functions in the "available_filter_functions" file), is selected. + +:: + + # echo 1 > set_ftrace_filter + +Will select the first function listed in "available_filter_functions" + +:: + + # head -1 available_filter_functions + trace_initcall_finish_cb + + # cat set_ftrace_filter + trace_initcall_finish_cb + + # head -50 available_filter_functions | tail -1 + x86_pmu_commit_txn + + # echo 1 50 > set_ftrace_filter + # cat set_ftrace_filter + trace_initcall_finish_cb + x86_pmu_commit_txn Dynamic ftrace with the function graph tracer --------------------------------------------- diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index aac7847c0214..fa79323331b2 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -3701,6 +3701,31 @@ enter_record(struct ftrace_hash *hash, struct dyn_ftrace *rec, int clear_filter) return ret; } +static int +add_rec_by_index(struct ftrace_hash *hash, struct ftrace_glob *func_g, + int clear_filter) +{ + long index = simple_strtoul(func_g->search, NULL, 0); + struct ftrace_page *pg; + struct dyn_ftrace *rec; + + /* The index starts at 1 */ + if (--index < 0) + return 0; + + do_for_each_ftrace_rec(pg, rec) { + if (pg->index <= index) { + index -= pg->index; + /* this is a double loop, break goes to the next page */ + break; + } + rec = &pg->records[index]; + enter_record(hash, rec, clear_filter); + return 1; + } while_for_each_ftrace_rec(); + return 0; +} + static int ftrace_match_record(struct dyn_ftrace *rec, struct ftrace_glob *func_g, struct ftrace_glob *mod_g, int exclude_mod) @@ -3769,6 +3794,11 @@ match_records(struct ftrace_hash *hash, char *func, int len, char *mod) if (unlikely(ftrace_disabled)) goto out_unlock; + if (func_g.type == MATCH_INDEX) { + found = add_rec_by_index(hash, &func_g, clear_filter); + goto out_unlock; + } + do_for_each_ftrace_rec(pg, rec) { if (rec->flags & FTRACE_FL_DISABLED) diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index a34fa5e76abb..ae7df090b93e 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -1459,6 +1459,7 @@ enum regex_type { MATCH_MIDDLE_ONLY, MATCH_END_ONLY, MATCH_GLOB, + MATCH_INDEX, }; struct regex { diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c index f052ecb085e9..ade606c33231 100644 --- a/kernel/trace/trace_events_filter.c +++ b/kernel/trace/trace_events_filter.c @@ -825,6 +825,9 @@ enum regex_type filter_parse_regex(char *buff, int len, char **search, int *not) *search = buff; + if (isdigit(buff[0])) + return MATCH_INDEX; + for (i = 0; i < len; i++) { if (buff[i] == '*') { if (!i) { @@ -862,6 +865,8 @@ static void filter_build_regex(struct filter_pred *pred) } switch (type) { + /* MATCH_INDEX should not happen, but if it does, match full */ + case MATCH_INDEX: case MATCH_FULL: r->match = regex_match_full; break; -- cgit v1.2.3 From 5032b38189137c21992d91b624978bace60ff78c Mon Sep 17 00:00:00 2001 From: Tom Zanussi Date: Wed, 13 Feb 2019 17:42:42 -0600 Subject: tracing: Make hist trigger Documentation better reflect actions/handlers The action/handler code refactoring didn't change the action/handler syntax, but did generalize it - the Documentation should reflect that. Link: http://lkml.kernel.org/r/c2fe4144678829c70cad67aaa847dca27d57cb83.1550100284.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) --- Documentation/trace/histogram.rst | 56 ++++++++++++++++++++++++++++++--------- 1 file changed, 43 insertions(+), 13 deletions(-) (limited to 'Documentation') diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst index 7dda76503127..63e522107e59 100644 --- a/Documentation/trace/histogram.rst +++ b/Documentation/trace/histogram.rst @@ -25,7 +25,7 @@ Documentation written by Tom Zanussi hist:keys=[:values=] [:sort=][:size=#entries][:pause][:continue] - [:clear][:name=histname1] [if ] + [:clear][:name=histname1][:.] [if ] When a matching event is hit, an entry is added to a hash table using the key(s) and value(s) named. Keys and values correspond to @@ -1831,21 +1831,51 @@ and looks and behaves just like any other event:: Like any other event, once a histogram is enabled for the event, the output can be displayed by reading the event's 'hist' file. -2.2.3 Hist trigger 'actions' ----------------------------- +2.2.3 Hist trigger 'handlers' and 'actions' +------------------------------------------- -A hist trigger 'action' is a function that's executed whenever a -histogram entry is added or updated. +A hist trigger 'action' is a function that's executed (in most cases +conditionally) whenever a histogram entry is added or updated. -The default 'action' if no special function is explicitly specified is -as it always has been, to simply update the set of values associated -with an entry. Some applications, however, may want to perform -additional actions at that point, such as generate another event, or -compare and save a maximum. +When a histogram entry is added or updated, a hist trigger 'handler' +is what decides whether the corresponding action is actually invoked +or not. -The following additional actions are available. To specify an action -for a given event, simply specify the action between colons in the -hist trigger specification. +Hist trigger handlers and actions are paired together in the general +form: + + . + +To specify a handler.action pair for a given event, simply specify +that handler.action pair between colons in the hist trigger +specification. + +In theory, any handler can be combined with any action, but in +practice, not every handler.action combination is currently supported; +if a given handler.action combination isn't supported, the hist +trigger will fail with -EINVAL; + +The default 'handler.action' if none is explicity specified is as it +always has been, to simply update the set of values associated with an +entry. Some applications, however, may want to perform additional +actions at that point, such as generate another event, or compare and +save a maximum. + +The supported handlers and actions are listed below, and each is +described in more detail in the following paragraphs, in the context +of descriptions of some common and useful handler.action combinations. + +The available handlers are: + + - onmatch(matching.event) - invoke action on any addition or update + - onmax(var) - invoke action if var exceeds current max + +The available actions are: + + - (param list) - generate synthetic event + - save(field,...) - save current event fields + +The following commonly-used handler.action pairs are available: - onmatch(matching.event).(param list) -- cgit v1.2.3 From fd451a3d9291815d13f3f1ce60537b7f19616b23 Mon Sep 17 00:00:00 2001 From: Tom Zanussi Date: Wed, 13 Feb 2019 17:42:47 -0600 Subject: tracing: Add hist trigger snapshot() action Documentation Add Documentation for the hist:handlerXXX($var).snapshot() action. Link: http://lkml.kernel.org/r/445861d7822cd4b6aeaea1cecfcdbda466502148.1550100284.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) --- Documentation/trace/histogram.rst | 110 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 110 insertions(+) (limited to 'Documentation') diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst index 63e522107e59..353317bc3825 100644 --- a/Documentation/trace/histogram.rst +++ b/Documentation/trace/histogram.rst @@ -1874,6 +1874,7 @@ The available actions are: - (param list) - generate synthetic event - save(field,...) - save current event fields + - snapshot() - snapshot the trace buffer The following commonly-used handler.action pairs are available: @@ -2030,6 +2031,115 @@ The following commonly-used handler.action pairs are available: Entries: 2 Dropped: 0 + - onmax(var).snapshot() + + The 'onmax(var).snapshot()' hist trigger action is invoked + whenever the value of 'var' associated with a histogram entry + exceeds the current maximum contained in that variable. + + The end result is that a global snapshot of the trace buffer will + be saved in the tracing/snapshot file if 'var' exceeds the current + maximum for any hist trigger entry. + + Note that in this case the maximum is a global maximum for the + current trace instance, which is the maximum across all buckets of + the histogram. The key of the specific trace event that caused + the global maximum and the global maximum itself are displayed, + along with a message stating that a snapshot has been taken and + where to find it. The user can use the key information displayed + to locate the corresponding bucket in the histogram for even more + detail. + + As an example the below defines a couple of hist triggers, one for + sched_waking and another for sched_switch, keyed on pid. Whenever + a sched_waking event occurs, the timestamp is saved in the entry + corresponding to the current pid, and when the scheduler switches + back to that pid, the timestamp difference is calculated. If the + resulting latency, stored in wakeup_lat, exceeds the current + maximum latency, a snapshot is taken. As part of the setup, all + the scheduler events are also enabled, which are the events that + will show up in the snapshot when it is taken at some point: + + # echo 1 > /sys/kernel/debug/tracing/events/sched/enable + + # echo 'hist:keys=pid:ts0=common_timestamp.usecs \ + if comm=="cyclictest"' >> \ + /sys/kernel/debug/tracing/events/sched/sched_waking/trigger + + # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ + onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \ + prev_comm):onmax($wakeup_lat).snapshot() \ + if next_comm=="cyclictest"' >> \ + /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + + When the histogram is displayed, for each bucket the max value + and the saved values corresponding to the max are displayed + following the rest of the fields. + + If a snaphot was taken, there is also a message indicating that, + along with the value and event that triggered the global maximum: + + # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist + { next_pid: 2101 } hitcount: 200 + max: 52 next_prio: 120 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 + + { next_pid: 2103 } hitcount: 1326 + max: 572 next_prio: 19 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/1 + + { next_pid: 2102 } hitcount: 1982 \ + max: 74 next_prio: 19 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/5 + + Snapshot taken (see tracing/snapshot). Details: + triggering value { onmax($wakeup_lat) }: 572 \ + triggered by event with key: { next_pid: 2103 } + + Totals: + Hits: 3508 + Entries: 3 + Dropped: 0 + + In the above case, the event that triggered the global maximum has + the key with next_pid == 2103. If you look at the bucket that has + 2103 as the key, you'll find the additional values save()'d along + with the local maximum for that bucket, which should be the same + as the global maximum (since that was the same value that + triggered the global snapshot). + + And finally, looking at the snapshot data should show at or near + the end the event that triggered the snapshot (in this case you + can verify the timestamps between the sched_waking and + sched_switch events, which should match the time displayed in the + global maximum): + + # cat /sys/kernel/debug/tracing/snapshot + + <...>-2103 [005] d..3 309.873125: sched_switch: prev_comm=cyclictest prev_pid=2103 prev_prio=19 prev_state=D ==> next_comm=swapper/5 next_pid=0 next_prio=120 + -0 [005] d.h3 309.873611: sched_waking: comm=cyclictest pid=2102 prio=19 target_cpu=005 + -0 [005] dNh4 309.873613: sched_wakeup: comm=cyclictest pid=2102 prio=19 target_cpu=005 + -0 [005] d..3 309.873616: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=cyclictest next_pid=2102 next_prio=19 + <...>-2102 [005] d..3 309.873625: sched_switch: prev_comm=cyclictest prev_pid=2102 prev_prio=19 prev_state=D ==> next_comm=swapper/5 next_pid=0 next_prio=120 + -0 [005] d.h3 309.874624: sched_waking: comm=cyclictest pid=2102 prio=19 target_cpu=005 + -0 [005] dNh4 309.874626: sched_wakeup: comm=cyclictest pid=2102 prio=19 target_cpu=005 + -0 [005] dNh3 309.874628: sched_waking: comm=cyclictest pid=2103 prio=19 target_cpu=005 + -0 [005] dNh4 309.874630: sched_wakeup: comm=cyclictest pid=2103 prio=19 target_cpu=005 + -0 [005] d..3 309.874633: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=cyclictest next_pid=2102 next_prio=19 + -0 [004] d.h3 309.874757: sched_waking: comm=gnome-terminal- pid=1699 prio=120 target_cpu=004 + -0 [004] dNh4 309.874762: sched_wakeup: comm=gnome-terminal- pid=1699 prio=120 target_cpu=004 + -0 [004] d..3 309.874766: sched_switch: prev_comm=swapper/4 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=gnome-terminal- next_pid=1699 next_prio=120 + gnome-terminal--1699 [004] d.h2 309.874941: sched_stat_runtime: comm=gnome-terminal- pid=1699 runtime=180706 [ns] vruntime=1126870572 [ns] + -0 [003] d.s4 309.874956: sched_waking: comm=rcu_sched pid=9 prio=120 target_cpu=007 + -0 [003] d.s5 309.874960: sched_wake_idle_without_ipi: cpu=7 + -0 [003] d.s5 309.874961: sched_wakeup: comm=rcu_sched pid=9 prio=120 target_cpu=007 + -0 [007] d..3 309.874963: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=rcu_sched next_pid=9 next_prio=120 + rcu_sched-9 [007] d..3 309.874973: sched_stat_runtime: comm=rcu_sched pid=9 runtime=13646 [ns] vruntime=22531430286 [ns] + rcu_sched-9 [007] d..3 309.874978: sched_switch: prev_comm=rcu_sched prev_pid=9 prev_prio=120 prev_state=R+ ==> next_comm=swapper/7 next_pid=0 next_prio=120 + <...>-2102 [005] d..4 309.874994: sched_migrate_task: comm=cyclictest pid=2103 prio=19 orig_cpu=5 dest_cpu=1 + <...>-2102 [005] d..4 309.875185: sched_wake_idle_without_ipi: cpu=1 + -0 [001] d..3 309.875200: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=cyclictest next_pid=2103 next_prio=19 + 3. User space creating a trigger -------------------------------- -- cgit v1.2.3 From ff0d35e2e1c7936a4938d5ce5851e0333cd9dce6 Mon Sep 17 00:00:00 2001 From: Tom Zanussi Date: Wed, 13 Feb 2019 17:42:49 -0600 Subject: tracing: Add hist trigger onchange() handler Documentation Add Documentation for the hist:onchange($var) handler. Link: http://lkml.kernel.org/r/ab54b7383b265609fda52648a8fbfbd2631a640f.1550100284.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) --- Documentation/trace/histogram.rst | 98 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 98 insertions(+) (limited to 'Documentation') diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst index 353317bc3825..79476c906b1a 100644 --- a/Documentation/trace/histogram.rst +++ b/Documentation/trace/histogram.rst @@ -1869,6 +1869,7 @@ The available handlers are: - onmatch(matching.event) - invoke action on any addition or update - onmax(var) - invoke action if var exceeds current max + - onchange(var) - invoke action if var changes The available actions are: @@ -2140,6 +2141,103 @@ The following commonly-used handler.action pairs are available: <...>-2102 [005] d..4 309.875185: sched_wake_idle_without_ipi: cpu=1 -0 [001] d..3 309.875200: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=cyclictest next_pid=2103 next_prio=19 + - onchange(var).save(field,.. .) + + The 'onchange(var).save(field,...)' hist trigger action is invoked + whenever the value of 'var' associated with a histogram entry + changes. + + The end result is that the trace event fields specified as the + onchange.save() params will be saved if 'var' changes for that + hist trigger entry. This allows context from the event that + changed the value to be saved for later reference. When the + histogram is displayed, additional fields displaying the saved + values will be printed. + + - onchange(var).snapshot() + + The 'onchange(var).snapshot()' hist trigger action is invoked + whenever the value of 'var' associated with a histogram entry + changes. + + The end result is that a global snapshot of the trace buffer will + be saved in the tracing/snapshot file if 'var' changes for any + hist trigger entry. + + Note that in this case the changed value is a global variable + associated withe current trace instance. The key of the specific + trace event that caused the value to change and the global value + itself are displayed, along with a message stating that a snapshot + has been taken and where to find it. The user can use the key + information displayed to locate the corresponding bucket in the + histogram for even more detail. + + As an example the below defines a hist trigger on the tcp_probe + event, keyed on dport. Whenever a tcp_probe event occurs, the + cwnd field is checked against the current value stored in the + $cwnd variable. If the value has changed, a snapshot is taken. + As part of the setup, all the scheduler and tcp events are also + enabled, which are the events that will show up in the snapshot + when it is taken at some point: + + # echo 1 > /sys/kernel/debug/tracing/events/sched/enable + # echo 1 > /sys/kernel/debug/tracing/events/tcp/enable + + # echo 'hist:keys=dport:cwnd=snd_cwnd: \ + onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \ + onchange($cwnd).snapshot()' >> \ + /sys/kernel/debug/tracing/events/tcp/tcp_probe/trigger + + When the histogram is displayed, for each bucket the tracked value + and the saved values corresponding to that value are displayed + following the rest of the fields. + + If a snaphot was taken, there is also a message indicating that, + along with the value and event that triggered the snapshot: + + # cat /sys/kernel/debug/tracing/events/tcp/tcp_probe/hist + { dport: 1521 } hitcount: 8 + changed: 10 snd_wnd: 35456 srtt: 154262 rcv_wnd: 42112 + + { dport: 80 } hitcount: 23 + changed: 10 snd_wnd: 28960 srtt: 19604 rcv_wnd: 29312 + + { dport: 9001 } hitcount: 172 + changed: 10 snd_wnd: 48384 srtt: 260444 rcv_wnd: 55168 + + { dport: 443 } hitcount: 211 + changed: 10 snd_wnd: 26960 srtt: 17379 rcv_wnd: 28800 + + Snapshot taken (see tracing/snapshot). Details: + triggering value { onchange($cwnd) }: 10 + triggered by event with key: { dport: 80 } + + Totals: + Hits: 414 + Entries: 4 + Dropped: 0 + + In the above case, the event that triggered the snapshot has the + key with dport == 80. If you look at the bucket that has 80 as + the key, you'll find the additional values save()'d along with the + changed value for that bucket, which should be the same as the + global changed value (since that was the same value that triggered + the global snapshot). + + And finally, looking at the snapshot data should show at or near + the end the event that triggered the snapshot: + + # cat /sys/kernel/debug/tracing/snapshot + + gnome-shell-1261 [006] dN.3 49.823113: sched_stat_runtime: comm=gnome-shell pid=1261 runtime=49347 [ns] vruntime=1835730389 [ns] + kworker/u16:4-773 [003] d..3 49.823114: sched_switch: prev_comm=kworker/u16:4 prev_pid=773 prev_prio=120 prev_state=R+ ==> next_comm=kworker/3:2 next_pid=135 next_prio=120 + gnome-shell-1261 [006] d..3 49.823114: sched_switch: prev_comm=gnome-shell prev_pid=1261 prev_prio=120 prev_state=R+ ==> next_comm=kworker/6:2 next_pid=387 next_prio=120 + kworker/3:2-135 [003] d..3 49.823118: sched_stat_runtime: comm=kworker/3:2 pid=135 runtime=5339 [ns] vruntime=17815800388 [ns] + kworker/6:2-387 [006] d..3 49.823120: sched_stat_runtime: comm=kworker/6:2 pid=387 runtime=9594 [ns] vruntime=14589605367 [ns] + kworker/6:2-387 [006] d..3 49.823122: sched_switch: prev_comm=kworker/6:2 prev_pid=387 prev_prio=120 prev_state=R+ ==> next_comm=gnome-shell next_pid=1261 next_prio=120 + kworker/3:2-135 [003] d..3 49.823123: sched_switch: prev_comm=kworker/3:2 prev_pid=135 prev_prio=120 prev_state=T ==> next_comm=swapper/3 next_pid=0 next_prio=120 + -0 [004] ..s7 49.823798: tcp_probe: src=10.0.0.10:54326 dest=23.215.104.193:80 mark=0x0 length=32 snd_nxt=0xe3ae2ff5 snd_una=0xe3ae2ecd snd_cwnd=10 ssthresh=2147483647 snd_wnd=28960 srtt=19604 rcv_wnd=29312 + 3. User space creating a trigger -------------------------------- -- cgit v1.2.3 From e91eefd731d933194940805bb1f75a4972dc607c Mon Sep 17 00:00:00 2001 From: Tom Zanussi Date: Wed, 13 Feb 2019 17:42:50 -0600 Subject: tracing: Add alternative synthetic event trace action syntax Add a 'trace(synthetic_event_name, params)' alternative to synthetic_event_name(params). Currently, the syntax used for generating synthetic events is to invoke synthetic_event_name(params) i.e. use the synthetic event name as a function call. Users requested a new form that more explicitly shows that the synthetic event is in effect being traced. In this version, a new 'trace()' keyword is used, and the synthetic event name is passed in as the first argument. In addition, for the sake of consistency with other actions, change the documention to emphasize the trace() form over the function-call form, which remains documented as equivalent. Link: http://lkml.kernel.org/r/d082773e50232a001480cf837679a1e01c1a2eb7.1550100284.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) --- Documentation/trace/histogram.rst | 54 +++++++++++++++++++++++++++------------ kernel/trace/trace.c | 2 +- kernel/trace/trace_events_hist.c | 42 +++++++++++++++++++++++++++--- 3 files changed, 76 insertions(+), 22 deletions(-) (limited to 'Documentation') diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst index 79476c906b1a..0ea59d45aef1 100644 --- a/Documentation/trace/histogram.rst +++ b/Documentation/trace/histogram.rst @@ -1873,31 +1873,45 @@ The available handlers are: The available actions are: - - (param list) - generate synthetic event + - trace(,param list) - generate synthetic event - save(field,...) - save current event fields - snapshot() - snapshot the trace buffer The following commonly-used handler.action pairs are available: - - onmatch(matching.event).(param list) + - onmatch(matching.event).trace(,param list) - The 'onmatch(matching.event).(params)' hist - trigger action is invoked whenever an event matches and the - histogram entry would be added or updated. It causes the named - synthetic event to be generated with the values given in the + The 'onmatch(matching.event).trace(,param + list)' hist trigger action is invoked whenever an event matches + and the histogram entry would be added or updated. It causes the + named synthetic event to be generated with the values given in the 'param list'. The result is the generation of a synthetic event that consists of the values contained in those variables at the - time the invoking event was hit. - - The 'param list' consists of one or more parameters which may be - either variables or fields defined on either the 'matching.event' - or the target event. The variables or fields specified in the - param list may be either fully-qualified or unqualified. If a - variable is specified as unqualified, it must be unique between - the two events. A field name used as a param can be unqualified - if it refers to the target event, but must be fully qualified if - it refers to the matching event. A fully-qualified name is of the - form 'system.event_name.$var_name' or 'system.event_name.field'. + time the invoking event was hit. For example, if the synthetic + event name is 'wakeup_latency', a wakeup_latency event is + generated using onmatch(event).trace(wakeup_latency,arg1,arg2). + + There is also an equivalent alternative form available for + generating synthetic events. In this form, the synthetic event + name is used as if it were a function name. For example, using + the 'wakeup_latency' synthetic event name again, the + wakeup_latency event would be generated by invoking it as if it + were a function call, with the event field values passed in as + arguments: onmatch(event).wakeup_latency(arg1,arg2). The syntax + for this form is: + + onmatch(matching.event).(param list) + + In either case, the 'param list' consists of one or more + parameters which may be either variables or fields defined on + either the 'matching.event' or the target event. The variables or + fields specified in the param list may be either fully-qualified + or unqualified. If a variable is specified as unqualified, it + must be unique between the two events. A field name used as a + param can be unqualified if it refers to the target event, but + must be fully qualified if it refers to the matching event. A + fully-qualified name is of the form 'system.event_name.$var_name' + or 'system.event_name.field'. The 'matching.event' specification is simply the fully qualified event name of the event that matches the target event for the @@ -1928,6 +1942,12 @@ The following commonly-used handler.action pairs are available: wakeup_new_test($testpid) if comm=="cyclictest"' >> \ /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger + Or, equivalently, using the 'trace' keyword syntax: + + # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\ + trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \ + /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger + Creating and displaying a histogram based on those events is now just a matter of using the fields and new synthetic event in the tracing/events/synthetic directory, as usual:: diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index be6779f963c6..0460cc0f28fd 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -4918,7 +4918,7 @@ static const char readme_msg[] = "\t onmax(var) - invoke if var exceeds current max\n" "\t onchange(var) - invoke action if var changes\n\n" "\t The available actions are:\n\n" - "\t (param list) - generate synthetic event\n" + "\t trace(,param list) - generate synthetic event\n" "\t save(field,...) - save current event fields\n" #ifdef CONFIG_TRACER_SNAPSHOT "\t snapshot() - snapshot the trace buffer\n" diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index 2f3323ca9d24..66386ba1425f 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -419,6 +419,8 @@ struct action_data { */ unsigned int var_ref_idx; struct synth_event *synth_event; + bool use_trace_keyword; + char *synth_event_name; union { struct { @@ -3700,6 +3702,8 @@ static void action_data_destroy(struct action_data *data) if (data->synth_event) data->synth_event->ref--; + kfree(data->synth_event_name); + kfree(data); } @@ -3781,6 +3785,7 @@ static int track_data_create(struct hist_trigger_data *hist_data, static int parse_action_params(char *params, struct action_data *data) { char *param, *saved_param; + bool first_param = true; int ret = 0; while (params) { @@ -3809,6 +3814,13 @@ static int parse_action_params(char *params, struct action_data *data) goto out; } + if (first_param && data->use_trace_keyword) { + data->synth_event_name = saved_param; + first_param = false; + continue; + } + first_param = false; + data->params[data->n_params++] = saved_param; } out: @@ -3886,6 +3898,9 @@ static int action_parse(char *str, struct action_data *data, } else { char *params = strsep(&str, ")"); + if (str_has_prefix(action_name, "trace")) + data->use_trace_keyword = true; + if (params) { ret = parse_action_params(params, data); if (ret) @@ -4088,13 +4103,19 @@ static int trace_action_create(struct hist_trigger_data *hist_data, unsigned int i, var_ref_idx; unsigned int field_pos = 0; struct synth_event *event; + char *synth_event_name; int ret = 0; lockdep_assert_held(&event_mutex); - event = find_synth_event(data->action_name); + if (data->use_trace_keyword) + synth_event_name = data->synth_event_name; + else + synth_event_name = data->action_name; + + event = find_synth_event(synth_event_name); if (!event) { - hist_err("trace action: Couldn't find synthetic event: ", data->action_name); + hist_err("trace action: Couldn't find synthetic event: ", synth_event_name); return -EINVAL; } @@ -4841,8 +4862,10 @@ static void print_action_spec(struct seq_file *m, seq_puts(m, ","); } } else if (data->action == ACTION_TRACE) { + if (data->use_trace_keyword) + seq_printf(m, "%s", data->synth_event_name); for (i = 0; i < data->n_params; i++) { - if (i) + if (i || data->use_trace_keyword) seq_puts(m, ","); seq_printf(m, "%s", data->params[i]); } @@ -4890,6 +4913,7 @@ static bool actions_match(struct hist_trigger_data *hist_data, for (i = 0; i < hist_data->n_actions; i++) { struct action_data *data = hist_data->actions[i]; struct action_data *data_test = hist_data_test->actions[i]; + char *action_name, *action_name_test; if (data->handler != data_test->handler) return false; @@ -4904,7 +4928,17 @@ static bool actions_match(struct hist_trigger_data *hist_data, return false; } - if (strcmp(data->action_name, data_test->action_name) != 0) + if (data->use_trace_keyword) + action_name = data->synth_event_name; + else + action_name = data->action_name; + + if (data_test->use_trace_keyword) + action_name_test = data_test->synth_event_name; + else + action_name_test = data_test->action_name; + + if (strcmp(action_name, action_name_test) != 0) return false; if (data->handler == HANDLER_ONMATCH) { -- cgit v1.2.3 From b689269bd1bb7b1bffb01c1d2d212e66b54f2eb2 Mon Sep 17 00:00:00 2001 From: Srikar Dronamraju Date: Tue, 19 Feb 2019 11:49:28 +0530 Subject: doc: trace: Fix documentation for uprobe_profile uprobe_profile has filename and number of probe hits information for each uprobe event. The documentation erroneously talks about probe mis-hits. Update the documentation to the correct information. Link: http://lkml.kernel.org/r/1550557168-12345-1-git-send-email-srikar@linux.vnet.ibm.com Cc: Ingo Molnar Cc: Masami Hiramatsu Reported-by: KAUSTUBH RAJENDRA WELANKAR Signed-off-by: Srikar Dronamraju Signed-off-by: Steven Rostedt (VMware) --- Documentation/trace/uprobetracer.rst | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/trace/uprobetracer.rst b/Documentation/trace/uprobetracer.rst index 4c3bfde2ba47..4346e23e3ae7 100644 --- a/Documentation/trace/uprobetracer.rst +++ b/Documentation/trace/uprobetracer.rst @@ -73,10 +73,9 @@ For $comm, the default type is "string"; any other type is invalid. Event Profiling --------------- -You can check the total number of probe hits and probe miss-hits via -/sys/kernel/debug/tracing/uprobe_profile. -The first column is event name, the second is the number of probe hits, -the third is the number of probe miss-hits. +You can check the total number of probe hits per event via +/sys/kernel/debug/tracing/uprobe_profile. The first column is the filename, +the second is the event name, the third is the number of probe hits. Usage examples -------------- -- cgit v1.2.3