Age | Commit message (Collapse) | Author | Files | Lines |
|
Mark ftrace mcount handler functions nokprobe since
probing on these functions with kretprobe pushes
return address incorrectly on kretprobe shadow stack.
Reported-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/155094062044.6137.6419622920568680640.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Merge page ref overflow branch.
Jann Horn reported that he can overflow the page ref count with
sufficient memory (and a filesystem that is intentionally extremely
slow).
Admittedly it's not exactly easy. To have more than four billion
references to a page requires a minimum of 32GB of kernel memory just
for the pointers to the pages, much less any metadata to keep track of
those pointers. Jann needed a total of 140GB of memory and a specially
crafted filesystem that leaves all reads pending (in order to not ever
free the page references and just keep adding more).
Still, we have a fairly straightforward way to limit the two obvious
user-controllable sources of page references: direct-IO like page
references gotten through get_user_pages(), and the splice pipe page
duplication. So let's just do that.
* branch page-refs:
fs: prevent page refcount overflow in pipe_buf_get
mm: prevent get_user_pages() from overflowing page refcount
mm: add 'try_get_page()' helper function
mm: make page ref count overflow check tighter and more explicit
|
|
Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount. All
callers converted to handle a failure.
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
function call syscall_get_arguments() implemented in x86 was horribly
written and not optimized for the standard case of passing in 0 and 6 for
the starting index and the number of system calls to get. When looking at
all the users of this function, I discovered that all instances pass in only
0 and 6 for these arguments. Instead of having this function handle
different cases that are never used, simply rewrite it to return the first 6
arguments of a system call.
This should help out the performance of tracing system calls by ptrace,
ftrace and perf.
Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Dave Martin <dave.martin@arm.com>
Cc: "Dmitry V. Levin" <ldv@altlinux.org>
Cc: x86@kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: nios2-dev@lists.rocketboards.org
Cc: openrisc@lists.librecores.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts
Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes
Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits
Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The only users that calls syscall_get_arguments() with a variable and not a
hard coded '6' is ftrace_syscall_enter(). syscall_get_arguments() can be
optimized by removing a variable input, and always grabbing 6 arguments
regardless of what the system call actually uses.
Change ftrace_syscall_enter() to pass the 6 args into a local stack array
and copy the necessary arguments into the trace event as needed.
This is needed to remove two parameters from syscall_get_arguments().
Link: http://lkml.kernel.org/r/20161107213233.627583542@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Changed 0 --> NULL to avoid sparse warning
Corrected spelling mistakes reported by checkpatch.pl
Sparse warning below:
sudo make C=2 CF=-D__CHECK_ENDIAN__ M=kernel/trace
CHECK kernel/trace/ftrace.c
kernel/trace/ftrace.c:3007:24: warning: Using plain integer as NULL pointer
kernel/trace/ftrace.c:4758:37: warning: Using plain integer as NULL pointer
Link: http://lkml.kernel.org/r/20190323183523.GA2244@hari-Inspiron-1545
Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Fix compile warning in create_dyn_event(): 'ret' may be used uninitialized
in this function [-Wuninitialized].
Link: http://lkml.kernel.org/r/1553237900-8555-1-git-send-email-frowand.list@gmail.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 5448d44c3855 ("tracing: Add unified dynamic event framework")
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Commit 656fe2ba85e8 (tracing: Use hist trigger's var_ref array to
destroy var_refs) centralized the destruction of all the var_refs
in one place so that other code didn't have to do it.
The track_data_destroy() added later ignored that and also destroyed
the track_data var_ref, causing a double-free error flagged by KASAN.
==================================================================
BUG: KASAN: use-after-free in destroy_hist_field+0x30/0x70
Read of size 8 at addr ffff888086df2210 by task bash/1694
CPU: 6 PID: 1694 Comm: bash Not tainted 5.1.0-rc1-test+ #15
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03
07/14/2016
Call Trace:
dump_stack+0x71/0xa0
? destroy_hist_field+0x30/0x70
print_address_description.cold.3+0x9/0x1fb
? destroy_hist_field+0x30/0x70
? destroy_hist_field+0x30/0x70
kasan_report.cold.4+0x1a/0x33
? __kasan_slab_free+0x100/0x150
? destroy_hist_field+0x30/0x70
destroy_hist_field+0x30/0x70
track_data_destroy+0x55/0xe0
destroy_hist_data+0x1f0/0x350
hist_unreg_all+0x203/0x220
event_trigger_open+0xbb/0x130
do_dentry_open+0x296/0x700
? stacktrace_count_trigger+0x30/0x30
? generic_permission+0x56/0x200
? __x64_sys_fchdir+0xd0/0xd0
? inode_permission+0x55/0x200
? security_inode_permission+0x18/0x60
path_openat+0x633/0x22b0
? path_lookupat.isra.50+0x420/0x420
? __kasan_kmalloc.constprop.12+0xc1/0xd0
? kmem_cache_alloc+0xe5/0x260
? getname_flags+0x6c/0x2a0
? do_sys_open+0x149/0x2b0
? do_syscall_64+0x73/0x1b0
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
? _raw_write_lock_bh+0xe0/0xe0
? __kernel_text_address+0xe/0x30
? unwind_get_return_address+0x2f/0x50
? __list_add_valid+0x2d/0x70
? deactivate_slab.isra.62+0x1f4/0x5a0
? getname_flags+0x6c/0x2a0
? set_track+0x76/0x120
do_filp_open+0x11a/0x1a0
? may_open_dev+0x50/0x50
? _raw_spin_lock+0x7a/0xd0
? _raw_write_lock_bh+0xe0/0xe0
? __alloc_fd+0x10f/0x200
do_sys_open+0x1db/0x2b0
? filp_open+0x50/0x50
do_syscall_64+0x73/0x1b0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fa7b24a4ca2
Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 85 7a 0d 00 8b 00 85 c0
75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff
0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007fffbafb3af0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 000055d3648ade30 RCX: 00007fa7b24a4ca2
RDX: 0000000000000241 RSI: 000055d364a55240 RDI: 00000000ffffff9c
RBP: 00007fffbafb3bf0 R08: 0000000000000020 R09: 0000000000000002
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000003 R14: 0000000000000001 R15: 000055d364a55240
==================================================================
So remove the track_data_destroy() destroy_hist_field() call for that
var_ref.
Link: http://lkml.kernel.org/r/1deffec420f6a16d11dd8647318d34a66d1989a9.camel@linux.intel.com
Fixes: 466f4528fbc69 ("tracing: Generalize hist trigger onmax and save action")
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Pull more block layer changes from Jens Axboe:
"This is a collection of both stragglers, and fixes that came in after
I finalized the initial pull. This contains:
- An MD pull request from Song, with a few minor fixes
- Set of NVMe patches via Christoph
- Pull request from Konrad, with a few fixes for xen/blkback
- pblk fix IO calculation fix (Javier)
- Segment calculation fix for pass-through (Ming)
- Fallthrough annotation for blkcg (Mathieu)"
* tag 'for-5.1/block-post-20190315' of git://git.kernel.dk/linux-block: (25 commits)
blkcg: annotate implicit fall through
nvme-tcp: support C2HData with SUCCESS flag
nvmet: ignore EOPNOTSUPP for discard
nvme: add proper write zeroes setup for the multipath device
nvme: add proper discard setup for the multipath device
nvme: remove nvme_ns_config_oncs
nvme: disable Write Zeroes for qemu controllers
nvmet-fc: bring Disconnect into compliance with FC-NVME spec
nvmet-fc: fix issues with targetport assoc_list list walking
nvme-fc: reject reconnect if io queue count is reduced to zero
nvme-fc: fix numa_node when dev is null
nvme-fc: use nr_phys_segments to determine existence of sgl
nvme-loop: init nvmet_ctrl fatal_err_work when allocate
nvme: update comment to make the code easier to read
nvme: put ns_head ref if namespace fails allocation
nvme-trace: fix cdw10 buffer overrun
nvme: don't warn on block content change effects
nvme: add get-feature to admin cmds tracer
md: Fix failed allocation of md_register_thread
It's wrong to add len to sector_nr in raid10 reshape twice
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes and cleanups from Steven Rostedt:
"This contains a series of last minute clean ups, small fixes and error
checks"
* tag 'trace-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/probe: Verify alloc_trace_*probe() result
tracing/probe: Check event/group naming rule at parsing
tracing/probe: Check the size of argument name and body
tracing/probe: Check event name length correctly
tracing/probe: Check maxactive error cases
tracing: kdb: Fix ftdump to not sleep
trace/probes: Remove kernel doc style from non kernel doc comment
tracing/probes: Make reserved_field_names static
|
|
Since alloc_trace_*probe() returns -EINVAL only if !event && !group,
it should not happen in trace_*probe_create(). If we catch that case
there is a bug. So use WARN_ON_ONCE() instead of pr_info().
Link: http://lkml.kernel.org/r/155253785078.14922.16902223633734601469.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Check event and group naming rule at parsing it instead
of allocating probes.
Link: http://lkml.kernel.org/r/155253784064.14922.2336893061156236237.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Check the size of argument name and expression is not 0
and smaller than maximum length.
Link: http://lkml.kernel.org/r/155253783029.14922.12650939303827581096.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Ensure given name of event is not too long when parsing it,
and fix to update event name offset correctly when the group
name is given. For example, this makes probe event to check
the "p:foo/" error case correctly.
Link: http://lkml.kernel.org/r/155253782046.14922.14724124823730168629.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Check maxactive on kprobe error case, because maxactive
is only for kretprobe, not for kprobe. Also, maxactive
should not be 0, it should be at least 1.
Link: http://lkml.kernel.org/r/155253780952.14922.15784129810238750331.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
There is a plan to build the kernel with -Wimplicit-fallthrough and
this place in the code produced a warning (W=1).
This commit remove the following warning:
kernel/trace/blktrace.c:725:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
As reported back in 2016-11 [1], the "ftdump" kdb command triggers a
BUG for "sleeping function called from invalid context".
kdb's "ftdump" command wants to call ring_buffer_read_prepare() in
atomic context. A very simple solution for this is to add allocation
flags to ring_buffer_read_prepare() so kdb can call it without
triggering the allocation error. This patch does that.
Note that in the original email thread about this, it was suggested
that perhaps the solution for kdb was to either preallocate the buffer
ahead of time or create our own iterator. I'm hoping that this
alternative of adding allocation flags to ring_buffer_read_prepare()
can be considered since it means I don't need to duplicate more of the
core trace code into "trace_kdb.c" (for either creating my own
iterator or re-preparing a ring allocator whose memory was already
allocated).
NOTE: another option for kdb is to actually figure out how to make it
reuse the existing ftrace_dump() function and totally eliminate the
duplication. This sounds very appealing and actually works (the "sr
z" command can be seen to properly dump the ftrace buffer). The
downside here is that ftrace_dump() fully consumes the trace buffer.
Unless that is changed I'd rather not use it because it means "ftdump
| grep xyz" won't be very useful to search the ftrace buffer since it
will throw away the whole trace on the first grep. A future patch to
dump only the last few lines of the buffer will also be hard to
implement.
[1] https://lkml.kernel.org/r/20161117191605.GA21459@google.com
Link: http://lkml.kernel.org/r/20190308193205.213659-1-dianders@chromium.org
Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
"Assorted fixes (really no common topic here)"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: Make __vfs_write() static
vfs: fix preadv64v2 and pwritev64v2 compat syscalls with offset == -1
pipe: stop using ->can_merge
splice: don't merge into linked buffers
fs: move generic stat response attr handling to vfs_getattr_nosec
orangefs: don't reinitialize result_mask in ->getattr
fs/devpts: always delete dcache dentry-s in dput()
|
|
CC kernel/trace/trace_kprobe.o
kernel/trace/trace_kprobe.c:41: warning: cannot understand function prototype: 'struct trace_kprobe '
The real problem is that a comment looked like kerneldoc when it shouldn't be...
Link: http://lkml.kernel.org/r/2812.1552381112@turing-police
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
sparse complains:
CHECK kernel/trace/trace_probe.c
kernel/trace/trace_probe.c:16:12: warning: symbol 'reserved_field_names' was not declared. Should it be static?
Yes, it should be static.
Link: http://lkml.kernel.org/r/2478.1552380778@turing-police
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"The biggest change for this release is in the histogram code:
- Add "onchange(var)" histogram handler that executes a action when
$var changes.
- Add new "snapshot()" action for histogram handlers, that causes a
snapshot of the ring buffer when triggered. ie.
onchange(var).snapshot() will trigger a snapshot if var changes.
- Add alternative for "trace()" action. Currently, to trigger a
synthetic event, the name of that event is used as the handler
name, which is inconsistent with the other actions.
onchange(var).synthetic(param) where it can now be
onchange(var).trace(synthetic, param). The older method will still
be allowed, as long as the synthetic events do not overlap with
other handler names.
- The histogram documentation at testcases were updated for the new
changes.
Outside of the histogram code, we have:
- Added a quicker way to enable set_ftrace_filter files, that will
make it much quicker to bisect tracing a function that shouldn't be
traced and crashes the kernel. (You can echo in numbers to
set_ftrace_filter, and it will select the corresponding function
that is in available_filter_functions).
- Some better displaying of the tracing data (and more information
was added).
The rest are small fixes and more clean ups to the code"
* tag 'trace-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (37 commits)
tracing: Use strncpy instead of memcpy when copying comm in trace.c
tracing: Use strncpy instead of memcpy when copying comm for hist triggers
tracing: Use strncpy instead of memcpy for string keys in hist triggers
tracing: Use str_has_prefix() in synth_event_create()
x86/ftrace: Fix warning and considate ftrace_jmp_replace() and ftrace_call_replace()
tracing/perf: Use strndup_user() instead of buggy open-coded version
doc: trace: Fix documentation for uprobe_profile
tracing: Fix spelling mistake: "analagous" -> "analogous"
tracing: Comment why cond_snapshot is checked outside of max_lock protection
tracing: Add hist trigger action 'expected fail' test case
tracing: Add alternative synthetic event trace action test case
tracing: Add hist trigger onchange() handler test case
tracing: Add hist trigger snapshot() action test case
tracing: Add SPDX license GPL-2.0 license identifier to inter-event testcases
tracing: Add alternative synthetic event trace action syntax
tracing: Add hist trigger onchange() handler Documentation
tracing: Add hist trigger onchange() handler
tracing: Add hist trigger snapshot() action Documentation
tracing: Add hist trigger snapshot() action
tracing: Add conditional snapshot
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- do not generate unneeded top-level built-in.a
- let git ignore O= directory entirely
- optimize scripts/kallsyms slightly
- exclude DWARF info from *.s regardless of config options
- fix GCC toolchain search path for Clang to prepare ld.lld support
- do not generate modules.order when CONFIG_MODULES is disabled
- simplify single target rules and remove VPATH for external module
build
- allow to add optional flags to dpkg-buildpackage when building
deb-pkg
- move some compiler option tests from Makefile to Kconfig
- various Makefile cleanups
* tag 'kbuild-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (40 commits)
kbuild: remove scripts/basic/% build target
kbuild: use -Werror=implicit-... instead of -Werror-implicit-...
kbuild: clean up scripts/gcc-version.sh
kbuild: remove cc-version macro
kbuild: update comment block of scripts/clang-version.sh
kbuild: remove commented-out INITRD_COMPRESS
kbuild: move -gsplit-dwarf, -gdwarf-4 option tests to Kconfig
kbuild: [bin]deb-pkg: add DPKG_FLAGS variable
kbuild: move ".config not found!" message from Kconfig to Makefile
kbuild: invoke syncconfig if include/config/auto.conf.cmd is missing
kbuild: simplify single target rules
kbuild: remove empty rules for makefiles
kbuild: make -r/-R effective in top Makefile for old Make versions
kbuild: move tools_silent to a more relevant place
kbuild: compute false-positive -Wmaybe-uninitialized cases in Kconfig
kbuild: refactor cc-cross-prefix implementation
kbuild: hardcode genksyms path and remove GENKSYMS variable
scripts/gdb: refactor rules for symlink creation
kbuild: create symlink to vmlinux-gdb.py in scripts_gdb target
scripts/gdb: do not descend into scripts/gdb from scripts
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix/cleanup from Steven Rostedt:
"This is a "pre-pull". It's only one small fix and one small clean up.
I'm testing a few small patches for my real pull request which will
come at a later time. The second patch depends on your tree anyway so
I included it along with the urgent fix.
A small fix Pavel sent me back in august was accidentally lost due to
it being placed with some other patches that failed some tests, and
was rebased out of my local tree. Which was a regression that caused
event filters not to handle negative numbers.
The clean up is from Masami that realized that the code in kprobes
that calls probe_mem_read() wrapper, which is to be used in code used
by both kprobes and uprobes, was only in code for kprobes. It should
not use the wrapper there, but instead call probe_kernel_read()
directly"
* tag 'trace-v5.0-pre' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Use probe_kernel_read instead of probe_mem_read
tracing: Fix event filters and triggers to handle negative numbers
|
|
Because there may be random garbage beyond a string's null terminator,
code that might use the entire comm array e.g. histogram keys, can
give unexpected results if that garbage is copied in too, so avoid
that possibility by using strncpy instead of memcpy.
Link: http://lkml.kernel.org/r/1d6ebac26570c2a29ce9fb575379f17ef5c8b81b.1551802084.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Because there may be random garbage beyond a string's null terminator,
code that might use the entire comm array e.g. histogram keys, can
give unexpected results if that garbage is copied in too, so avoid
that possibility by using strncpy instead of memcpy.
Link: http://lkml.kernel.org/r/1eb9f096a8086c3c82c7fc087c900005143cec54.1551802084.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Because there may be random garbage beyond a string's null terminator,
it's not correct to copy the the complete character array for use as a
hist trigger key. This results in multiple histogram entries for the
'same' string key.
So, in the case of a string key, use strncpy instead of memcpy to
avoid copying in the extra bytes.
Before, using the gdbus entries in the following hist trigger as an
example:
# echo 'hist:key=comm' > /sys/kernel/debug/tracing/events/sched/sched_waking/trigger
# cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist
...
{ comm: ImgDecoder #4 } hitcount: 203
{ comm: gmain } hitcount: 213
{ comm: gmain } hitcount: 216
{ comm: StreamTrans #73 } hitcount: 221
{ comm: mozStorage #3 } hitcount: 230
{ comm: gdbus } hitcount: 233
{ comm: StyleThread#5 } hitcount: 253
{ comm: gdbus } hitcount: 256
{ comm: gdbus } hitcount: 260
{ comm: StyleThread#4 } hitcount: 271
...
# cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist | egrep gdbus | wc -l
51
After:
# cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist | egrep gdbus | wc -l
1
Link: http://lkml.kernel.org/r/50c35ae1267d64eee975b8125e151e600071d4dc.1549309756.git.tom.zanussi@linux.intel.com
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 79e577cbce4c4 ("tracing: Support string type key properly")
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Since we now have a str_has_prefix() that returns the length, we can
use that instead of explicitly calculating it.
Link: http://lkml.kernel.org/r/03418373fd1e80030e7394b8e3e081c5de28a710.1549309756.git.tom.zanussi@linux.intel.com
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Use probe_kernel_read() instead of probe_mem_read() because
probe_mem_read() is a kind of wrapper for switching memory
read function between uprobes and kprobes.
Link: http://lkml.kernel.org/r/20190222011643.3e19ade84a3db3e83518648f@kernel.org
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Then tracing syscall exit event it is extremely useful to filter exit
codes equal to some negative value, to react only to required errors.
But negative numbers does not work:
[root@snorch sys_exit_read]# echo "ret == -1" > filter
bash: echo: write error: Invalid argument
[root@snorch sys_exit_read]# cat filter
ret == -1
^
parse_error: Invalid value (did you forget quotes)?
Similar thing happens when setting triggers.
These is a regression in v4.17 introduced by the commit mentioned below,
testing without these commit shows no problem with negative numbers.
Link: http://lkml.kernel.org/r/20180823102534.7642-1-ptikhomirov@virtuozzo.com
Cc: stable@vger.kernel.org
Fixes: 80765597bc58 ("tracing: Rewrite filter logic to be simpler and faster")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Since -Wmaybe-uninitialized was introduced by GCC 4.7, we have patched
various false positives:
- commit e74fc973b6e5 ("Turn off -Wmaybe-uninitialized when building
with -Os") turned off this option for -Os.
- commit 815eb71e7149 ("Kbuild: disable 'maybe-uninitialized' warning
for CONFIG_PROFILE_ALL_BRANCHES") turned off this option for
CONFIG_PROFILE_ALL_BRANCHES
- commit a76bcf557ef4 ("Kbuild: enable -Wmaybe-uninitialized warning
for "make W=1"") turned off this option for GCC < 4.9
Arnd provided more explanation in https://lkml.org/lkml/2017/3/14/903
I think this looks better by shifting the logic from Makefile to Kconfig.
Link: https://github.com/ClangBuiltLinux/linux/issues/350
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
|
|
The first version of this method was missing the check for
`ret == PATH_MAX`; then such a check was added, but it didn't call kfree()
on error, so there was still a small memory leak in the error case.
Fix it by using strndup_user() instead of open-coding it.
Link: http://lkml.kernel.org/r/20190220165443.152385-1-jannh@google.com
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 0eadcc7a7bc0 ("perf/core: Fix perf_uprobe_init()")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
There is a spelling mistake in the mini-howto help text. Fix it.
Link: http://lkml.kernel.org/r/20190217223222.16479-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Before setting tr->cond_snapshot, it must be NULL before it can be updated.
It can go to NULL when a trace event hist trigger is created or removed, and
can only be modified under the max_lock spin lock. But because it can only
be set to something other than NULL under both the max_lock spin lock as
well as the trace_types_lock, we can perform the check if it is not NULL
only under the trace_types_lock and fail out without having to grab the
max_lock spin lock.
This is very subtle, and deserves a comment.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Add a 'trace(synthetic_event_name, params)' alternative to
synthetic_event_name(params).
Currently, the syntax used for generating synthetic events is to
invoke synthetic_event_name(params) i.e. use the synthetic event name
as a function call.
Users requested a new form that more explicitly shows that the
synthetic event is in effect being traced. In this version, a new
'trace()' keyword is used, and the synthetic event name is passed in
as the first argument.
In addition, for the sake of consistency with other actions, change
the documention to emphasize the trace() form over the function-call
form, which remains documented as equivalent.
Link: http://lkml.kernel.org/r/d082773e50232a001480cf837679a1e01c1a2eb7.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Add support for a hist:onchange($var) handler, similar to the onmax()
handler but triggering whenever there's any change in $var, not just a
max.
Link: http://lkml.kernel.org/r/dfbc7e4ada242603e9ec3f049b5ad076a07dfd03.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Add support for hist:handlerXXX($var).snapshot(), which will take a
snapshot of the current trace buffer whenever handlerXXX is hit.
As a first user, this also adds snapshot() action support for the
onmax() handler i.e. hist:onmax($var).snapshot().
Also, the hist trigger key printing is moved into a separate function
so the snapshot() action can print a histogram key outside the
histogram display - add and use hist_trigger_print_key() for that
purpose.
Link: http://lkml.kernel.org/r/2f1a952c0dcd8aca8702ce81269581a692396d45.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Currently, tracing snapshots are context-free - they capture the ring
buffer contents at the time the tracing_snapshot() function was
invoked, and nothing else. Additionally, they're always taken
unconditionally - the calling code can decide whether or not to take a
snapshot, but the data used to make that decision is kept separately
from the snapshot itself.
This change adds the ability to associate with each trace instance
some user data, along with an 'update' function that can use that data
to determine whether or not to actually take a snapshot. The update
function can then update that data along with any other state (as part
of the data presumably), if warranted.
Because snapshots are 'global' per-instance, only one user can enable
and use a conditional snapshot for any given trace instance. To
enable a conditional snapshot (see details in the function and data
structure comments), the user calls tracing_snapshot_cond_enable().
Similarly, to disable a conditional snapshot and free it up for other
users, tracing_snapshot_cond_disable() should be called.
To actually initiate a conditional snapshot, tracing_snapshot_cond()
should be called. tracing_snapshot_cond() will invoke the update()
callback, allowing the user to decide whether or not to actually take
the snapshot and update the user-defined data associated with the
snapshot. If the callback returns 'true', tracing_snapshot_cond()
will then actually take the snapshot and return.
This scheme allows for flexibility in snapshot implementations - for
example, by implementing slightly different update() callbacks,
snapshots can be taken in situations where the user is only interested
in taking a snapshot when a new maximum in hit versus when a value
changes in any way at all. Future patches will demonstrate both
cases.
Link: http://lkml.kernel.org/r/1bea07828d5fd6864a585f83b1eed47ce097eb45.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The action refactor code allowed actions and handlers to be separated,
but the existing onmax handler and save action code is still not
flexible enough to handle arbitrary coupling. This change generalizes
them and in the process makes additional handlers and actions easier
to implement.
The onmax action can be broken up and thought of as two separate
components - a variable to be tracked (the parameter given to the
onmax($var_to_track) function) and an invisible variable created to
save the ongoing result of doing something with that variable, such as
saving the max value of that variable so far seen.
Separating it out like this and renaming it appropriately allows us to
use the same code for similar tracking functions such as
onchange($var_to_track), which would just track the last value seen
rather than the max seen so far, which is useful in some situations.
Additionally, because different handlers and actions may want to save
and access data differently e.g. save and retrieve tracking values as
local variables vs something more global, save_val() and get_val()
interface functions are introduced and max-specific implementations
are used instead.
The same goes for the code that checks whether a maximum has been hit
- a generic check_val() interface and max-checking implementation is
used instead, which allows future patches to make use of he same code
using their own implemetations of similar functionality.
Link: http://lkml.kernel.org/r/980ea73dd8e3f36db3d646f99652f8fed42b77d4.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Currently, the onmatch action data binds the onmatch action to data
related to synthetic event generation. Since we want to allow the
onmatch handler to potentially invoke a different action, and because
we expect other handlers to generate synthetic events, we need to
separate the data related to these two functions.
Also rename the onmatch data to something more descriptive, and create
and use common action data destroy function.
Link: http://lkml.kernel.org/r/b9abbf9aae69fe3920cdc8ddbcaad544dd258d78.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The hist trigger action code currently implements two essentially
hard-coded pairs of 'actions' - onmax(), which tracks a variable and
saves some event fields when a max is hit, and onmatch(), which is
hard-coded to generate a synthetic event.
These hardcoded pairs (track max/save fields and detect match/generate
synthetic event) should really be decoupled into separate components
that can then be arbitrarily combined. The first component of each
pair (track max/detect match) is called a 'handler' in the new code,
while the second component (save fields/generate synthetic event) is
called an 'action' in this scheme.
This change refactors the action code to reflect this split by adding
two handlers, HANDLER_ONMATCH and HANDLER_ONMAX, along with two
actions, ACTION_SAVE and ACTION_TRACE.
The new code combines them to produce the existing ONMATCH/TRACE and
ONMAX/SAVE functionality, but doesn't implement the other combinations
now possible. Future patches will expand these to further useful
cases, such as ONMAX/TRACE, as well as add additional handlers and
actions such as ONCHANGE and SNAPSHOT.
Also, add abbreviated documentation for handlers and actions to
README.
Link: http://lkml.kernel.org/r/98bfdd48c1b4ff29fc5766442f99f5bc3c34b76b.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Commit d716ff71dd12 ("tracing: Remove taking of trace_types_lock in
pipe files") use the current tracer instead of the copy in
tracing_open_pipe(), but it forget to remove the freeing sentence in
the error path.
There's an error path that can call kfree(iter->trace) after the iter->trace
was assigned to tr->current_trace, which would be bad to free.
Link: http://lkml.kernel.org/r/1550060946-45984-1-git-send-email-yi.zhang@huawei.com
Cc: stable@vger.kernel.org
Fixes: d716ff71dd12 ("tracing: Remove taking of trace_types_lock in pipe files")
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Two more tracing fixes
- Have kprobes not use copy_from_user() to access kernel addresses,
because kprobes can legitimately poke at bad kernel memory, which
will fault. Copy from user code should never fault in kernel space.
Using probe_mem_read() can handle kernel address space faulting.
- Put back the entries counter in the tracing output that was
accidentally removed"
* tag 'trace-v5.0-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix number of entries in trace header
kprobe: Do not use uaccess functions to access kernel memory that can fault
|
|
Enabling of large number of functions by echoing in a large subset of the
functions in available_filter_functions can take a very long time. The
process requires testing all functions registered by the function tracer
(which is in the 10s of thousands), and doing a kallsyms lookup to convert
the ip address into a name, then comparing that name with the string passed
in.
When a function causes the function tracer to crash the system, a binary
bisect of the available_filter_functions can be done to find the culprit.
But this requires passing in half of the functions in
available_filter_functions over and over again, which makes it basically a
O(n^2) operation. With 40,000 functions, that ends up bing 1,600,000,000
opertions! And enabling this can take over 20 minutes.
As a quick speed up, if a number is passed into one of the filter files,
instead of doing a search, it just enables the function at the corresponding
line of the available_filter_functions file. That is:
# echo 50 > set_ftrace_filter
# cat set_ftrace_filter
x86_pmu_commit_txn
# head -50 available_filter_functions | tail -1
x86_pmu_commit_txn
This allows setting of half the available_filter_functions to take place in
less than a second!
# time seq 20000 > set_ftrace_filter
real 0m0.042s
user 0m0.005s
sys 0m0.015s
# wc -l set_ftrace_filter
20000 set_ftrace_filter
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The following commit
441dae8f2f29 ("tracing: Add support for display of tgid in trace output")
removed the call to print_event_info() from print_func_help_header_irq()
which results in the ftrace header not reporting the number of entries
written in the buffer. As this wasn't the original intent of the patch,
re-introduce the call to print_event_info() to restore the orginal
behaviour.
Link: http://lkml.kernel.org/r/20190214152950.4179-1-quentin.perret@arm.com
Acked-by: Joel Fernandes <joelaf@google.com>
Cc: stable@vger.kernel.org
Fixes: 441dae8f2f29 ("tracing: Add support for display of tgid in trace output")
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The userspace can ask kprobe to intercept strings at any memory address,
including invalid kernel address. In this case, fetch_store_strlen()
would crash since it uses general usercopy function, and user access
functions are no longer allowed to access kernel memory.
For example, we can crash the kernel by doing something as below:
$ sudo kprobe 'p:do_sys_open +0(+0(%si)):string'
[ 103.620391] BUG: GPF in non-whitelisted uaccess (non-canonical address?)
[ 103.622104] general protection fault: 0000 [#1] SMP PTI
[ 103.623424] CPU: 10 PID: 1046 Comm: cat Not tainted 5.0.0-rc3-00130-gd73aba1-dirty #96
[ 103.625321] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-2-g628b2e6-dirty-20190104_103505-linux 04/01/2014
[ 103.628284] RIP: 0010:process_fetch_insn+0x1ab/0x4b0
[ 103.629518] Code: 10 83 80 28 2e 00 00 01 31 d2 31 ff 48 8b 74 24 28 eb 0c 81 fa ff 0f 00 00 7f 1c 85 c0 75 18 66 66 90 0f ae e8 48 63
ca 89 f8 <8a> 0c 31 66 66 90 83 c2 01 84 c9 75 dc 89 54 24 34 89 44 24 28 48
[ 103.634032] RSP: 0018:ffff88845eb37ce0 EFLAGS: 00010246
[ 103.635312] RAX: 0000000000000000 RBX: ffff888456c4e5a8 RCX: 0000000000000000
[ 103.637057] RDX: 0000000000000000 RSI: 2e646c2f6374652f RDI: 0000000000000000
[ 103.638795] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 103.640556] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 103.642297] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 103.644040] FS: 0000000000000000(0000) GS:ffff88846f000000(0000) knlGS:0000000000000000
[ 103.646019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 103.647436] CR2: 00007ffc79758038 CR3: 0000000463360006 CR4: 0000000000020ee0
[ 103.649147] Call Trace:
[ 103.649781] ? sched_clock_cpu+0xc/0xa0
[ 103.650747] ? do_sys_open+0x5/0x220
[ 103.651635] kprobe_trace_func+0x303/0x380
[ 103.652645] ? do_sys_open+0x5/0x220
[ 103.653528] kprobe_dispatcher+0x45/0x50
[ 103.654682] ? do_sys_open+0x1/0x220
[ 103.655875] kprobe_ftrace_handler+0x90/0xf0
[ 103.657282] ftrace_ops_assist_func+0x54/0xf0
[ 103.658564] ? __call_rcu+0x1dc/0x280
[ 103.659482] 0xffffffffc00000bf
[ 103.660384] ? __ia32_sys_open+0x20/0x20
[ 103.661682] ? do_sys_open+0x1/0x220
[ 103.662863] do_sys_open+0x5/0x220
[ 103.663988] do_syscall_64+0x60/0x210
[ 103.665201] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 103.666862] RIP: 0033:0x7fc22fadccdd
[ 103.668034] Code: 48 89 54 24 e0 41 83 e2 40 75 32 89 f0 25 00 00 41 00 3d 00 00 41 00 74 24 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff
ff 0f 05 <48> 3d 00 f0 ff ff 77 33 f3 c3 66 0f 1f 84 00 00 00 00 00 48 8d 44
[ 103.674029] RSP: 002b:00007ffc7972c3a8 EFLAGS: 00000287 ORIG_RAX: 0000000000000101
[ 103.676512] RAX: ffffffffffffffda RBX: 0000562f86147a21 RCX: 00007fc22fadccdd
[ 103.678853] RDX: 0000000000080000 RSI: 00007fc22fae1428 RDI: 00000000ffffff9c
[ 103.681151] RBP: ffffffffffffffff R08: 0000000000000000 R09: 0000000000000000
[ 103.683489] R10: 0000000000000000 R11: 0000000000000287 R12: 00007fc22fce90a8
[ 103.685774] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[ 103.688056] Modules linked in:
[ 103.689131] ---[ end trace 43792035c28984a1 ]---
This can be fixed by using probe_mem_read() instead, as it can handle faulting
kernel memory addresses, which kprobes can legitimately do.
Link: http://lkml.kernel.org/r/20190125151051.7381-1-changbin.du@gmail.com
Cc: stable@vger.kernel.org
Fixes: 9da3f2b7405 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"This fixes kprobes/uprobes dynamic processing of strings, where it
processes the args but does not update the remaining length of the
buffer that the string arguments will be placed in. It constantly
passes in the total size of buffer used instead of passing in the
remaining size of the buffer used.
This could cause issues if the strings are larger than the max size of
an event which could cause the strings to be written beyond what was
reserved on the buffer"
* tag 'trace-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: probeevent: Correctly update remaining space in dynamic area
|
|
Since kprobes breakpoint handling involves hardirq tracer,
probing these functions cause breakpoint recursion problem.
Prohibit probing on those functions.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrea Righi <righi.andrea@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/154998802073.31052.17255044712514564153.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Commit 9178412ddf5a ("tracing: probeevent: Return consumed
bytes of dynamic area") improved the string fetching
mechanism by returning the number of required bytes after
copying the argument to the dynamic area. However, this
return value is now only used to increment the pointer
inside the dynamic area but misses updating the 'maxlen'
variable which indicates the remaining space in the dynamic
area.
This means that fetch_store_string() always reads the *total*
size of the dynamic area from the data_loc pointer instead of
the *remaining* size (and passes it along to
strncpy_from_{user,unsafe}) even if we're already about to
copy data into the middle of the dynamic area.
Link: http://lkml.kernel.org/r/20190206190013.16405-1-andreas.ziegler@fau.de
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 9178412ddf5a ("tracing: probeevent: Return consumed bytes of dynamic area")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Andreas Ziegler <andreas.ziegler@fau.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Here is an example for this change.
$ sudo perf record -e 'ftrace:function' --filter='ip==schedule'
$ sudo perf report
The output of perf before this patch:
\# Samples: 100 of event 'ftrace:function'
\# Event count (approx.): 100
\#
\# Overhead Trace output
\# ........ ......................................
\#
51.00% ffffffff81f6aaa0 <-- ffffffff81158e8d
29.00% ffffffff81f6aaa0 <-- ffffffff8116ccb2
8.00% ffffffff81f6aaa0 <-- ffffffff81f6f2ed
4.00% ffffffff81f6aaa0 <-- ffffffff811628db
4.00% ffffffff81f6aaa0 <-- ffffffff81f6ec5b
2.00% ffffffff81f6aaa0 <-- ffffffff81f6f21a
1.00% ffffffff81f6aaa0 <-- ffffffff811b04af
1.00% ffffffff81f6aaa0 <-- ffffffff8143ce17
After this patch:
\# Samples: 36 of event 'ftrace:function'
\# Event count (approx.): 36
\#
\# Overhead Trace output
\# ........ ............................................
\#
38.89% schedule <-- schedule_hrtimeout_range_clock
27.78% schedule <-- worker_thread
13.89% schedule <-- schedule_timeout
11.11% schedule <-- smpboot_thread_fn
5.56% schedule <-- rcu_gp_kthread
2.78% schedule <-- exit_to_usermode_loop
Link: http://lkml.kernel.org/r/20190209161919.32350-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|