summaryrefslogtreecommitdiffstats
path: root/arch/arm64/kernel/ftrace.c
AgeCommit message (Collapse)AuthorFilesLines
2021-11-16arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTRMark Rutland1-3/+3
When CONFIG_FUNCTION_GRAPH_TRACER is selected and the function graph tracer is in use, unwind_frame() may erroneously associate a traced function with an incorrect return address. This can happen when starting an unwind from a pt_regs, or when unwinding across an exception boundary. This can be seen when recording with perf while the function graph tracer is in use. For example: | # echo function_graph > /sys/kernel/debug/tracing/current_tracer | # perf record -g -e raw_syscalls:sys_enter:k /bin/true | # perf report ... reports the callchain erroneously as: | el0t_64_sync | el0t_64_sync_handler | el0_svc_common.constprop.0 | perf_callchain | get_perf_callchain | syscall_trace_enter | syscall_trace_enter ... whereas when the function graph tracer is not in use, it reports: | el0t_64_sync | el0t_64_sync_handler | el0_svc | do_el0_svc | el0_svc_common.constprop.0 | syscall_trace_enter | syscall_trace_enter The underlying problem is that ftrace_graph_get_ret_stack() takes an index offset from the most recent entry added to the fgraph return stack. We start an unwind at offset 0, and increment the offset each time we encounter a rewritten return address (i.e. when we see `return_to_handler`). This is broken in two cases: 1) Between creating a pt_regs and starting the unwind, function calls may place entries on the stack, leaving an arbitrary offset which we can only determine by performing a full unwind from the caller of the unwind code (and relying on none of the unwind code being instrumented). This can result in erroneous entries being reported in a backtrace recorded by perf or kfence when the function graph tracer is in use. Currently show_regs() is unaffected as dump_backtrace() performs an initial unwind. 2) When unwinding across an exception boundary (whether continuing an unwind or starting a new unwind from regs), we currently always skip the LR of the interrupted context. Where this was live and contained a rewritten address, we won't consume the corresponding fgraph ret stack entry, leaving subsequent entries off-by-one. This can result in erroneous entries being reported in a backtrace performed by any in-kernel unwinder when that backtrace crosses an exception boundary, with entries after the boundary being reported incorrectly. This includes perf, kfence, show_regs(), panic(), etc. To fix this, we need to be able to uniquely identify each rewritten return address such that we can map this back to the original return address. We can use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR to associate each rewritten return address with a unique location on the stack. As the return address is passed in the LR (and so is not guaranteed a unique location in memory), we use the FP upon entry to the function (i.e. the address of the caller's frame record) as the return address pointer. Any nested call will have a different FP value as the caller must create its own frame record and update FP to point to this. Since ftrace_graph_ret_addr() requires the return address with the PAC stripped, the stripping of the PAC is moved before the fixup of the rewritten address. As we would unconditionally strip the PAC, moving this earlier is not harmful, and we can avoid a redundant strip in the return address fixup code. I've tested this with the perf case above, the ftrace selftests, and a number of ad-hoc unwinder tests. The tests all pass, and I have seen no unexpected behaviour as a result of this change. I've tested with pointer authentication under QEMU TCG where magic-sysrq+l correctly recovers the original return addresses. Note that this doesn't fix the issue of skipping a live LR at an exception boundary, which is a more general problem and requires more substantial rework. Were we to consume the LR in all cases this would result in warnings where the interrupted context's LR contains `return_to_handler`, but the FP has been altered, e.g. | func: | <--- ftrace entry ---> // logs FP & LR, rewrites LR | STP FP, LR, [SP, #-16]! | MOV FP, SP | <--- INTERRUPT ---> ... as ftrace_graph_get_ret_stack() fill not find a matching entry, triggering the WARN_ON_ONCE() in unwind_frame(). Link: https://lore.kernel.org/r/20211025164925.GB2001@C02TD0UTHF1T.local Link: https://lore.kernel.org/r/20211027132529.30027-1-mark.rutland@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Cc: Mark Brown <broonie@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211029162245.39761-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-08ftrace: Cleanup ftrace_dyn_arch_init()Weizhao Ouyang1-5/+0
Most of ARCHs use empty ftrace_dyn_arch_init(), introduce a weak common ftrace_dyn_arch_init() to cleanup them. Link: https://lkml.kernel.org/r/20210909090216.1955240-1-o451686892@gmail.com Acked-by: Heiko Carstens <hca@linux.ibm.com> (s390) Acked-by: Helge Deller <deller@gmx.de> (parisc) Signed-off-by: Weizhao Ouyang <o451686892@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-11arm64: insn: decouple patching from insn codeMark Rutland1-0/+1
Currently, <asm/insn.h> includes <asm/patching.h>. We intend that <asm/insn.h> will be usable from userspace, so it doesn't make sense to include headers for kernel-only features such as the patching routines, and we'd intended to restrict <asm/insn.h> to instruction encoding details. Let's decouple the patching code from <asm/insn.h>, and explicitly include <asm/patching.h> where it is needed. Since <asm/patching.h> isn't included from assembly, we can drop the __ASSEMBLY__ guards. At the same time, sort the kprobes includes so that it's easier to see what is and isn't incldued. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210609102301.17332-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-04-08arm64: ftrace: use function_nocfi for ftrace_callSami Tolvanen1-1/+1
With CONFIG_CFI_CLANG, the compiler replaces function pointers with jump table addresses, which breaks dynamic ftrace as the address of ftrace_call is replaced with the address of ftrace_call.cfi_jt. Use function_nocfi() to get the address of the actual function instead. Suggested-by: Ben Dai <ben.dai@unisoc.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-17-samitolvanen@google.com
2020-06-08arm64: ftrace: Change CONFIG_FTRACE_WITH_REGS to CONFIG_DYNAMIC_FTRACE_WITH_REGSJoe Perches1-1/+2
CONFIG_FTRACE_WITH_REGS does not exist as a Kconfig symbol. Fixes: 3b23e4991fb6 ("arm64: implement ftrace with regs") Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/b9b27f2233bd1fa31d72ff937beefdae0e2104e5.camel@perches.com Signed-off-by: Will Deacon <will@kernel.org>
2019-11-06arm64: ftrace: minimize ifdefferyMark Rutland1-10/+8
Now that we no longer refer to mod->arch.ftrace_trampolines in the body of ftrace_make_call(), we can use IS_ENABLED() rather than ifdeffery, and make the code easier to follow. Likewise in ftrace_make_nop(). Let's do so. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2019-11-06arm64: implement ftrace with regsTorsten Duwe1-14/+70
This patch implements FTRACE_WITH_REGS for arm64, which allows a traced function's arguments (and some other registers) to be captured into a struct pt_regs, allowing these to be inspected and/or modified. This is a building block for live-patching, where a function's arguments may be forwarded to another function. This is also necessary to enable ftrace and in-kernel pointer authentication at the same time, as it allows the LR value to be captured and adjusted prior to signing. Using GCC's -fpatchable-function-entry=N option, we can have the compiler insert a configurable number of NOPs between the function entry point and the usual prologue. This also ensures functions are AAPCS compliant (e.g. disabling inter-procedural register allocation). For example, with -fpatchable-function-entry=2, GCC 8.1.0 compiles the following: | unsigned long bar(void); | | unsigned long foo(void) | { | return bar() + 1; | } ... to: | <foo>: | nop | nop | stp x29, x30, [sp, #-16]! | mov x29, sp | bl 0 <bar> | add x0, x0, #0x1 | ldp x29, x30, [sp], #16 | ret This patch builds the kernel with -fpatchable-function-entry=2, prefixing each function with two NOPs. To trace a function, we replace these NOPs with a sequence that saves the LR into a GPR, then calls an ftrace entry assembly function which saves this and other relevant registers: | mov x9, x30 | bl <ftrace-entry> Since patchable functions are AAPCS compliant (and the kernel does not use x18 as a platform register), x9-x18 can be safely clobbered in the patched sequence and the ftrace entry code. There are now two ftrace entry functions, ftrace_regs_entry (which saves all GPRs), and ftrace_entry (which saves the bare minimum). A PLT is allocated for each within modules. Signed-off-by: Torsten Duwe <duwe@suse.de> [Mark: rework asm, comments, PLTs, initialization, commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Torsten Duwe <duwe@suse.de> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Julien Thierry <jthierry@redhat.com> Cc: Will Deacon <will@kernel.org>
2019-11-06arm64: module/ftrace: intialize PLT at load timeMark Rutland1-41/+14
Currently we lazily-initialize a module's ftrace PLT at runtime when we install the first ftrace call. To do so we have to apply a number of sanity checks, transiently mark the module text as RW, and perform an IPI as part of handling Neoverse-N1 erratum #1542419. We only expect the ftrace trampoline to point at ftrace_caller() (AKA FTRACE_ADDR), so let's simplify all of this by intializing the PLT at module load time, before the module loader marks the module RO and performs the intial I-cache maintenance for the module. Thus we can rely on the module having been correctly intialized, and can simplify the runtime work necessary to install an ftrace call in a module. This will also allow for the removal of module_disable_ro(). Tested by forcing ftrace_make_call() to use the module PLT, and then loading up a module after setting up ftrace with: | echo ":mod:<module-name>" > set_ftrace_filter; | echo function > current_tracer; | modprobe <module-name> Since FTRACE_ADDR is only defined when CONFIG_DYNAMIC_FTRACE is selected, we wrap its use along with most of module_init_ftrace_plt() with ifdeffery rather than using IS_ENABLED(). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org>
2019-10-04arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1 #1542419James Morse1-3/+9
CPUs affected by Neoverse-N1 #1542419 may execute a stale instruction if it was recently modified. The affected sequence requires freshly written instructions to be executable before a branch to them is updated. There are very few places in the kernel that modify executable text, all but one come with sufficient synchronisation: * The module loader's flush_module_icache() calls flush_icache_range(), which does a kick_all_cpus_sync() * bpf_int_jit_compile() calls flush_icache_range(). * Kprobes calls aarch64_insn_patch_text(), which does its work in stop_machine(). * static keys and ftrace both patch between nops and branches to existing kernel code (not generated code). The affected sequence is the interaction between ftrace and modules. The module PLT is cleaned using __flush_icache_range() as the trampoline shouldn't be executable until we update the branch to it. Drop the double-underscore so that this path runs kick_all_cpus_sync() too. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-16arm64: ftrace: Ensure module ftrace trampoline is coherent with I-sideWill Deacon1-9/+13
The initial support for dynamic ftrace trampolines in modules made use of an indirect branch which loaded its target from the beginning of a special section (e71a4e1bebaf7 ("arm64: ftrace: add support for far branches to dynamic ftrace")). Since no instructions were being patched, no cache maintenance was needed. However, later in be0f272bfc83 ("arm64: ftrace: emit ftrace-mod.o contents through code") this code was reworked to output the trampoline instructions directly into the PLT entry but, unfortunately, the necessary cache maintenance was overlooked. Add a call to __flush_icache_range() after writing the new trampoline instructions but before patching in the branch to the trampoline. Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: James Morse <james.morse@arm.com> Cc: <stable@vger.kernel.org> Fixes: be0f272bfc83 ("arm64: ftrace: emit ftrace-mod.o contents through code") Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner1-4/+1
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-23arm64/module: ftrace: deal with place relative nature of PLTsArd Biesheuvel1-2/+7
Another bodge for the ftrace PLT code: plt_entries_equal() now takes the place relative nature of the ADRP/ADD based PLT entries into account, which means that a struct trampoline instance on the stack is no longer equal to the same set of opcodes in the module struct, given that they don't point to the same place in memory anymore. Work around this by using memcmp() in the ftrace PLT handling code. Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-04-08arm64/ftrace: fix inadvertent BUG() in trampoline checkArd Biesheuvel1-2/+1
The ftrace trampoline code (which deals with modules loaded out of BL range of the core kernel) uses plt_entries_equal() to check whether the per-module trampoline equals a zero buffer, to decide whether the trampoline has already been initialized. This triggers a BUG() in the opcode manipulation code, since we end up checking the ADRP offset of a 0x0 opcode, which is not an ADRP instruction. So instead, add a helper to check whether a PLT is initialized, and call that from the frace code. Cc: <stable@vger.kernel.org> # v5.0 Fixes: bdb85cd1d206 ("arm64/module: switch to ADRP/ADD sequences for PLT entries") Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-31Merge tag 'trace-v4.21' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Rework of the kprobe/uprobe and synthetic events to consolidate all the dynamic event code. This will make changes in the future easier. - Partial rewrite of the function graph tracing infrastructure. This will allow for multiple users of hooking onto functions to get the callback (return) of the function. This is the ground work for having kprobes and function graph tracer using one code base. - Clean up of the histogram code that will facilitate adding more features to the histograms in the future. - Addition of str_has_prefix() and a few use cases. There currently is a similar function strstart() that is used in a few places, but only returns a bool and not a length. These instances will be removed in the future to use str_has_prefix() instead. - A few other various clean ups as well. * tag 'trace-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (57 commits) tracing: Use the return of str_has_prefix() to remove open coded numbers tracing: Have the historgram use the result of str_has_prefix() for len of prefix tracing: Use str_has_prefix() instead of using fixed sizes tracing: Use str_has_prefix() helper for histogram code string.h: Add str_has_prefix() helper function tracing: Make function ‘ftrace_exports’ static tracing: Simplify printf'ing in seq_print_sym tracing: Avoid -Wformat-nonliteral warning tracing: Merge seq_print_sym_short() and seq_print_sym_offset() tracing: Add hist trigger comments for variable-related fields tracing: Remove hist trigger synth_var_refs tracing: Use hist trigger's var_ref array to destroy var_refs tracing: Remove open-coding of hist trigger var_ref management tracing: Use var_refs[] for hist trigger reference checking tracing: Change strlen to sizeof for hist trigger static strings tracing: Remove unnecessary hist trigger struct field tracing: Fix ftrace_graph_get_ret_stack() to use task and not current seq_buf: Use size_t for len in seq_buf_puts() seq_buf: Make seq_buf_puts() null-terminate the buffer arm64: Use ftrace_graph_get_ret_stack() instead of curr_ret_stack ...
2018-12-25Merge tag 'arm64-upstream' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 festive updates from Will Deacon: "In the end, we ended up with quite a lot more than I expected: - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and kernel-side support to come later) - Support for per-thread stack canaries, pending an update to GCC that is currently undergoing review - Support for kexec_file_load(), which permits secure boot of a kexec payload but also happens to improve the performance of kexec dramatically because we can avoid the sucky purgatory code from userspace. Kdump will come later (requires updates to libfdt). - Optimisation of our dynamic CPU feature framework, so that all detected features are enabled via a single stop_machine() invocation - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that they can benefit from global TLB entries when KASLR is not in use - 52-bit virtual addressing for userspace (kernel remains 48-bit) - Patch in LSE atomics for per-cpu atomic operations - Custom preempt.h implementation to avoid unconditional calls to preempt_schedule() from preempt_enable() - Support for the new 'SB' Speculation Barrier instruction - Vectorised implementation of XOR checksumming and CRC32 optimisations - Workaround for Cortex-A76 erratum #1165522 - Improved compatibility with Clang/LLD - Support for TX2 system PMUS for profiling the L3 cache and DMC - Reflect read-only permissions in the linear map by default - Ensure MMIO reads are ordered with subsequent calls to Xdelay() - Initial support for memory hotplug - Tweak the threshold when we invalidate the TLB by-ASID, so that mremap() performance is improved for ranges spanning multiple PMDs. - Minor refactoring and cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (125 commits) arm64: kaslr: print PHYS_OFFSET in dump_kernel_offset() arm64: sysreg: Use _BITUL() when defining register bits arm64: cpufeature: Rework ptr auth hwcaps using multi_entry_cap_matches arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4 arm64: docs: document pointer authentication arm64: ptr auth: Move per-thread keys from thread_info to thread_struct arm64: enable pointer authentication arm64: add prctl control for resetting ptrauth keys arm64: perf: strip PAC when unwinding userspace arm64: expose user PAC bit positions via ptrace arm64: add basic pointer authentication support arm64/cpufeature: detect pointer authentication arm64: Don't trap host pointer auth use to EL2 arm64/kvm: hide ptrauth from guests arm64/kvm: consistently handle host HCR_EL2 flags arm64: add pointer authentication register bits arm64: add comments about EC exception levels arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field arm64: enable per-task stack canaries ...
2018-12-10arm64: ftrace: Set FTRACE_MAY_SLEEP before ftrace_modify_all_code()Steven Rostedt (VMware)1-0/+1
It has been reported that ftrace_replace_code() which is called by ftrace_modify_all_code() can cause a soft lockup warning for an allmodconfig kernel. This is because all the debug options enabled causes the loop in ftrace_replace_code() (which loops over all the functions being enabled where there can be 10s of thousands), is too slow, and never schedules out. To solve this, setting FTRACE_MAY_SLEEP to the command passed into ftrace_replace_code() will make it call cond_resched() in the loop, which prevents the soft lockup warning from triggering. Link: http://lkml.kernel.org/r/20181204192903.8193-1-anders.roxell@linaro.org Link: http://lkml.kernel.org/r/20181205183304.000714627@goodmis.org Acked-by: Will Deacon <will.deacon@arm.com> Reported-by: Anders Roxell <anders.roxell@linaro.org> Tested-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-11-30arm64: ftrace: always pass instrumented pc in x0Mark Rutland1-1/+1
The core ftrace hooks take the instrumented PC in x0, but for some reason arm64's prepare_ftrace_return() takes this in x1. For consistency, let's flip the argument order and always pass the instrumented PC in x0. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-27arm64: function_graph: Simplify with function_graph_enter()Steven Rostedt (VMware)1-14/+1
The function_graph_enter() function does the work of calling the function graph hook function and the management of the shadow stack, simplifying the work done in the architecture dependent prepare_ftrace_return(). Have arm64 use the new code, and remove the shadow stack management as well as having to set up the trace structure. This is needed to prepare for a fix of a design bug on how the curr_ret_stack is used. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: stable@kernel.org Fixes: 03274a3ffb449 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback") Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-11-27arm64/module: switch to ADRP/ADD sequences for PLT entriesArd Biesheuvel1-1/+1
Now that we have switched to the small code model entirely, and reduced the extended KASLR range to 4 GB, we can be sure that the targets of relative branches that are out of range are in range for a ADRP/ADD pair, which is one instruction shorter than our current MOVN/MOVK/MOVK sequence, and is more idiomatic and so it is more likely to be implemented efficiently by micro-architectures. So switch over the ordinary PLT code and the special handling of the Cortex-A53 ADRP errata, as well as the ftrace trampline handling. Reviewed-by: Torsten Duwe <duwe@lst.de> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: Added a couple of comments in the plt equality check] Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-01arm64: ftrace: emit ftrace-mod.o contents through codeArd Biesheuvel1-6/+8
When building the arm64 kernel with both CONFIG_ARM64_MODULE_PLTS and CONFIG_DYNAMIC_FTRACE enabled, the ftrace-mod.o object file is built with the kernel and contains a trampoline that is linked into each module, so that modules can be loaded far away from the kernel and still reach the ftrace entry point in the core kernel with an ordinary relative branch, as is emitted by the compiler instrumentation code dynamic ftrace relies on. In order to be able to build out of tree modules, this object file needs to be included into the linux-headers or linux-devel packages, which is undesirable, as it makes arm64 a special case (although a precedent does exist for 32-bit PPC). Given that the trampoline essentially consists of a PLT entry, let's not bother with a source or object file for it, and simply patch it in whenever the trampoline is being populated, using the existing PLT support routines. Cc: <stable@vger.kernel.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-23arm64: ftrace: fix !CONFIG_ARM64_MODULE_PLTS kernelsMark Rutland1-6/+8
When a kernel is built without CONFIG_ARM64_MODULE_PLTS, we don't generate the expected branch instruction in ftrace_make_nop(). This means we pass zero (rather than a valid branch) to ftrace_modify_code() as the expected instruction to validate. This causes us to return -EINVAL to the core ftrace code for a valid case, resulting in a splat at boot time. This was an unintended effect of commit: 687644209a6e9557 ("arm64: ftrace: fix building without CONFIG_MODULES") ... which incorrectly moved the generation of the branch instruction into the ifdef for CONFIG_ARM64_MODULE_PLTS. This patch fixes the issue by moving the ifdef inside of the relevant if-else case, and always checking that the branch is in range, regardless of CONFIG_ARM64_MODULE_PLTS. This ensures that we generate the expected branch instruction, and also improves our sanity checks. For consistency, both ftrace_make_nop() and ftrace_make_call() are updated with this pattern. Fixes: 687644209a6e9557 ("arm64: ftrace: fix building without CONFIG_MODULES") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-12arm64: ftrace: fix building without CONFIG_MODULESWill Deacon1-6/+10
When CONFIG_MODULES is disabled, we cannot dereference a module pointer: arch/arm64/kernel/ftrace.c: In function 'ftrace_make_call': arch/arm64/kernel/ftrace.c:107:36: error: dereferencing pointer to incomplete type 'struct module' trampoline = (unsigned long *)mod->arch.ftrace_trampoline; Also, the within_module() function is not defined: arch/arm64/kernel/ftrace.c: In function 'ftrace_make_nop': arch/arm64/kernel/ftrace.c:171:8: error: implicit declaration of function 'within_module'; did you mean 'init_module'? [-Werror=implicit-function-declaration] This addresses both by adding replacing the IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) checks with #ifdef versions. Fixes: e71a4e1bebaf ("arm64: ftrace: add support for far branches to dynamic ftrace") Reported-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-07arm64: ftrace: add support for far branches to dynamic ftraceArd Biesheuvel1-0/+51
Currently, dynamic ftrace support in the arm64 kernel assumes that all core kernel code is within range of ordinary branch instructions that occur in module code, which is usually the case, but is no longer guaranteed now that we have support for module PLTs and address space randomization. Since on arm64, all patching of branch instructions involves function calls to the same entry point [ftrace_caller()], we can emit the modules with a trampoline that has unlimited range, and patch both the trampoline itself and the branch instruction to redirect the call via the trampoline. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: minor clarification to smp_wmb() comment] Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-07arm64: ftrace: don't validate branch via PLT in ftrace_make_nop()Ard Biesheuvel1-3/+43
When turning branch instructions into NOPs, we attempt to validate the action by comparing the old value at the call site with the opcode of a direct relative branch instruction pointing at the old target. However, these call sites are statically initialized to call _mcount(), and may be redirected via a PLT entry if the module is loaded far away from the kernel text, leading to false negatives and spurious errors. So skip the validation if CONFIG_ARM64_MODULE_PLTS is configured. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-24ftrace: Add return address pointer to ftrace_ret_stackJosh Poimboeuf1-1/+1
Storing this value will help prevent unwinders from getting out of sync with the function graph tracer ret_stack. Now instead of needing a stateful iterator, they can compare the return address pointer to find the right ret_stack entry. Note that an array of 50 ftrace_ret_stack structs is allocated for every task. So when an arch implements this, it will add either 200 or 400 bytes of memory usage per task (depending on whether it's a 32-bit or 64-bit platform). Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a95cfcc39e8f26b89a430c56926af0bb217bc0a1.1471607358.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-21arm64: ftrace: modify a stack frame in a safe wayAKASHI Takahiro1-7/+4
Function graph tracer modifies a return address (LR) in a stack frame by calling ftrace_prepare_return() in a traced function's function prologue. The current code does this modification before preserving an original address at ftrace_push_return_trace() and there is always a small window of inconsistency when an interrupt occurs. This doesn't matter, as far as an interrupt stack is introduced, because stack tracer won't be invoked in an interrupt context. But it would be better to proactively minimize such a window by moving the LR modification after ftrace_push_return_trace(). Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-04arm64: ftrace: fix the comments for ftrace_modify_codeLi Bin1-6/+5
There is no need to worry about module and __init text disappearing case, because that ftrace has a module notifier that is called when a module is being unloaded and before the text goes away and this code grabs the ftrace_lock mutex and removes the module functions from the ftrace list, such that it will no longer do any modifications to that module's text, the update to make functions be traced or not is done under the ftrace_lock mutex as well. And by now, __init section codes should not been modified by ftrace, because it is black listed in recordmcount.c and ignored by ftrace. Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-04arm64: ftrace: stop using kstop_machine to enable/disable tracingLi Bin1-0/+5
For ftrace on arm64, kstop_machine which is hugely disruptive to a running system is not needed to convert nops to ftrace calls or back, because that to be modified instrucions, that NOP, B or BL, are all safe instructions which called "concurrent modification and execution of instructions", that can be executed by one thread of execution as they are being modified by another thread of execution without requiring explicit synchronization. Signed-off-by: Li Bin <huawei.libin@huawei.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-02-23arm64: ftrace: fix ftrace_modify_graph_caller for branch replacePratyush Anand1-1/+1
ftrace_enable_ftrace_graph_caller and ftrace_disable_ftrace_graph_caller should replace B(jmp) instruction and not BL(call) instruction. Commit 9f1ae7596aad("arm64: Correct ftrace calls to aarch64_insn_gen_branch_imm()") had a typo and used AARCH64_INSN_BRANCH_LINK instead of AARCH64_INSN_BRANCH_NOLINK. Either instruction will work, as the link register is saved/restored across the branch but this better matches the intention of the code. Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-09-19arm64: Correct ftrace calls to aarch64_insn_gen_branch_imm()Catalin Marinas1-4/+6
The aarch64_insn_gen_branch_imm() function takes an enum as the last argument rather than a bool. It happens to work because AARCH64_INSN_BRANCH_LINK matches 'true' but better to use the actual type. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-29arm64: ftrace: Add dynamic ftrace supportAKASHI Takahiro1-0/+112
This patch allows "dynamic ftrace" if CONFIG_DYNAMIC_FTRACE is enabled. Here we can turn on and off tracing dynamically per-function base. On arm64, this is done by patching single branch instruction to _mcount() inserted by gcc -pg option. The branch is replaced to NOP initially at kernel start up, and later on, NOP to branch to ftrace_caller() when enabled or branch to NOP when disabled. Please note that ftrace_caller() is a counterpart of _mcount() in case of 'static' ftrace. More details on architecture specific requirements are described in Documentation/trace/ftrace-design.txt. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-05-29arm64: Add ftrace supportAKASHI Takahiro1-0/+64
This patch implements arm64 specific part to support function tracers, such as function (CONFIG_FUNCTION_TRACER), function_graph (CONFIG_FUNCTION_GRAPH_TRACER) and function profiler (CONFIG_FUNCTION_PROFILER). With 'function' tracer, all the functions in the kernel are traced with timestamps in ${sysfs}/tracing/trace. If function_graph tracer is specified, call graph is generated. The kernel must be compiled with -pg option so that _mcount() is inserted at the beginning of functions. This function is called on every function's entry as long as tracing is enabled. In addition, function_graph tracer also needs to be able to probe function's exit. ftrace_graph_caller() & return_to_handler do this by faking link register's value to intercept function's return path. More details on architecture specific requirements are described in Documentation/trace/ftrace-design.txt. Reviewed-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>