linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2020-09-17	arm64: bpf: Fix branch offset in JIT	Ilias Apalodimas	1	-12/+31
	Running the eBPF test_verifier leads to random errors looking like this: [ 6525.735488] Unexpected kernel BRK exception at EL1 [ 6525.735502] Internal error: ptrace BRK handler: f2000100 [#1] SMP [ 6525.741609] Modules linked in: nls_utf8 cifs libdes libarc4 dns_resolver fscache binfmt_misc nls_ascii nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher ghash_ce gf128mul efi_pstore sha2_ce sha256_arm64 sha1_ce evdev efivars efivarfs ip_tables x_tables autofs4 btrfs blake2b_generic xor xor_neon zstd_compress raid6_pq libcrc32c crc32c_generic ahci xhci_pci libahci xhci_hcd igb libata i2c_algo_bit nvme realtek usbcore nvme_core scsi_mod t10_pi netsec mdio_devres of_mdio gpio_keys fixed_phy libphy gpio_mb86s7x [ 6525.787760] CPU: 3 PID: 7881 Comm: test_verifier Tainted: G W 5.9.0-rc1+ #47 [ 6525.796111] Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS build #1 Jun 6 2020 [ 6525.804812] pstate: 20000005 (nzCv daif -PAN -UAO BTYPE=--) [ 6525.810390] pc : bpf_prog_c3d01833289b6311_F+0xc8/0x9f4 [ 6525.815613] lr : bpf_prog_d53bb52e3f4483f9_F+0x38/0xc8c [ 6525.820832] sp : ffff8000130cbb80 [ 6525.824141] x29: ffff8000130cbbb0 x28: 0000000000000000 [ 6525.829451] x27: 000005ef6fcbf39b x26: 0000000000000000 [ 6525.834759] x25: ffff8000130cbb80 x24: ffff800011dc7038 [ 6525.840067] x23: ffff8000130cbd00 x22: ffff0008f624d080 [ 6525.845375] x21: 0000000000000001 x20: ffff800011dc7000 [ 6525.850682] x19: 0000000000000000 x18: 0000000000000000 [ 6525.855990] x17: 0000000000000000 x16: 0000000000000000 [ 6525.861298] x15: 0000000000000000 x14: 0000000000000000 [ 6525.866606] x13: 0000000000000000 x12: 0000000000000000 [ 6525.871913] x11: 0000000000000001 x10: ffff8000000a660c [ 6525.877220] x9 : ffff800010951810 x8 : ffff8000130cbc38 [ 6525.882528] x7 : 0000000000000000 x6 : 0000009864cfa881 [ 6525.887836] x5 : 00ffffffffffffff x4 : 002880ba1a0b3e9f [ 6525.893144] x3 : 0000000000000018 x2 : ffff8000000a4374 [ 6525.898452] x1 : 000000000000000a x0 : 0000000000000009 [ 6525.903760] Call trace: [ 6525.906202] bpf_prog_c3d01833289b6311_F+0xc8/0x9f4 [ 6525.911076] bpf_prog_d53bb52e3f4483f9_F+0x38/0xc8c [ 6525.915957] bpf_dispatcher_xdp_func+0x14/0x20 [ 6525.920398] bpf_test_run+0x70/0x1b0 [ 6525.923969] bpf_prog_test_run_xdp+0xec/0x190 [ 6525.928326] __do_sys_bpf+0xc88/0x1b28 [ 6525.932072] __arm64_sys_bpf+0x24/0x30 [ 6525.935820] el0_svc_common.constprop.0+0x70/0x168 [ 6525.940607] do_el0_svc+0x28/0x88 [ 6525.943920] el0_sync_handler+0x88/0x190 [ 6525.947838] el0_sync+0x140/0x180 [ 6525.951154] Code: d4202000 d4202000 d4202000 d4202000 (d4202000) [ 6525.957249] ---[ end trace cecc3f93b14927e2 ]--- The reason is the offset[] creation and later usage, while building the eBPF body. The code currently omits the first instruction, since build_insn() will increase our ctx->idx before saving it. That was fine up until bounded eBPF loops were introduced. After that introduction, offset[0] must be the offset of the end of prologue which is the start of the 1st insn while, offset[n] holds the offset of the end of n-th insn. When "taken loop with back jump to 1st insn" test runs, it will eventually call bpf2a64_offset(-1, 2, ctx). Since negative indexing is permitted, the current outcome depends on the value stored in ctx->offset[-1], which has nothing to do with our array. If the value happens to be 0 the tests will work. If not this error triggers. commit 7c2e988f400e ("bpf: fix x64 JIT code generation for jmp to 1st insn") fixed an indentical bug on x86 when eBPF bounded loops were introduced. So let's fix it by creating the ctx->offset[] differently. Track the beginning of instruction and account for the extra instruction while calculating the arm instruction offsets. Fixes: 2589726d12a1 ("bpf: introduce bounded loops") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Jiri Olsa <jolsa@kernel.org> Co-developed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Co-developed-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200917084925.177348-1-ilias.apalodimas@linaro.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-31	bpf, arm64: Add BPF exception tables	Jean-Philippe Brucker	1	-6/+87
	When a tracing BPF program attempts to read memory without using the bpf_probe_read() helper, the verifier marks the load instruction with the BPF_PROBE_MEM flag. Since the arm64 JIT does not currently recognize this flag it falls back to the interpreter. Add support for BPF_PROBE_MEM, by appending an exception table to the BPF program. If the load instruction causes a data abort, the fixup infrastructure finds the exception table and fixes up the fault, by clearing the destination register and jumping over the faulting instruction. To keep the compact exception table entry format, inspect the pc in fixup_exception(). A more generic solution would add a "handler" field to the table entry, like on x86 and s390. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200728152122.1292756-2-jean-philippe@linaro.org
2020-05-28	Merge branch 'for-next/bti' into for-next/core	Will Deacon	1	-0/+12
	Support for Branch Target Identification (BTI) in user and kernel (Mark Brown and others) * for-next/bti: (39 commits) arm64: vdso: Fix CFI directives in sigreturn trampoline arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction arm64: bti: Fix support for userspace only BTI arm64: kconfig: Update and comment GCC version check for kernel BTI arm64: vdso: Map the vDSO text with guarded pages when built for BTI arm64: vdso: Force the vDSO to be linked as BTI when built for BTI arm64: vdso: Annotate for BTI arm64: asm: Provide a mechanism for generating ELF note for BTI arm64: bti: Provide Kconfig for kernel mode BTI arm64: mm: Mark executable text as guarded pages arm64: bpf: Annotate JITed code for BTI arm64: Set GP bit in kernel page tables to enable BTI for the kernel arm64: asm: Override SYM_FUNC_START when building the kernel with BTI arm64: bti: Support building kernel C code using BTI arm64: Document why we enable PAC support for leaf functions arm64: insn: Report PAC and BTI instructions as skippable arm64: insn: Don't assume unrecognized HINTs are skippable arm64: insn: Provide a better name for aarch64_insn_is_nop() arm64: insn: Add constants for new HINT instruction decode arm64: Disable old style assembly annotations ...
2020-05-11	bpf, arm64: Optimize ADD,SUB,JMP BPF_K using arm64 add/sub immediates	Luke Nelson	1	-6/+30
	The current code for BPF_{ADD,SUB} BPF_K loads the BPF immediate to a temporary register before performing the addition/subtraction. Similarly, BPF_JMP BPF_K cases load the immediate to a temporary register before comparison. This patch introduces optimizations that use arm64 immediate add, sub, cmn, or cmp instructions when the BPF immediate fits. If the immediate does not fit, it falls back to using a temporary register. Example of generated code for BPF_ALU64_IMM(BPF_ADD, R0, 2): without optimization: 24: mov x10, #0x2 28: add x7, x7, x10 with optimization: 24: add x7, x7, #0x2 The code could use A64_{ADD,SUB}_I directly and check if it returns AARCH64_BREAK_FAULT, similar to how logical immediates are handled. However, aarch64_insn_gen_add_sub_imm from insn.c prints error messages when the immediate does not fit, and it's simpler to check if the immediate fits ahead of time. Co-developed-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20200508181547.24783-4-luke.r.nels@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2020-05-11	bpf, arm64: Optimize AND,OR,XOR,JSET BPF_K using arm64 logical immediates	Luke Nelson	1	-8/+29
	The current code for BPF_{AND,OR,XOR,JSET} BPF_K loads the immediate to a temporary register before use. This patch changes the code to avoid using a temporary register when the BPF immediate is encodable using an arm64 logical immediate instruction. If the encoding fails (due to the immediate not being encodable), it falls back to using a temporary register. Example of generated code for BPF_ALU32_IMM(BPF_AND, R0, 0x80000001): without optimization: 24: mov w10, #0x8000ffff 28: movk w10, #0x1 2c: and w7, w7, w10 with optimization: 24: and w7, w7, #0x80000001 Since the encoding process is quite complex, the JIT reuses existing functionality in arch/arm64/kernel/insn.c for encoding logical immediates rather than duplicate it in the JIT. Co-developed-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20200508181547.24783-3-luke.r.nels@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2020-05-07	arm64: bpf: Annotate JITed code for BTI	Mark Brown	1	-0/+12
	In order to extend the protection offered by BTI to all code executing in kernel mode we need to annotate JITed BPF code appropriately for BTI. To do this we need to add a landing pad to the start of each BPF function and also immediately after the function prologue if we are emitting a function which can be tail called. Jumps within BPF functions are all to immediate offsets and therefore do not require landing pads. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200506195138.22086-6-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2019-09-03	arm64: bpf: optimize modulo operation	Jerin Jacob	1	-4/+2
	Optimize modulo operation instruction generation by using single MSUB instruction vs MUL followed by SUB instruction scheme. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	Merge tag 'arm64-upstream' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP} - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to manage the permissions of executable vmalloc regions more strictly - Slight performance improvement by keeping softirqs enabled while touching the FPSIMD/SVE state (kernel_neon_begin/end) - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG and AXFLAG instructions for floating point comparison flags manipulation) and FRINT (rounding floating point numbers to integers) - Re-instate ARM64_PSEUDO_NMI support which was previously marked as BROKEN due to some bugs (now fixed) - Improve parking of stopped CPUs and implement an arm64-specific panic_smp_self_stop() to avoid warning on not being able to stop secondary CPUs during panic - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI platforms - perf: DDR performance monitor support for iMX8QXP - cache_line_size() can now be set from DT or ACPI/PPTT if provided to cope with a system cache info not exposed via the CPUID registers - Avoid warning on hardware cache line size greater than ARCH_DMA_MINALIGN if the system is fully coherent - arm64 do_page_fault() and hugetlb cleanups - Refactor set_pte_at() to avoid redundant READ_ONCE(ptep) - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags' introduced in 5.1) - CONFIG_RANDOMIZE_BASE now enabled in defconfig - Allow the selection of ARM64_MODULE_PLTS, currently only done via RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill over into the vmalloc area - Make ZONE_DMA32 configurable tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits) perf: arm_spe: Enable ACPI/Platform automatic module loading arm_pmu: acpi: spe: Add initial MADT/SPE probing ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens ACPI/PPTT: Modify node flag detection to find last IDENTICAL x86/entry: Simplify _TIF_SYSCALL_EMU handling arm64: rename dump_instr as dump_kernel_instr arm64/mm: Drop [PTE\|PMD]_TYPE_FAULT arm64: Implement panic_smp_self_stop() arm64: Improve parking of stopped CPUs arm64: Expose FRINT capabilities to userspace arm64: Expose ARMv8.5 CondM capability to userspace arm64: defconfig: enable CONFIG_RANDOMIZE_BASE arm64: ARM64_MODULES_PLTS must depend on MODULES arm64: bpf: do not allocate executable memory arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP arm64: module: create module allocations without exec permissions arm64: Allow user selection of ARM64_MODULE_PLTS acpi/arm64: ignore 5.1 FADTs that are reported as 5.0 arm64: Allow selecting Pseudo-NMI again ...
2019-06-24	arm64: bpf: do not allocate executable memory	Ard Biesheuvel	1	-1/+1
	The BPF code now takes care of mapping the code pages executable after mapping them read-only, to ensure that no RWX mapped regions are needed, even transiently. This means we can drop the executable permissions from the mapping at allocation time. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-19	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234	Thomas Gleixner	1	-12/+1
	Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 503 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-26	bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd	Daniel Borkmann	1	-9/+19
	Since ARMv8.1 supplement introduced LSE atomic instructions back in 2016, lets add support for STADD and use that in favor of LDXR / STXR loop for the XADD mapping if available. STADD is encoded as an alias for LDADD with XZR as the destination register, therefore add LDADD to the instruction encoder along with STADD as special case and use it in the JIT for CPUs that advertise LSE atomics in CPUID register. If immediate offset in the BPF XADD insn is 0, then use dst register directly instead of temporary one. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-26	bpf, arm64: remove prefetch insn in xadd mapping	Daniel Borkmann	1	-1/+0
	Prefetch-with-intent-to-write is currently part of the XADD mapping in the AArch64 JIT and follows the kernel's implementation of atomic_add. This may interfere with other threads executing the LDXR/STXR loop, leading to potential starvation and fairness issues. Drop the optional prefetch instruction. Fixes: 85f68fe89832 ("bpf, arm64: implement jiting of BPF_XADD") Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-26	arm64: bpf: implement jitting of JMP32	Jiong Wang	1	-7/+30
	This patch implements code-gen for new JMP32 instructions on arm64. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-12	bpf: arm64: Enable arm64 jit to provide bpf_line_info	Martin KaFai Lau	1	-0/+1
	This patch enables arm64's bpf_int_jit_compile() to provide bpf_line_info by calling bpf_prog_fill_jited_linfo(). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-05	arm64/bpf: don't allocate BPF JIT programs in module memory	Ard Biesheuvel	1	-0/+13
	The arm64 module region is a 128 MB region that is kept close to the core kernel, in order to ensure that relative branches are always in range. So using the same region for programs that do not have this restriction is wasteful, and preferably avoided. Now that the core BPF JIT code permits the alloc/free routines to be overridden, implement them by vmalloc()/vfree() calls from a dedicated 128 MB region set aside for BPF programs. This ensures that BPF programs are still in branching range of each other, which is something the JIT currently depends upon (and is not guaranteed when using module_alloc() on KASLR kernels like we do currently). It also ensures that placement of BPF programs does not correlate with the placement of the core kernel or modules, making it less likely that leaking the former will reveal the latter. This also solves an issue under KASAN, where shadow memory is needlessly allocated for all BPF programs (which don't require KASAN shadow pages since they are not KASAN instrumented) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-30	arm64/bpf: use movn/movk/movk sequence to generate kernel addresses	Ard Biesheuvel	1	-11/+6
	On arm64, all executable code is guaranteed to reside in the vmalloc space (or the module space), and so jump targets will only use 48 bits at most, and the remaining bits are guaranteed to be 0x1. This means we can generate an immediate jump address using a sequence of one MOVN (move wide negated) and two MOVK instructions, where the first one sets the lower 16 bits but also sets all top bits to 0x1. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-26	bpf, arm64: fix getting subprog addr from aux for calls	Daniel Borkmann	1	-9/+17
	The arm64 JIT has the same issue as ppc64 JIT in that the relative BPF to BPF call offset can be too far away from core kernel in that relative encoding into imm is not sufficient and could potentially be truncated, see also fd045f6cd98e ("arm64: add support for module PLTs") which adds spill-over space for module_alloc() and therefore bpf_jit_binary_alloc(). Therefore, use the recently added bpf_jit_get_func_addr() helper for properly fetching the address through prog->aux->func[off]->bpf_func instead. This also has the benefit to optimize normal helper calls since their address can use the optimized emission. Tested on Cavium ThunderX CN8890. Fixes: db496944fdaa ("bpf: arm64: add JIT support for multi-function programs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-14	bpf, arm64: save 4 bytes in prologue when ebpf insns came from cbpf	Daniel Borkmann	1	-10/+13
	We can trivially save 4 bytes in prologue for cBPF since tail calls can never be used from there. The register push/pop is pairwise, here, x25 (fp) and x26 (tcc), so no point in changing that, only reset to zero is not needed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-14	bpf, arm64: optimize 32/64 immediate emission	Daniel Borkmann	1	-31/+54
	Improve the JIT to emit 64 and 32 bit immediates, the current algorithm is not optimal and we often emit more instructions than actually needed. arm64 has movz, movn, movk variants but for the current 64 bit immediates we only use movz with a series of movk when needed. For example loading ffffffffffffabab emits the following 4 instructions in the JIT today: * movz: abab, shift: 0, result: 000000000000abab * movk: ffff, shift: 16, result: 00000000ffffabab * movk: ffff, shift: 32, result: 0000ffffffffabab * movk: ffff, shift: 48, result: ffffffffffffabab Whereas after the patch the same load only needs a single instruction: * movn: 5454, shift: 0, result: ffffffffffffabab Another example where two extra instructions can be saved: * movz: abab, shift: 0, result: 000000000000abab * movk: 1f2f, shift: 16, result: 000000001f2fabab * movk: ffff, shift: 32, result: 0000ffff1f2fabab * movk: ffff, shift: 48, result: ffffffff1f2fabab After the patch: * movn: e0d0, shift: 16, result: ffffffff1f2fffff * movk: abab, shift: 0, result: ffffffff1f2fabab Another example with movz, before: * movz: 0000, shift: 0, result: 0000000000000000 * movk: fea0, shift: 32, result: 0000fea000000000 After: * movz: fea0, shift: 32, result: 0000fea000000000 Moreover, reuse emit_a64_mov_i() for 32 bit immediates that are loaded via emit_a64_mov_i64() which is a similar optimization as done in 6fe8b9c1f41d ("bpf, x64: save several bytes by using mov over movabsq when possible"). On arm64, the latter allows to use a single instruction with movn due to zero extension where otherwise two would be needed. And last but not least add a missing optimization in emit_a64_mov_i() where movn is used but the subsequent movk not needed. With some of the Cilium programs in use, this shrinks the needed instructions by about three percent. Tested on Cavium ThunderX CN8890. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-14	bpf, arm64: save 4 bytes of unneeded stack space	Daniel Borkmann	1	-5/+2
	Follow-up to 816d9ef32a8b ("bpf, arm64: remove ld_abs/ld_ind") in that the extra 4 byte JIT scratchpad is not needed anymore since it was in ld_abs/ld_ind as stack buffer for bpf_load_pointer(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-03	bpf, arm64: remove ld_abs/ld_ind	Daniel Borkmann	1	-65/+0
	Since LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from arm64 JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-02-22	bpf, arm64: fix out of bounds access in tail call	Daniel Borkmann	1	-2/+3
	I recently noticed a crash on arm64 when feeding a bogus index into BPF tail call helper. The crash would not occur when the interpreter is used, but only in case of JIT. Output looks as follows: [ 347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510 [...] [ 347.043065] [fffb850e96492510] address between user and kernel address ranges [ 347.050205] Internal error: Oops: 96000004 [#1] SMP [...] [ 347.190829] x13: 0000000000000000 x12: 0000000000000000 [ 347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10 [ 347.201427] x9 : 0000000000000000 x8 : 0000000000000000 [ 347.206726] x7 : 0000000000000000 x6 : 001c991738000000 [ 347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a [ 347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500 [ 347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500 [ 347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61) [ 347.235221] Call trace: [ 347.237656] 0xffff000002f3a4fc [ 347.240784] bpf_test_run+0x78/0xf8 [ 347.244260] bpf_prog_test_run_skb+0x148/0x230 [ 347.248694] SyS_bpf+0x77c/0x1110 [ 347.251999] el0_svc_naked+0x30/0x34 [ 347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b) [...] In this case the index used in BPF r3 is the same as in r1 at the time of the call, meaning we fed a pointer as index; here, it had the value 0xffff808fd7cf0500 which sits in x2. While I found tail calls to be working in general (also for hitting the error cases), I noticed the following in the code emission: # bpftool p d j i 988 [...] 38: ldr w10, [x1,x10] 3c: cmp w2, w10 40: b.ge 0x000000000000007c <-- signed cmp 44: mov x10, #0x20 // #32 48: cmp x26, x10 4c: b.gt 0x000000000000007c 50: add x26, x26, #0x1 54: mov x10, #0x110 // #272 58: add x10, x1, x10 5c: lsl x11, x2, #3 60: ldr x11, [x10,x11] <-- faulting insn (f86b694b) 64: cbz x11, 0x000000000000007c [...] Meaning, the tests passed because commit ddb55992b04d ("arm64: bpf: implement bpf_tail_call() helper") was using signed compares instead of unsigned which as a result had the test wrongly passing. Change this but also the tail call count test both into unsigned and cap the index as u32. Latter we did as well in 90caccdd8cc0 ("bpf: fix bpf_tail_call() x64 JIT") and is needed in addition here, too. Tested on HiSilicon Hi1616. Result after patch: # bpftool p d j i 268 [...] 38: ldr w10, [x1,x10] 3c: add w2, w2, #0x0 40: cmp w2, w10 44: b.cs 0x0000000000000080 48: mov x10, #0x20 // #32 4c: cmp x26, x10 50: b.hi 0x0000000000000080 54: add x26, x26, #0x1 58: mov x10, #0x110 // #272 5c: add x10, x1, x10 60: lsl x11, x2, #3 64: ldr x11, [x10,x11] 68: cbz x11, 0x0000000000000080 [...] Fixes: ddb55992b04d ("arm64: bpf: implement bpf_tail_call() helper") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-26	bpf, arm64: remove obsolete exception handling from div/mod	Daniel Borkmann	1	-13/+0
	Since we've changed div/mod exception handling for src_reg in eBPF verifier itself, remove the leftovers from arm64 JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-20	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller	1	-2/+0
	Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-01-19 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) bpf array map HW offload, from Jakub. 2) support for bpf_get_next_key() for LPM map, from Yonghong. 3) test_verifier now runs loaded programs, from Alexei. 4) xdp cpumap monitoring, from Jesper. 5) variety of tests, cleanups and small x64 JIT optimization, from Daniel. 6) user space can now retrieve HW JITed program, from Jiong. Note there is a minor conflict between Russell's arm32 JIT fixes and removal of bpf_jit_enable variable by Daniel which should be resolved by keeping Russell's comment and removing that variable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	1	-9/+11
	The BPF verifier conflict was some minor contextual issue. The TUN conflict was less trivial. Cong Wang fixed a memory leak of tfile->tx_array in 'net'. This is an skb_array. But meanwhile in net-next tun changed tfile->tx_arry into tfile->tx_ring which is a ptr_ring. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19	bpf: get rid of pure_initcall dependency to enable jits	Daniel Borkmann	1	-2/+0
	Having a pure_initcall() callback just to permanently enable BPF JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave a small race window in future where JIT is still disabled on boot. Since we know about the setting at compilation time anyway, just initialize it properly there. Also consolidate all the individual bpf_jit_enable variables into a single one and move them under one location. Moreover, don't allow for setting unspecified garbage values on them. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-16	bpf, arm64: fix stack_depth tracking in combination with tail calls	Daniel Borkmann	1	-9/+11
	Using dynamic stack_depth tracking in arm64 JIT is currently broken in combination with tail calls. In prologue, we cache ctx->stack_size and adjust SP reg for setting up function call stack, and tearing it down again in epilogue. Problem is that when doing a tail call, the cached ctx->stack_size might not be the same. One way to fix the problem with minimal overhead is to re-adjust SP in emit_bpf_tail_call() and properly adjust it to the current program's ctx->stack_size. Tested on Cavium ThunderX ARMv8. Fixes: f1c9eed7f437 ("bpf, arm64: take advantage of stack_depth tracking") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-19	bpf: arm64: fix uninitialized variable	Alexei Starovoitov	1	-0/+1
	fix the following issue: arch/arm64/net/bpf_jit_comp.c: In function 'bpf_int_jit_compile': arch/arm64/net/bpf_jit_comp.c:982:18: error: 'image_size' may be used uninitialized in this function [-Werror=maybe-uninitialized] Fixes: db496944fdaa ("bpf: arm64: add JIT support for multi-function programs") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17	bpf: arm64: add JIT support for multi-function programs	Alexei Starovoitov	1	-4/+64
	similar to x64 add support for bpf-to-bpf calls. When program has calls to in-kernel helpers the target call offset is known at JIT time and arm64 architecture needs 2 passes. With bpf-to-bpf calls the dynamically allocated function start is unknown until all functions of the program are JITed. Therefore (just like x64) arm64 JIT needs one extra pass over the program to emit correct call offsets. Implementation detail: Avoid being too clever in 64-bit immediate moves and always use 4 instructions (instead of 3-4 depending on the address) to make sure only one extra pass is needed. If some future optimization would make it worth while to optimize 'call 64-bit imm' further, the JIT would need to do 4 passes over the program instead of 3 as in this patch. For typical bpf program address the mov needs 3 or 4 insns, so unconditional 4 insns to save extra pass is a worthy trade off at this state of JIT. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17	bpf: fix net.core.bpf_jit_enable race	Alexei Starovoitov	1	-1/+1
	global bpf_jit_enable variable is tested multiple times in JITs, blinding and verifier core. The malicious root can try to toggle it while loading the programs. This race condition was accounted for and there should be no issues, but it's safer to avoid this race condition. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-08-09	bpf, arm64: implement jiting of BPF_J{LT, LE, SLT, SLE}	Daniel Borkmann	1	-0/+20
	This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions with BPF_X/BPF_K variants for the arm64 eBPF JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-05	Merge tag 'arm64-upstream' of ↵	Linus Torvalds	1	-3/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: - RAS reporting via GHES/APEI (ACPI) - Indirect ftrace trampolines for modules - Improvements to kernel fault reporting - Page poisoning - Sigframe cleanups and preparation for SVE context - Core dump fixes - Sparse fixes (mainly relating to endianness) - xgene SoC PMU v3 driver - Misc cleanups and non-critical fixes * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (75 commits) arm64: fix endianness annotation for 'struct jit_ctx' and friends arm64: cpuinfo: constify attribute_group structures. arm64: ptrace: Fix incorrect get_user() use in compat_vfp_set() arm64: ptrace: Remove redundant overrun check from compat_vfp_set() arm64: ptrace: Avoid setting compat FP[SC]R to garbage if get_user fails arm64: fix endianness annotation for __apply_alternatives()/get_alt_insn() arm64: fix endianness annotation in get_kaslr_seed() arm64: add missing conversion to __wsum in ip_fast_csum() arm64: fix endianness annotation in acpi_parking_protocol.c arm64: use readq() instead of readl() to read 64bit entry_point arm64: fix endianness annotation for reloc_insn_movw() & reloc_insn_imm() arm64: fix endianness annotation for aarch64_insn_write() arm64: fix endianness annotation in aarch64_insn_read() arm64: fix endianness annotation in call_undef_hook() arm64: fix endianness annotation for debug-monitors.c ras: mark stub functions as 'inline' arm64: pass endianness info to sparse arm64: ftrace: fix !CONFIG_ARM64_MODULE_PLTS kernels arm64: signal: Allow expansion of the signal frame acpi: apei: check for pending errors when probing GHES entries ...
2017-06-30	arm64: fix endianness annotation for 'struct jit_ctx' and friends	Luc Van Oostenryck	1	-3/+3
	struct jit_ctx::image is used the store a pointer to the jitted intructions, which are always little-endian. These instructions are thus correctly converted from native order to little-endian before being stored but the pointer 'image' is declared as for native order values. Fix this by declaring the field as __le32* instead of u32*. Same for the pointer used in jit_fill_hole() to initialize the image. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	1	-2/+5
	The conflicts were two cases of overlapping changes in batman-adv and the qed driver. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	bpf, arm64: take advantage of stack_depth tracking	Daniel Borkmann	1	-11/+11
	Make use of recently implemented stack_depth tracking for arm64 JIT, so that stack usage can be reduced heavily for programs not using tail calls at least. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-07	bpf, arm64: use separate register for state in stxr	Daniel Borkmann	1	-2/+5
	Will reported that in BPF_XADD we must use a different register in stxr instruction for the status flag due to otherwise CONSTRAINED UNPREDICTABLE behavior per architecture. Reference manual says [1]: If s == t, then one of the following behaviors must occur: * The instruction is UNDEFINED. * The instruction executes as a NOP. * The instruction performs the store to the specified address, but the value stored is UNKNOWN. Thus, use a different temporary register for the status flag to fix it. Disassembly extract from test 226/STX_XADD_DW from test_bpf.ko: [...] 0000003c: c85f7d4b ldxr x11, [x10] 00000040: 8b07016b add x11, x11, x7 00000044: c80c7d4b stxr w12, x11, [x10] 00000048: 35ffffac cbnz w12, 0x0000003c [...] [1] https://static.docs.arm.com/ddi0487/b/DDI0487B_a_armv8_arm.pdf, p.6132 Fixes: 85f68fe89832 ("bpf, arm64: implement jiting of BPF_XADD") Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06	bpf: Add jited_len to struct bpf_prog	Martin KaFai Lau	1	-0/+1
	Add jited_len to struct bpf_prog. It will be useful for the struct bpf_prog_info which will be added in the later patch. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31	bpf: free up BPF_JMP \| BPF_CALL \| BPF_X opcode	Alexei Starovoitov	1	-1/+1
	free up BPF_JMP \| BPF_CALL \| BPF_X opcode to be used by actual indirect call by register and use kernel internal opcode to mark call instruction into bpf_tail_call() helper. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-11	bpf, arm64: fix faulty emission of map access in tail calls	Daniel Borkmann	1	-2/+3
	Shubham was recently asking on netdev why in arm64 JIT we don't multiply the index for accessing the tail call map by 8. That led me into testing out arm64 JIT wrt tail calls and it turned out I got a NULL pointer dereference on the tail call. The buggy access is at: prog = array->ptrs[index]; if (prog == NULL) goto out; [...] 00000060: d2800e0a mov x10, #0x70 // #112 00000064: f86a682a ldr x10, [x1,x10] 00000068: f862694b ldr x11, [x10,x2] 0000006c: b40000ab cbz x11, 0x00000080 [...] The code triggering the crash is f862694b. x1 at the time contains the address of the bpf array, x10 offsetof(struct bpf_array, ptrs). Meaning, above we load the pointer to the program at map slot 0 into x10. x10 can then be NULL if the slot is not occupied, which we later on try to access with a user given offset in x2 that is the map index. Fix this by emitting the following instead: [...] 00000060: d2800e0a mov x10, #0x70 // #112 00000064: 8b0a002a add x10, x1, x10 00000068: d37df04b lsl x11, x2, #3 0000006c: f86b694b ldr x11, [x10,x11] 00000070: b40000ab cbz x11, 0x00000084 [...] This basically adds the offset to ptrs to the base address of the bpf array we got and we later on access the map with an index * 8 offset relative to that. The tail call map itself is basically one large area with meta data at the head followed by the array of prog pointers. This makes tail calls working again, tested on Cavium ThunderX ARMv8. Fixes: ddb55992b04d ("arm64: bpf: implement bpf_tail_call() helper") Reported-by: Shubham Bansal <illusionist.neo@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08	arm64: use set_memory.h header	Laura Abbott	1	-0/+1
	The set_memory_* functions have moved to set_memory.h. Use that header explicitly. Link: http://lkml.kernel.org/r/1488920133-27229-4-git-send-email-labbott@redhat.com Signed-off-by: Laura Abbott <labbott@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-02	bpf, arm64: fix jit branch offset related to ldimm64	Daniel Borkmann	1	-4/+4
	When the instruction right before the branch destination is a 64 bit load immediate, we currently calculate the wrong jump offset in the ctx->offset[] array as we only account one instruction slot for the 64 bit load immediate although it uses two BPF instructions. Fix it up by setting the offset into the right slot after we incremented the index. Before (ldimm64 test 1): [...] 00000020: 52800007 mov w7, #0x0 // #0 00000024: d2800060 mov x0, #0x3 // #3 00000028: d2800041 mov x1, #0x2 // #2 0000002c: eb01001f cmp x0, x1 00000030: 54ffff82 b.cs 0x00000020 00000034: d29fffe7 mov x7, #0xffff // #65535 00000038: f2bfffe7 movk x7, #0xffff, lsl #16 0000003c: f2dfffe7 movk x7, #0xffff, lsl #32 00000040: f2ffffe7 movk x7, #0xffff, lsl #48 00000044: d29dddc7 mov x7, #0xeeee // #61166 00000048: f2bdddc7 movk x7, #0xeeee, lsl #16 0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32 00000050: f2fdddc7 movk x7, #0xeeee, lsl #48 [...] After (ldimm64 test 1): [...] 00000020: 52800007 mov w7, #0x0 // #0 00000024: d2800060 mov x0, #0x3 // #3 00000028: d2800041 mov x1, #0x2 // #2 0000002c: eb01001f cmp x0, x1 00000030: 540000a2 b.cs 0x00000044 00000034: d29fffe7 mov x7, #0xffff // #65535 00000038: f2bfffe7 movk x7, #0xffff, lsl #16 0000003c: f2dfffe7 movk x7, #0xffff, lsl #32 00000040: f2ffffe7 movk x7, #0xffff, lsl #48 00000044: d29dddc7 mov x7, #0xeeee // #61166 00000048: f2bdddc7 movk x7, #0xeeee, lsl #16 0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32 00000050: f2fdddc7 movk x7, #0xeeee, lsl #48 [...] Also, add a couple of test cases to make sure JITs pass this test. Tested on Cavium ThunderX ARMv8. The added test cases all pass after the fix. Fixes: 8eee539ddea0 ("arm64: bpf: fix out-of-bounds read in bpf2a64_offset()") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-02	bpf, arm64: implement jiting of BPF_XADD	Daniel Borkmann	1	-5/+11
	This work adds BPF_XADD for BPF_W/BPF_DW to the arm64 JIT and therefore completes JITing of all BPF instructions, meaning we can thus also remove the 'notyet' label and do not need to fall back to the interpreter when BPF_XADD is used in a program! This now also brings arm64 JIT in line with x86_64, s390x, ppc64, sparc64, where all current eBPF features are supported. BPF_W example from test_bpf: .u.insns_int = { BPF_ALU32_IMM(BPF_MOV, R0, 0x12), BPF_ST_MEM(BPF_W, R10, -40, 0x10), BPF_STX_XADD(BPF_W, R10, R0, -40), BPF_LDX_MEM(BPF_W, R0, R10, -40), BPF_EXIT_INSN(), }, [...] 00000020: 52800247 mov w7, #0x12 // #18 00000024: 928004eb mov x11, #0xffffffffffffffd8 // #-40 00000028: d280020a mov x10, #0x10 // #16 0000002c: b82b6b2a str w10, [x25,x11] // start of xadd mapping: 00000030: 928004ea mov x10, #0xffffffffffffffd8 // #-40 00000034: 8b19014a add x10, x10, x25 00000038: f9800151 prfm pstl1strm, [x10] 0000003c: 885f7d4b ldxr w11, [x10] 00000040: 0b07016b add w11, w11, w7 00000044: 880b7d4b stxr w11, w11, [x10] 00000048: 35ffffab cbnz w11, 0x0000003c // end of xadd mapping: [...] BPF_DW example from test_bpf: .u.insns_int = { BPF_ALU32_IMM(BPF_MOV, R0, 0x12), BPF_ST_MEM(BPF_DW, R10, -40, 0x10), BPF_STX_XADD(BPF_DW, R10, R0, -40), BPF_LDX_MEM(BPF_DW, R0, R10, -40), BPF_EXIT_INSN(), }, [...] 00000020: 52800247 mov w7, #0x12 // #18 00000024: 928004eb mov x11, #0xffffffffffffffd8 // #-40 00000028: d280020a mov x10, #0x10 // #16 0000002c: f82b6b2a str x10, [x25,x11] // start of xadd mapping: 00000030: 928004ea mov x10, #0xffffffffffffffd8 // #-40 00000034: 8b19014a add x10, x10, x25 00000038: f9800151 prfm pstl1strm, [x10] 0000003c: c85f7d4b ldxr x11, [x10] 00000040: 8b07016b add x11, x11, x7 00000044: c80b7d4b stxr w11, x11, [x10] 00000048: 35ffffab cbnz w11, 0x0000003c // end of xadd mapping: [...] Tested on Cavium ThunderX ARMv8, test suite results after the patch: No JIT: [ 3751.855362] test_bpf: Summary: 311 PASSED, 0 FAILED, [0/303 JIT'ed] With JIT: [ 3573.759527] test_bpf: Summary: 311 PASSED, 0 FAILED, [303/303 JIT'ed] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-28	bpf, x86_64/arm64: remove old ldimm64 artifacts from jits	Daniel Borkmann	1	-9/+0
	For both cases, the verifier is already rejecting such invalid formed instructions. Thus, remove these artifacts from old times and align it with ppc64, sparc64 and s390x JITs that don't have them in the first place. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-21	bpf: fix unlocking of jited image when module ronx not set	Daniel Borkmann	1	-1/+1
	Eric and Willem reported that they recently saw random crashes when JIT was in use and bisected this to 74451e66d516 ("bpf: make jited programs visible in traces"). Issue was that the consolidation part added bpf_jit_binary_unlock_ro() that would unlock previously made read-only memory back to read-write. However, DEBUG_SET_MODULE_RONX cannot be used for this to test for presence of set_memory_*() functions. We need to use ARCH_HAS_SET_MEMORY instead to fix this; also add the corresponding bpf_jit_binary_lock_ro() to filter.h. Fixes: 74451e66d516 ("bpf: make jited programs visible in traces") Reported-by: Eric Dumazet <edumazet@google.com> Reported-by: Willem de Bruijn <willemb@google.com> Bisected-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17	bpf: make jited programs visible in traces	Daniel Borkmann	1	-15/+0
	Long standing issue with JITed programs is that stack traces from function tracing check whether a given address is kernel code through {__,}kernel_text_address(), which checks for code in core kernel, modules and dynamically allocated ftrace trampolines. But what is still missing is BPF JITed programs (interpreted programs are not an issue as __bpf_prog_run() will be attributed to them), thus when a stack trace is triggered, the code walking the stack won't see any of the JITed ones. The same for address correlation done from user space via reading /proc/kallsyms. This is read by tools like perf, but the latter is also useful for permanent live tracing with eBPF itself in combination with stack maps when other eBPF types are part of the callchain. See offwaketime example on dumping stack from a map. This work tries to tackle that issue by making the addresses and symbols known to the kernel. The lookup from *kernel_text_address() is implemented through a latched RB tree that can be read under RCU in fast-path that is also shared for symbol/size/offset lookup for a specific given address in kallsyms. The slow-path iteration through all symbols in the seq file done via RCU list, which holds a tiny fraction of all exported ksyms, usually below 0.1 percent. Function symbols are exported as bpf_prog_<tag>, in order to aide debugging and attribution. This facility is currently enabled for root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening is active in any mode. The rationale behind this is that still a lot of systems ship with world read permissions on kallsyms thus addresses should not get suddenly exposed for them. If that situation gets much better in future, we always have the option to change the default on this. Likewise, unprivileged programs are not allowed to add entries there either, but that is less of a concern as most such programs types relevant in this context are for root-only anyway. If enabled, call graphs and stack traces will then show a correct attribution; one example is illustrated below, where the trace is now visible in tooling such as perf script --kallsyms=/proc/kallsyms and friends. Before: 7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so) After: 7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux) [...] 7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so) Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17	bpf: remove stubs for cBPF from arch code	Daniel Borkmann	1	-5/+0
	Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make that a single __weak function in the core that can be overridden similarly to the eBPF one. Also remove stale pr_err() mentions of bpf_jit_compile. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-10	arm64: bpf: optimize LD_ABS, LD_IND	Zi Shen Lim	1	-3/+0
	Remove superfluous stack frame, saving us 3 instructions for every LD_ABS or LD_IND. Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-10	arm64: bpf: optimize JMP_CALL	Zi Shen Lim	1	-3/+0
	Remove superfluous stack frame, saving us 3 instructions for every JMP_CALL. Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-10	arm64: bpf: implement bpf_tail_call() helper	Zi Shen Lim	1	-9/+96
	Add support for JMP_CALL_X (tail call) introduced by commit 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs"). bpf_tail_call() arguments: ctx - context pointer passed to next program array - pointer to map which type is BPF_MAP_TYPE_PROG_ARRAY index - index inside array that selects specific program to run In this implementation arm64 JIT jumps into callee program after prologue, so callee program reuses the same stack. For tail_call_cnt, we use the callee-saved R26 (which was already saved/restored but previously unused by JIT). With this patch a tail call generates the following code on arm64: if (index >= array->map.max_entries) goto out; 34: mov x10, #0x10 // #16 38: ldr w10, [x1,x10] 3c: cmp w2, w10 40: b.ge 0x0000000000000074 if (tail_call_cnt > MAX_TAIL_CALL_CNT) goto out; tail_call_cnt++; 44: mov x10, #0x20 // #32 48: cmp x26, x10 4c: b.gt 0x0000000000000074 50: add x26, x26, #0x1 prog = array->ptrs[index]; if (prog == NULL) goto out; 54: mov x10, #0x68 // #104 58: ldr x10, [x1,x10] 5c: ldr x11, [x10,x2] 60: cbz x11, 0x0000000000000074 goto *(prog->bpf_func + prologue_size); 64: mov x10, #0x20 // #32 68: ldr x10, [x11,x10] 6c: add x10, x10, #0x20 70: br x10 74: Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-17	bpf: arm64: remove callee-save registers use for tmp registers	Yang Shi	1	-29/+5
	In the current implementation of ARM64 eBPF JIT, R23 and R24 are used for tmp registers, which are callee-saved registers. This leads to variable size of JIT prologue and epilogue. The latest blinding constant change prefers to constant size of prologue and epilogue. AAPCS reserves R9 ~ R15 for temp registers which not need to be saved/restored during function call. So, replace R23 and R24 to R10 and R11, and remove tmp_used flag to save 2 instructions for some jited BPF program. CC: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>