summaryrefslogtreecommitdiffstats
path: root/arch/x86/entry/entry_64.S
AgeCommit message (Collapse)AuthorFilesLines
2016-10-07Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina: "The usual rocket science from the trivial tree" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: tracing/syscalls: fix multiline in error message text lib/Kconfig.debug: fix DEBUG_SECTION_MISMATCH description doc: vfs: fix fadvise() sycall name x86/entry: spell EBX register correctly in documentation securityfs: fix securityfs_create_dir comment irq: Fix typo in tracepoint.xml
2016-10-03Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds1-38/+113
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull low-level x86 updates from Ingo Molnar: "In this cycle this topic tree has become one of those 'super topics' that accumulated a lot of changes: - Add CONFIG_VMAP_STACK=y support to the core kernel and enable it on x86 - preceded by an array of changes. v4.8 saw preparatory changes in this area already - this is the rest of the work. Includes the thread stack caching performance optimization. (Andy Lutomirski) - switch_to() cleanups and all around enhancements. (Brian Gerst) - A large number of dumpstack infrastructure enhancements and an unwinder abstraction. The secret long term plan is safe(r) live patching plus maybe another attempt at debuginfo based unwinding - but all these current bits are standalone enhancements in a frame pointer based debug environment as well. (Josh Poimboeuf) - More __ro_after_init and const annotations. (Kees Cook) - Enable KASLR for the vmemmap memory region. (Thomas Garnier)" [ The virtually mapped stack changes are pretty fundamental, and not x86-specific per se, even if they are only used on x86 right now. ] * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits) x86/asm: Get rid of __read_cr4_safe() thread_info: Use unsigned long for flags x86/alternatives: Add stack frame dependency to alternative_call_2() x86/dumpstack: Fix show_stack() task pointer regression x86/dumpstack: Remove dump_trace() and related callbacks x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder oprofile/x86: Convert x86_backtrace() to use the new unwinder x86/stacktrace: Convert save_stack_trace_*() to use the new unwinder perf/x86: Convert perf_callchain_kernel() to use the new unwinder x86/unwind: Add new unwind interface and implementations x86/dumpstack: Remove NULL task pointer convention fork: Optimize task creation by caching two thread stacks per CPU if CONFIG_VMAP_STACK=y sched/core: Free the stack early if CONFIG_THREAD_INFO_IN_TASK lib/syscall: Pin the task stack in collect_syscall() x86/process: Pin the target stack in get_wchan() x86/dumpstack: Pin the target stack when dumping it kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function sched/core: Add try_get_task_stack() and put_task_stack() x86/entry/64: Fix a minor comment rebase error iommu/amd: Don't put completion-wait semaphore on stack ...
2016-09-30x86/entry/64: Fix context tracking state warning when load_gs_index failsWanpeng Li1-2/+2
This warning: WARNING: CPU: 0 PID: 3331 at arch/x86/entry/common.c:45 enter_from_user_mode+0x32/0x50 CPU: 0 PID: 3331 Comm: ldt_gdt_64 Not tainted 4.8.0-rc7+ #13 Call Trace: dump_stack+0x99/0xd0 __warn+0xd1/0xf0 warn_slowpath_null+0x1d/0x20 enter_from_user_mode+0x32/0x50 error_entry+0x6d/0xc0 ? general_protection+0x12/0x30 ? native_load_gs_index+0xd/0x20 ? do_set_thread_area+0x19c/0x1f0 SyS_set_thread_area+0x24/0x30 do_int80_syscall_32+0x7c/0x220 entry_INT80_compat+0x38/0x50 ... can be reproduced by running the GS testcase of the ldt_gdt test unit in the x86 selftests. do_int80_syscall_32() will call enter_form_user_mode() to convert context tracking state from user state to kernel state. The load_gs_index() call can fail with user gsbase, gsbase will be fixed up and proceed if this happen. However, enter_from_user_mode() will be called again in the fixed up path though it is context tracking kernel state currently. This patch fixes it by just fixing up gsbase and telling lockdep that IRQs are off once load_gs_index() failed with user gsbase. Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1475197266-3440-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-29x86/entry: spell EBX register correctly in documentationNicolas Iooss1-1/+1
As EBS does not mean anything reasonable in the context it is used, it seems like a misspelling for EBX. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-09-16x86/entry/64: Fix a minor comment rebase errorAndy Lutomirski1-1/+0
When I rebased my thread_info changes onto Brian's switch_to() changes, I carefully checked that I fixed up all the code correctly, but I missed a comment :( Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jann Horn <jann@thejh.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 15f4eae70d36 ("x86: Move thread_info into task_struct") Link: http://lkml.kernel.org/r/089fe1e1cbe8b258b064fccbb1a5a5fd23861031.1474003868.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15x86: Move thread_info into task_structAndy Lutomirski1-2/+5
Now that most of the thread_info users have been cleaned up, this is straightforward. Most of this code was written by Linus. Originally-from: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jann Horn <jann@thejh.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a50eab40abeaec9cb9a9e3cbdeafd32190206654.1473801993.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-13x86/entry/64: Clean up and document espfix64 stack setupAndy Lutomirski1-11/+53
The espfix64 setup code was a bit inscrutible and contained an unnecessary push of RAX. Remove that push, update all the stack offsets to match, and document the whole mess. Reported-By: Borislav Petkov <bp@alien8.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e5459eb10cf1175c8b36b840bc425f210d045f35.1473717910.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24sched/x86: Pass kernel thread parameters in 'struct fork_frame'Brian Gerst1-20/+17
Instead of setting up a fake pt_regs context, put the kernel thread function pointer and arg into the unused callee-restored registers of 'struct fork_frame'. Signed-off-by: Brian Gerst <brgerst@gmail.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1471106302-10159-6-git-send-email-brgerst@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24sched/x86: Rewrite the switch_to() codeBrian Gerst1-3/+38
Move the low-level context switch code to an out-of-line asm stub instead of using complex inline asm. This allows constructing a new stack frame for the child process to make it seamlessly flow to ret_from_fork without an extra test and branch in __switch_to(). It also improves code generation for __schedule() by using the C calling convention instead of clobbering all registers. Signed-off-by: Brian Gerst <brgerst@gmail.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1471106302-10159-5-git-send-email-brgerst@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-18Merge branch 'x86/urgent' into x86/asm, to pick up fixesIngo Molnar1-5/+20
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10x86/entry: Clarify the RF saving/restoring situation with SYSCALL/SYSRETBorislav Petkov1-5/+9
Clarify why exactly RF cannot be restored properly by SYSRET to avoid confusion. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160803171429.GA2590@nazgul.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10x86, kasan, ftrace: Put APIC interrupt handlers into .irqentry.textAlexander Potapenko1-0/+11
Dmitry Vyukov has reported unexpected KASAN stackdepot growth: https://github.com/google/kasan/issues/36 ... which is caused by the APIC handlers not being present in .irqentry.text: When building with CONFIG_FUNCTION_GRAPH_TRACER=y or CONFIG_KASAN=y, put the APIC interrupt handlers into the .irqentry.text section. This is needed because both KASAN and function graph tracer use __irqentry_text_start and __irqentry_text_end to determine whether a function is an IRQ entry point. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: aryabinin@virtuozzo.com Cc: kasan-dev@googlegroups.com Cc: kcc@google.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1468575763-144889-1-git-send-email-glider@google.com [ Minor edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-01x86/entry: Remove duplicated commentBorislav Petkov1-2/+1
Ok, ok, we see it is called from C :-) Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160801100502.29796-1-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/dumpstack: When OOPSing, rewind the stack before do_exit()Andy Lutomirski1-0/+11
If we call do_exit() with a clean stack, we greatly reduce the risk of recursive oopses due to stack overflow in do_exit, and we allow do_exit to work even if we OOPS from an IST stack. The latter gives us a much better chance of surviving long enough after we detect a stack overflow to write out our logs. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/32f73ceb372ec61889598da5e5b145889b9f2e19.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05x86/entry, sched/x86: Don't save/restore EFLAGS on task switchBrian Gerst1-3/+0
Now that NT is filtered by the SYSENTER entry code, it is safe to skip saving and restoring flags on task switch. Also remove a leftover reset of flags on 64-bit fork. Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1462416278-11974-2-git-send-email-brgerst@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29x86/segments/64: When load_gs_index fails, clear the baseAndy Lutomirski1-0/+6
On AMD CPUs, a failed load_gs_base currently may not clear the FS base. Fix it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1a6c4d3a8a4e7be79ba448b42685e0321d50c14c.1461698311.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13x86/entry/64: Make gs_change a local labelBorislav Petkov1-5/+5
... so that it doesn't appear in objdump output. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rudolf Marek <r.marek@assembler.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/b9c532a0e5f8d56dede2bd59767d40024d5a75e2.1460075211.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13x86/cpu: Add Erratum 88 detection on AMDBorislav Petkov1-1/+1
Erratum 88 affects old AMD K8s, where a SWAPGS fails to cause an input dependency on GS. Therefore, we need to MFENCE before it. But that MFENCE is expensive and unnecessary on the remaining x86 CPUs out there so patch it out on the CPUs which don't require it. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Andy Lutomirski <luto@kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rudolf Marek <r.marek@assembler.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/aec6b2df1bfc56101d4e9e2e5d5d570bf41663c6.1460075211.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10x86/entry: Improve system call entry commentsAndy Lutomirski1-0/+10
Ingo suggested that the comments should explain when the various entries are used. This adds these explanations and improves other parts of the comments. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/9524ecef7a295347294300045d08354d6a57c6e7.1457578375.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-01x86/entry/64: Fix fast-path syscall return register stateAndy Lutomirski1-1/+3
I was fishing RIP (i.e. RCX) out of pt_regs->cx and RFLAGS (i.e. R11) out of pt_regs->r11. While it usually worked (pt_regs started out with CX == IP and R11 == FLAGS), it was very fragile. In particular, it broke sys_iopl() because sys_iopl() forgot to mark itself as using ptregs. Undo that part of the syscall rework. There was no compelling reason to do it this way. While I'm at it, load RCX and R11 before the other regs to be a little friendlier to the CPU, as they will be the first of the reloaded registers to be used. Reported-and-tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 1e423bff959e x86/entry/64: ("Migrate the 64-bit syscall slow path to C") Link: http://lkml.kernel.org/r/a85f8360c397e48186a9bc3e565ad74307a7b011.1454261517.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-01x86/entry/64: Fix an IRQ state error on ptregs-using syscallsAndy Lutomirski1-6/+13
I messed up the IRQ state when jumping off the fast path due to invocation of a ptregs-using syscall. This bug shouldn't have had any impact yet, but it would have caused problems with subsequent context tracking cleanups. Reported-and-tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 1e423bff959e x86/entry/64: ("Migrate the 64-bit syscall slow path to C") Link: http://lkml.kernel.org/r/ab92cd365fb7b0a56869e920017790d96610fdca.1454261517.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-29x86/entry/64: Migrate the 64-bit syscall slow path to CAndy Lutomirski1-78/+39
This is more complicated than the 32-bit and compat cases because it preserves an asm fast path for the case where the callee-saved regs aren't needed in pt_regs and no entry or exit work needs to be done. This appears to slow down fastpath syscalls by no more than one cycle on my Skylake laptop. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/ce2335a4d42dc164b24132ee5e8c7716061f947b.1454022279.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-29x86/entry/64: Stop using int_ret_from_sys_call in ret_from_forkAndy Lutomirski1-16/+19
ret_from_fork is now open-coded and is no longer tangled up with the syscall code. This isn't so bad -- this adds very little code, and IMO the result is much easier to understand. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a0747e2a5e47084655a1e96351c545b755c41fa7.1454022279.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-29x86/entry/64: Call all native slow-path syscalls with full pt-regsAndy Lutomirski1-78/+1
This removes all of the remaining asm syscall stubs except for stub_ptregs_64. Entries in the main syscall table are now all callable from C. The resulting asm is every bit as ridiculous as it looks. The next few patches will clean it up. This patch is here to let reviewers rest their brains and for bisection. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a6b3801be0d505d50aefabda02d3b93efbfc9c73.1454022279.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-29x86/entry/64: Always run ptregs-using syscalls on the slow pathAndy Lutomirski1-14/+42
64-bit syscalls currently have an optimization in which they are called with partial pt_regs. A small handful require full pt_regs. In the 32-bit and compat cases, I cleaned this up by forcing full pt_regs for all syscalls. The performance hit doesn't really matter as the affected system calls are fundamentally heavy and this is the 32-bit compat case. I want to clean up the 64-bit case as well, but I don't want to hurt fast path performance. To do that, I want to force the syscalls that use pt_regs onto the slow path. This will enable us to make slow path syscalls be real ABI-compliant C functions. Use the new syscall entry qualification machinery for this. 'stub_clone' is now 'stub_clone/ptregs'. The next patch will eliminate the stubs, and we'll just have 'sys_clone/ptregs'. As of this patch, two-phase entry tracing is no longer used. It has served its purpose (namely a huge speedup on some workloads prior to more general opportunistic SYSRET support), and once the dust settles I'll send patches to back it out. The implementation is heavily based on a patch from Brian Gerst: http://lkml.kernel.org/g/1449666173-15366-1-git-send-email-brgerst@gmail.com Originally-From: Brian Gerst <brgerst@gmail.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/b9beda88460bcefec6e7d792bd44eca9b760b0c4.1454022279.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-24x86/entry/64: Bypass enter_from_user_mode on non-context-tracking bootsAndy Lutomirski1-6/+2
On CONFIG_CONTEXT_TRACKING kernels that have context tracking disabled at runtime (which includes most distro kernels), we still have the overhead of a call to enter_from_user_mode in interrupt and exception entries. If jump labels are available, this uses the jump label infrastructure to skip the call. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/73ee804fff48cd8c66b65b724f9f728a11a8c686.1447361906.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-24x86/entry/64: Fix irqflag tracing wrt context trackingAndy Lutomirski1-1/+18
Paolo pointed out that enter_from_user_mode could be called while irqflags were traced as though IRQs were on. In principle, this could confuse lockdep. It doesn't cause any problems that I've seen in any configuration, but if I build with CONFIG_DEBUG_LOCKDEP=y, enable a nohz_full CPU, and add code like: if (irqs_disabled()) { spin_lock(&something); spin_unlock(&something); } to the top of enter_from_user_mode, then lockdep will complain without this fix. It seems that lockdep's irqflags sanity checks are too weak to detect this bug without forcing the issue. This patch adds one byte to normal kernels, and it's IMO a bit ugly. I haven't spotted a better way to do this yet, though. The issue is that we can't do TRACE_IRQS_OFF until after SWAPGS (if needed), but we're also supposed to do it before calling C code. An alternative approach would be to call trace_hardirqs_off in enter_from_user_mode. That would be less code and would not bloat normal kernels at all, but it would be harder to see how the code worked. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/86237e362390dfa6fec12de4d75a238acb0ae787.1447361906.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09x86/entry/64/compat: Migrate the body of the syscall entry to CAndy Lutomirski1-1/+1
Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/a2f0fce68feeba798a24339b5a7ec1ec2dd9eaf7.1444091585.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09x86/entry/64/compat: Set up full pt_regs for all compat syscallsAndy Lutomirski1-5/+1
This is conceptually simpler. More importantly, it eliminates the PTREGSCALL and execve stubs, which were not compatible with the C ABI. This means that C code can call through the compat syscall table. The execve stubs are a bit subtle. They did two things: they cleared some registers and they forced slow-path return. Neither is necessary any more: elf_common_init clears the extra registers and start_thread calls force_iret(). Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/f95b7f7dfaacf88a8cae85bb06226cae53769287.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-07x86/entry, locking/lockdep: Move lockdep_sys_exit() to ↵Andy Lutomirski1-1/+0
prepare_exit_to_usermode() Rather than worrying about exactly where LOCKDEP_SYS_EXIT should go in the asm code, add it to prepare_exit_from_usermode() and remove all of the asm calls that are followed by prepare_exit_to_usermode(). LOCKDEP_SYS_EXIT now appears only in the syscall fast paths. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1736ebe948b845e68120b86b89091f3ec27f5e8e.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-22x86/nmi/64: Fix a paravirt stack-clobbering bug in the NMI codeAndy Lutomirski1-1/+4
The NMI entry code that switches to the normal kernel stack needs to be very careful not to clobber any extra stack slots on the NMI stack. The code is fine under the assumption that SWAPGS is just a normal instruction, but that assumption isn't really true. Use SWAPGS_UNSAFE_STACK instead. This is part of a fix for some random crashes that Sasha saw. Fixes: 9b6e6a8334d5 ("x86/nmi/64: Switch stacks on userspace NMI entry") Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/974bc40edffdb5c2950a5c4977f821a446b76178.1442791737.git.luto@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-22x86/paravirt: Replace the paravirt nop with a bona fide empty functionAndy Lutomirski1-0/+11
PARAVIRT_ADJUST_EXCEPTION_FRAME generates this code (using nmi as an example, trimmed for readability): ff 15 00 00 00 00 callq *0x0(%rip) # 2796 <nmi+0x6> 2792: R_X86_64_PC32 pv_irq_ops+0x2c That's a call through a function pointer to regular C function that does nothing on native boots, but that function isn't protected against kprobes, isn't marked notrace, and is certainly not guaranteed to preserve any registers if the compiler is feeling perverse. This is bad news for a CLBR_NONE operation. Of course, if everything works correctly, once paravirt ops are patched, it gets nopped out, but what if we hit this code before paravirt ops are patched in? This can potentially cause breakage that is very difficult to debug. A more subtle failure is possible here, too: if _paravirt_nop uses the stack at all (even just to push RBP), it will overwrite the "NMI executing" variable if it's called in the NMI prologue. The Xen case, perhaps surprisingly, is fine, because it's already written in asm. Fix all of the cases that default to paravirt_nop (including adjust_exception_frame) with a big hammer: replace paravirt_nop with an asm function that is just a ret instruction. The Xen case may have other problems, so document them. This is part of a fix for some random crashes that Sasha saw. Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/8f5d2ba295f9d73751c33d97fda03e0495d9ade0.1442791737.git.luto@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-31Merge branch 'x86/urgent' into x86/asm, before applying dependent patchesIngo Molnar1-100/+199
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-17x86/entry/64, x86/nmi/64: Add CONFIG_DEBUG_ENTRY NMI testing codeAndy Lutomirski1-0/+15
It turns out to be rather tedious to test the NMI nesting code. Make it easier: add a new CONFIG_DEBUG_ENTRY option that causes the NMI handler to pre-emptively unmask NMIs. With this option set, errors in the repeat_nmi logic or failures to detect that we're in a nested NMI will result in quick panics under perf (especially if multiple counters are running at high frequency) instead of requiring an unusual workload that generates page faults or breakpoints inside NMIs. I called it CONFIG_DEBUG_ENTRY instead of CONFIG_DEBUG_NMI_ENTRY because I want to add new non-NMI checks elsewhere in the entry code in the future, and I'd rather not add too many new config options or add this option and then immediately rename it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-17x86/nmi/64: Make the "NMI executing" variable more consistentAndy Lutomirski1-6/+5
Currently, "NMI executing" is one the first time an outermost NMI hits repeat_nmi and zero thereafter. Change it to be zero each time for consistency. This is intended to help NMI handling fail harder if it's buggy. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-17x86/nmi/64: Minor asm simplificationAndy Lutomirski1-2/+1
Replace LEA; MOV with an equivalent SUB. This saves one instruction. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-17x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI detectionAndy Lutomirski1-4/+25
We have a tricky bug in the nested NMI code: if we see RSP pointing to the NMI stack on NMI entry from kernel mode, we assume that we are executing a nested NMI. This isn't quite true. A malicious userspace program can point RSP at the NMI stack, issue SYSCALL, and arrange for an NMI to happen while RSP is still pointing at the NMI stack. Fix it with a sneaky trick. Set DF in the region of code that the RSP check is intended to detect. IRET will clear DF atomically. ( Note: other than paravirt, there's little need for all this complexity. We could check RIP instead of RSP. ) Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-17x86/nmi/64: Reorder nested NMI checksAndy Lutomirski1-16/+18
Check the repeat_nmi .. end_repeat_nmi special case first. The next patch will rework the RSP check and, as a side effect, the RSP check will no longer detect repeat_nmi .. end_repeat_nmi, so we'll need this ordering of the checks. Note: this is more subtle than it appears. The check for repeat_nmi .. end_repeat_nmi jumps straight out of the NMI code instead of adjusting the "iret" frame to force a repeat. This is necessary, because the code between repeat_nmi and end_repeat_nmi sets "NMI executing" and then writes to the "iret" frame itself. If a nested NMI comes in and modifies the "iret" frame while repeat_nmi is also modifying it, we'll end up with garbage. The old code got this right, as does the new code, but the new code is a bit more explicit. If we were to move the check right after the "NMI executing" check, then we'd get it wrong and have random crashes. ( Because the "NMI executing" check would jump to the code that would modify the "iret" frame without checking if the interrupted NMI was currently modifying it. ) Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-17x86/nmi/64: Improve nested NMI commentsAndy Lutomirski1-66/+92
I found the nested NMI documentation to be difficult to follow. Improve the comments. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-17x86/nmi/64: Switch stacks on userspace NMI entryAndy Lutomirski1-4/+58
Returning to userspace is tricky: IRET can fail, and ESPFIX can rearrange the stack prior to IRET. The NMI nesting fixup relies on a precise stack layout and atomic IRET. Rather than trying to teach the NMI nesting fixup to handle ESPFIX and failed IRET, punt: run NMIs that came from user mode on the normal kernel stack. This will make some nested NMIs visible to C code, but the C code is okay with that. As a side effect, this should speed up perf: it eliminates an RDMSR when NMIs come from user mode. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-17x86/nmi/64: Remove asm code that saves CR2Andy Lutomirski1-17/+0
Now that do_nmi saves CR2, we don't need to save it in asm. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-07x86/entry: Remove SCHEDULE_USER and asm/context-tracking.hAndy Lutomirski1-1/+0
SCHEDULE_USER is no longer used, and asm/context-tracking.h contained nothing else. Remove the header entirely. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/854e9b45f69af20e26c47099eb236321563ebcee.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-07x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old ↵Andy Lutomirski1-46/+18
assembly code Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/60e90901eee611e59e958bfdbbe39969b4f88fe5.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-07x86/asm/entry/64: Simplify IRQ stack pt_regs handlingAndy Lutomirski1-5/+3
There's no need for both RSI and RDI to point to the original stack. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/3a0481f809dd340c7d3f54ce3fd6d66ef2a578cd.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-07x86/asm/entry/64: Save all regs on interrupt entryAndy Lutomirski1-20/+9
To prepare for the big rewrite of the error and interrupt exit paths, we will need pt_regs completely filled in. It's already completely filled in when error_exit runs, so rearrange interrupt handling to match it. This will slow down interrupt handling very slightly (eight instructions), but the simplification it enables will be more than worth it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/d8a766a7f558b30e6e01352854628a2d9943460c.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-07x86/entry/64: Migrate 64-bit and compat syscalls to the new exit handlers ↵Andy Lutomirski1-61/+8
and remove old assembly code These need to be migrated together, as the compat case used to jump into the middle of the 64-bit exit code. Remove the old assembly code. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/d4d1d70de08ac3640badf50048a9e8f18fe2497f.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-07x86/entry/64: Really create an error-entry-from-usermode code pathAndy Lutomirski1-12/+16
In 539f51136500 ("x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code"), I arranged the code slightly wrong -- IRET faults would skip the code path that was intended to execute on all error entries from user mode. Fix it up. While we're at it, make all the labels in error_entry local. This does not fix a bug, but we'll need it, and it slightly shrinks the code. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/91e17891e49fa3d61357eadc451529ad48143ee1.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-07x86/entry/64/compat: Fix bad fast syscall arg failure pathAndy Lutomirski1-1/+1
If user code does SYSCALL32 or SYSENTER without a valid stack, then our attempt to determine the syscall args will result in a failed uaccess fault. Previously, we would try to recover by jumping to the syscall exit code, but we'd run the syscall exit work even though we never made it to the syscall entry work. Clean it up by treating the failure path as a non-syscall entry and exit pair. This fixes strace's output when running the syscall_arg_fault test. Without this fix, strace would get out of sync and would fail to associate syscall entries with syscall exits. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/903010762c07a3d67df914fea2da84b52b0f8f1d.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-10x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode codeAndy Lutomirski1-8/+31
The error_entry/error_exit code to handle gsbase and whether we return to user mdoe was a mess: - error_sti was misnamed. In particular, it did not enable interrupts. - Error handling for gs_change was hopelessly tangled the normal usermode path. Separate it out. This saves a branch in normal entries from kernel mode. - The comments were bad. Fix it up. As a nice side effect, there's now a code path that happens on error entries from user mode. We'll use it soon. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/f1be898ab93360169fb845ab85185948832209ee.1433878454.git.luto@kernel.org [ Prettified it, clarified comments some more. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-08x86/asm/entry/64: Clean up entry_64.SIngo Molnar1-416/+404
Make the 64-bit syscall entry code a bit more readable: - use consistent assembly coding style similar to the other entry_*.S files - remove old comments that are not true anymore - eliminate whitespace noise - use consistent vertical spacing - fix various comments - reorganize entry point generation tables to be more readable No code changed: # arch/x86/entry/entry_64.o: text data bss dec hex filename 12282 0 0 12282 2ffa entry_64.o.before 12282 0 0 12282 2ffa entry_64.o.after md5: cbab1f2d727a2a8a87618eeb79f391b7 entry_64.o.before.asm cbab1f2d727a2a8a87618eeb79f391b7 entry_64.o.after.asm Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>