summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2015-06-09x86/mpx: Support 32-bit binaries on 64-bit kernelsDave Hansen2-53/+179
Right now, the kernel can only switch between 64-bit and 32-bit binaries at compile time. This patch adds support for 32-bit binaries on 64-bit kernels when we support ia32 emulation. We essentially choose which set of table sizes to use when doing arithmetic for the bounds table calculations. This also uses a different approach for calculating the table indexes than before. I think the new one makes it much more clear what is going on, and allows us to share more code between the 32-bit and 64-bit cases. Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183705.E01F21E2@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Use 32-bit-only cmpxchg() for 32-bit appsDave Hansen1-5/+36
user_atomic_cmpxchg_inatomic() actually looks at sizeof(*ptr) to figure out how many bytes to copy. If we run it on a 64-bit kernel with a 64-bit pointer, it will copy a 64-bit bounds directory entry. That's fine, except when we have 32-bit programs with 32-bit bounds directory entries and we only *want* 32-bits. This patch breaks the cmpxchg() operation out in to its own function and performs the 32-bit type swizzling in there. Note, the "64-bit" version of this code _would_ work on a 32-bit-only kernel. The issue this patch addresses is only for when the kernel's 'long' is mismatched from the size of the bounds directory entry of the process we are working on. The new helper modifies 'actual_old_val' or returns an error. But gcc doesn't know this, so it warns about 'actual_old_val' being unused. Shut it up with an uninitialized_var(). Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183705.672B115E@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Introduce new 'directory entry' to 'addr' helper functionDave Hansen2-8/+34
Currently, to get from a bounds directory entry to the virtual address of a bounds table, we simply mask off a few low bits. However, the set of bits we mask off is different for 32-bit and 64-bit binaries. This breaks the operation out in to a helper function and also adds a temporary variable to store the result until we are sure we are returning one. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183704.007686CE@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Add temporary variable to reduce maskingDave Hansen1-3/+4
When we allocate a bounds table, we call mmap(), then add a "valid" bit to the value before storing it in to the bounds directory. If we fail along the way, we go and mask that valid bit _back_ out. That seems a little silly, and this makes it much more clear when we have a plain address versus an actual table _entry_. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183704.3D69D5F4@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86: Make is_64bit_mm() widely availableDave Hansen2-9/+14
The uprobes code has a nice helper, is_64bit_mm(), that consults both the runtime and compile-time flags for 32-bit support. Instead of reinventing the wheel, pull it in to an x86 header so we can use it for MPX. I prefer passing the 'mm' around to test_thread_flag(TIF_IA32) because it makes it explicit where the context is coming from. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183704.F0209999@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Trace allocation of new bounds tablesDave Hansen2-0/+17
Bounds tables are a significant consumer of memory. It is important to know when they are being allocated. Add a trace point to trace whenever an allocation occurs and also its virtual address. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183704.EC23A93E@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Trace the attempts to find bounds tablesDave Hansen2-0/+34
There are two different events being traced here. They are doing similar things so share a trace "EVENT_CLASS" and are presented together. 1. Trace when MPX is zapping pages "mpx_unmap_zap": When MPX can not free an entire bounds table, it will instead try to zap unused parts of a bounds table to free the backing memory. This decreases RSS (resident set size) without decreasing the virtual space allocated for bounds tables. 2. Trace attempts to find bounds tables "mpx_unmap_search": This event traces any time we go looking to unmap a bounds table for a given virtual address range. This is useful to ensure that the kernel actually "tried" to free a bounds table versus times it succeeded in finding one. It might try and fail if it realized that a table was shared with an adjacent VMA which is not being unmapped. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183703.B9D2468B@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Trace entry to bounds exception pathsDave Hansen2-0/+35
There are two basic things that can happen as the result of a bounds exception (#BR): 1. We allocate a new bounds table 2. We pass up a bounds exception to userspace. This patch adds a trace point for the case where we are passing the exception up to userspace with a signal. We are also explicit that we're printing out the inverse of the 'upper' that we encounter. If you want to filter, for instance, you need to ~ the value first. The reason we do this is because of how 'upper' is stored in the bounds table. If a pointer's range is: 0x1000 -> 0x2000 it is stored in the bounds table as (32-bits here for brevity): lower: 0x00001000 upper: 0xffffdfff That is so that an all 0's entry: lower: 0x00000000 upper: 0x00000000 corresponds to the "init" bounds which store a *range* of: 0x00000000 -> 0xffffffff That is, by far, the common case, and that lets us use the zero page, or deduplicate the memory, etc... The 'upper' stored in the table is gibberish to print by itself, so we print ~upper to get the *actual*, logical, human-readable value printed out. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183703.027BB9B0@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Trace #BR exceptionsDave Hansen3-0/+55
This is the first in a series of MPX tracing patches. I've found these extremely useful in the process of debugging applications and the kernel code itself. This exception hooks in to the bounds (#BR) exception very early and allows capturing the key registers which would influence how the exception is handled. Note that bndcfgu/bndstatus are technically still 64-bit registers even in 32-bit mode. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183703.5FE2619A@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Introduce a boot-time disable flagDave Hansen2-0/+20
MPX has the _potential_ to cause some issues. Say part of your init system tried to protect one of its components from buffer overflows with MPX. If there were a false positive, it's possible that MPX could keep a system from booting. MPX could also potentially cause performance issues since it is present in hot paths like the unmap path. Allow it to be disabled at boot time. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150607183702.2E8B77AB@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Restrict the mmap() size check to bounds tablesDave Hansen1-2/+2
The comment and code here are confusing. We do not currently allocate the bounds directory in the kernel. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183702.222CEC2A@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASKQiaowei Ren1-1/+0
MPX_BNDCFG_ADDR_MASK is defined two times, so this patch removes redundant one. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183702.5F129376@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Clean up the code by not passing a task pointer around when unnecessaryDave Hansen5-29/+25
The MPX code can only work on the current task. You can not, for instance, enable MPX management in another process or thread. You can also not handle a fault for another process or thread. Despite this, we pass a task_struct around prolifically. This patch removes all of the task struct passing for code paths where the code can not deal with another task (which turns out to be all of them). This has no functional changes. It's just a cleanup. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bp@alien8.de Link: http://lkml.kernel.org/r/20150607183702.6A81DA2C@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/mpx: Use the new get_xsave_field_ptr()APIDave Hansen3-28/+27
The MPX registers (bndcsr/bndcfgu/bndstatus) are not directly accessible via normal instructions. They essentially act as if they were floating point registers and are saved/restored along with those registers. There are two main paths in the MPX code where we care about the contents of these registers: 1. #BR (bounds) faults 2. the prctl() code where we are setting MPX up Both of those paths _might_ be called without the FPU having been used. That means that 'tsk->thread.fpu.state' might never be allocated. Also, fpu_save_init() is not preempt-safe. It was a bug to call it without disabling preemption. The new get_xsave_addr() calls unlazy_fpu() instead and properly disables preemption. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave@sr71.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: bp@alien8.de Link: http://lkml.kernel.org/r/20150607183701.BC0D37CF@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/fpu/xstate: Wrap get_xsave_addr() to make it saferDave Hansen2-0/+33
The MPX code appears is calling a low-level FPU function (copy_fpregs_to_fpstate()). This function is not able to be called in all contexts, although it is safe to call directly in some cases. Although probably correct, the current code is ugly and potentially error-prone. So, add a wrapper that calls the (slightly) higher-level fpu__save() (which is preempt- safe) and also ensures that we even *have* an FPU context (in the case that this was called when in lazy FPU mode). Ingo had this to say about the details about when we need preemption disabled: > it's indeed generally unsafe to access/copy FPU registers with preemption enabled, > for two reasons: > > - on older systems that use FSAVE the instruction destroys FPU register > contents, which has to be handled carefully > > - even on newer systems if we copy to FPU registers (which this code doesn't) > then we don't want a context switch to occur in the middle of it, because a > context switch will write to the fpstate, potentially overwriting our new data > with old FPU state. > > But it's safe to access FPU registers with preemption enabled in a couple of > special cases: > > - potentially destructively saving FPU registers: the signal handling code does > this in copy_fpstate_to_sigframe(), because it can rely on the signal restore > side to restore the original FPU state. > > - reading FPU registers on modern systems: we don't do this anywhere at the > moment, mostly to keep symmetry with older systems where FSAVE is > destructive. > > - initializing FPU registers on modern systems: fpu__clear() does this. Here > it's safe because we don't copy from the fpstate. > > - directly writing FPU registers from user-space memory (!). We do this in > fpu__restore_sig(), and it's safe because neither context switches nor > irq-handler FPU use can corrupt the source context of the copy (which is > user-space memory). > > Note that the MPX code's current use of copy_fpregs_to_fpstate() was safe I think, > because: > > - MPX is predicated on eagerfpu, so the destructive F[N]SAVE instruction won't be > used. > > - the code was only reading FPU registers, and was doing it only in places that > guaranteed that an FPU state was already active (i.e. didn't do it in > kthreads) Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave@sr71.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: bp@alien8.de Link: http://lkml.kernel.org/r/20150607183700.AA881696@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-09x86/fpu/xstate: Fix up bad get_xsave_addr() assumptionsDave Hansen1-8/+37
get_xsave_addr() assumes that if an xsave bit is present in the hardware (pcntxt_mask) that it is present in a given xsave buffer. Due to an bug in the xsave code on all of the systems that have MPX (and thus all the users of this code), that has been a true assumption. But, the bug is getting fixed, so our assumption is not going to hold any more. It's quite possible (and normal) for an enabled state to be present on 'pcntxt_mask', but *not* in 'xstate_bv'. We need to consult 'xstate_bv'. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave@sr71.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150607183700.1E739B34@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Make WARN_ON_FPU() more robust in the !CONFIG_X86_DEBUG_FPU caseIngo Molnar1-1/+1
Make sure the WARN_ON_FPU() macro consumes the macro argument, to avoid 'unused variable' build warnings if the only use of a variable is in debugging code. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Simplify copy_kernel_to_xregs_booting()Ingo Molnar2-2/+3
copy_kernel_to_xregs_booting() has a second parameter that is the mask of xfeatures that should be copied - but this parameter is always -1. Simplify the call site of this function, this also makes it more similar to the function call signature of other copy_kernel_to*regs() functions. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Standardize the parameter type of copy_kernel_to_fpregs()Ingo Molnar3-11/+11
Bring the __copy_fpstate_to_fpregs() and copy_fpstate_to_fpregs() functions in line with the parameter passing convention of other kernel-to-FPU-registers copying functions: pass around an in-memory FPU register state pointer, instead of struct fpu *. NOTE: This patch also changes the assembly constraint of the FXSAVE-leak workaround from 'fpu->fpregs_active' to 'fpstate' - but that is fine, as we only need a valid memory address there for the FILDL instruction. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Remove error return values from copy_kernel_to_*regs() functionsIngo Molnar2-31/+14
None of the copy_kernel_to_*regs() FPU register copying functions are supposed to fail, and all of them have debugging checks that enforce this. Remove their return values and simplify their call sites, which have redundant error checks and error handling code paths. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Rename copy_fpstate_to_fpregs() to copy_kernel_to_fpregs()Ingo Molnar3-7/+7
Bring the __copy_fpstate_to_fpregs() and copy_fpstate_to_fpregs() functions in line with the naming of other kernel-to-FPU-registers copying functions. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Add debugging checks to all copy_kernel_to_*() functionsIngo Molnar1-8/+20
Copying from in-kernel FPU context buffers to FPU registers are never supposed to fault. Add debugging checks to copy_kernel_to_fxregs() and copy_kernel_to_fregs() to double check this assumption. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Add debugging check to fpu__restore()Ingo Molnar1-0/+2
The copy_fpstate_to_fpregs() function is never supposed to fail, so add a debugging check to its call site in fpu__restore(). Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Optimize fpu__activate_fpstate_write()Ingo Molnar1-32/+19
fpu__activate_fpstate_write() is used before ptrace writes to the fpstate context. Because it expects the modified registers to be reloaded on the nexts context switch, it's only valid to call this function for stopped child tasks. - add a debugging check for this assumption - remove code that only runs if the current task's FPU state needs to be saved, which cannot occur here - update comments to match the implementation Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Rename fpu__activate_fpstate() to fpu__activate_fpstate_write()Ingo Molnar3-5/+5
Remaining users of fpu__activate_fpstate() are all places that want to modify FPU registers, rename the function to fpu__activate_fpstate_write() according to this usage. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Optimize fpu__activate_fpstate_read()Ingo Molnar1-4/+1
fpu__activate_fpstate_read() is used before FPU registers are read from the fpstate by ptrace and core dumping. It's not necessary to unlazy non-current child tasks in this case, since the reading of registers is non-destructive. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Split out the fpu__activate_fpstate_read() methodIngo Molnar3-3/+33
Currently fpu__activate_fpstate() is used for two distinct purposes: - read access by ptrace and core dumping, where in the core dumping case the current task's FPU state may be examined as well. - write access by ptrace, which modifies FPU registers and expects the modified registers to be reloaded on the next context switch. Split out the reading side into fpu__activate_fpstate_read(). ( Note that this is just a pure duplication of fpu__activate_fpstate() for the time being, we'll optimize the new function in the next patch. ) Cc: Andy Lutomirski <luto@amacapital.net> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27x86/fpu: Fix FPU register read access to the current taskIngo Molnar3-25/+32
Bobby Powers reported the following FPU warning during ELF coredumping: WARNING: CPU: 0 PID: 27452 at arch/x86/kernel/fpu/core.c:324 fpu__activate_stopped+0x8a/0xa0() This warning unearthed an invalid assumption about fpu__activate_stopped() that I added in: 67e97fc2ec57 ("x86/fpu: Rename init_fpu() to fpu__unlazy_stopped() and add debugging check") the old init_fpu() function had an (intentional but obscure) side effect: when FPU registers are accessed for the current task, for reading, then it synchronized live in-register FPU state with the fpstate by saving it. So fix this bug by saving the FPU if we are the current task. We'll still warn in fpu__save() if this is called for not yet stopped child tasks, so the debugging check is still preserved. Also rename the function to fpu__activate_fpstate(), because it's not exclusively used for stopped tasks, but for the current task as well. ( Note that this bug calls for a cleaner separation of access-for-read and access-for-modification FPU methods, but we'll do that in separate patches. ) Reported-by: Bobby Powers <bobbypowers@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Micro-optimize the copy_xregs_to_kernel*() and ↵Ingo Molnar1-14/+23
copy_kernel_to_xregs*() functions The copy_xregs_to_kernel*() and copy_kernel_to_xregs*() functions are used to copy FPU registers to kernel memory and vice versa. They are never expected to fail, yet they have a return code, mostly because that way they can share the assembly macros with the copy*user*() functions. This error code is then silently ignored by the context switching and other code - which made the bug in: b8c1b8ea7b21 ("x86/fpu: Fix FPU state save area alignment bug") harder to fix than necessary. So remove the return values and check for no faults when FPU debugging is enabled in the .config. This improves the eagerfpu context switching fast path by a couple of instructions, when FPU debugging is disabled: ffffffff810407fa: 89 c2 mov %eax,%edx ffffffff810407fc: 48 0f ae 2f xrstor64 (%rdi) ffffffff81040800: 31 c0 xor %eax,%eax -ffffffff81040802: eb 0a jmp ffffffff8104080e <__switch_to+0x321> +ffffffff81040802: eb 16 jmp ffffffff8104081a <__switch_to+0x32d> ffffffff81040804: 31 c0 xor %eax,%eax ffffffff81040806: 48 0f ae 8b c0 05 00 fxrstor64 0x5c0(%rbx) ffffffff8104080d: 00 Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Improve the initialization logic of 'err' around xstate_fault() ↵Ingo Molnar1-6/+6
constraints There's a confusing aspect of how xstate_fault() constraints are handled by the FPU register/memory copying functions in fpu/internal.h: they use "0" (0) to signal that the asm code will not always set 'err' to a valid value. But 'err' is already initialized to 0 in C code, which is duplicated by the asm() constraint. Should the initialization value ever be changed, it might become subtly inconsistent with the not too clear asm() constraint. Use 'err' as the value of the input variable instead, to clarify this all. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Improve xstate_fault() handlingIngo Molnar1-10/+10
There are two problems with xstate_fault handling: - The xstate_fault() macro takes an argument, but that's propagated into the assembly named label as well. This is technically correct currently but might result in failures if anytime a more complex argument is used. So use a separate '_err' name instead for the label. - All the xstate_fault() using functions have an error variable named 'err', which is an output variable to the asm() they are using. The problem is, it's not always set by the asm(), in which case the compiler might optimize out its initialization, so that the C variable 'err' might become corrupted after the asm() - confusing anyone who tries to take advantage of this variable after the asm(). Mark it an input variable as well. This is a latent bug currently, but an upcoming debug patch will make use of 'err'. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Rename xstate related 'fx' references to 'xstate'Ingo Molnar1-12/+12
So the xstate code was probably first copied from the fxregs code, hence it carried over the 'fx' naming for the state pointer variable. But this is slightly confusing, as we usually on call the (legacy) MMX/SSE state 'fx', both in data structures and in the functions build around FXSAVE/FXRSTOR. So rename it to 'xstate' to make it more apparent what it is related to. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Fix fpu__init_system_xstate() commentsIngo Molnar1-8/+2
Remove obsolete comment about __init limitations: in the new code there aren't any. Also standardize the comment style in the function while at it. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Move the xstate copying functions into fpu/internal.hIngo Molnar2-192/+192
All the other register<-> memory copying functions are defined in fpu/internal.h, so move the xstate variants there too. Beyond being more consistent, this also allows FPU debugging checks to be added to them. (Because they can now use the macros defined in fpu/internal.h.) Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25Merge branch 'linus' into x86/fpuIngo Molnar184-693/+1356
Resolve semantic conflict in arch/x86/kvm/cpuid.c with: c447e76b4cab ("kvm/fpu: Enable eager restore kvm FPU for MPX") By removing the FPU internal include files. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-25x86/fpu: Fix FPU state save area alignment bugIngo Molnar1-1/+6
On most configs task-struct is cache line aligned, which makes the XSAVE area's 64-byte required alignment work out fine. But on some .config's task_struct is aligned only to 16 bytes (enforced by ARCH_MIN_TASKALIGN), which makes things like fpu__copy() (that XSAVEOPT uses) not work so well. I broke this in: 7366ed771f6e ("x86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again)") which embedded the fpstate in the task_struct. The alignment requirements of the FPU code were originally present in ARCH_MIN_TASKALIGN, which still has a value of 16, which was the alignment requirement of the FPU state area prior XSAVE. But this link was not documented (and not required) and the link got lost when the FPU state area was made dynamic years ago. With XSAVEOPT the minimum alignment requirment went up to 64 bytes, and the embedding of the FPU state area in task_struct exposed it again - and '16' was not increased to '64'. So fix this bug, but also try to address the underlying lost link of information that made it easier to happen: - document ARCH_MIN_TASKALIGN a bit better - use alignof() to recover the current alignment requirements. This would work in the future as well, should the alignment requirements go up to 128 bytes with things like AVX512. ( We should probably also use the vSMP alignment rules for all of x86, but that's for another patch. ) Reported-by: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-24Linux 4.1-rc5v4.1-rc5Linus Torvalds1-1/+1
2015-05-24Merge tag 'scsi-fixes' of ↵Linus Torvalds13-77/+71
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of five fixes: Two MAINTAINER email updates (urgent because the non-avagotech emails will start bouncing) an lpfc big endian oops fix, a 256 byte sector hang fix (to eliminate 256 byte sectors) and a storvsc fix which could cause test unit ready failures on bringup" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: MAINTAINERS: Revise lpfc maintainers for Avago Technologies ownership of Emulex MAINTAINERS, be2iscsi: change email domain sd: Disable support for 256 byte/sector disks lpfc: Fix breakage on big endian kernels storvsc: Set the SRB flags correctly when no data transfer is needed
2015-05-23Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds2-12/+29
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "One more fix from the timer departement: - Handle division of negative nanosecond values proper on 32bit. A recent cleanup wrecked the sign handling of the dividend and dropped the check for negative divisors" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ktime: Fix ktime_divns to do signed division
2015-05-23Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-1/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irqchip fix from Thomas Gleixner: "A fix for a GIC-V3 irqchip regression which prevents some systems from booting" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gicv3-its: ITS table size should not be smaller than PSZ
2015-05-23Merge branch 'for-linus' of ↵Linus Torvalds1-13/+20
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull two Ceph fixes from Sage Weil: "These fix an issue with the RBD notifications when there are topology changes in the cluster" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: Revert "libceph: clear r_req_lru_item in __unregister_linger_request()" libceph: request a new osdmap if lingering request maps to no osd
2015-05-23Merge branch 'for-linus-4.1' of ↵Linus Torvalds3-0/+38
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "I fixed up a regression from 4.0 where conversion between different raid levels would sometimes bail out without converting. Filipe tracked down a race where it was possible to double allocate chunks on the drive. Mark has a fix for fiemap. All three will get bundled off for stable as well" * 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix regression in raid level conversion Btrfs: fix racy system chunk allocation when setting block group ro btrfs: clear 'ret' in btrfs_check_shared() loop
2015-05-22Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds30-240/+190
Pull drm fixes from Dave Airlie: "Radeon has two displayport fixes, one for a regression. i915 regression flicker fix needed so 4.0 can get fixed. A bunch of msm fixes and a bunch of exynos fixes, these two are probably a bit larger than I'd like, but most of them seems pretty good" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (29 commits) drm/radeon: fix error flag checking in native aux path drm/radeon: retry dcpd fetch drm/msm/mdp5: fix incorrect parameter for msm_framebuffer_iova() drm/exynos: dp: Lower level of EDID read success message drm/exynos: cleanup exynos_drm_plane drm/exynos: 'win' is always unsigned drm/exynos: mixer: don't dump registers under spinlock drm/exynos: Consolidate return statements in fimd_bind() drm/exynos: Constify exynos_drm_crtc_ops drm/exynos: Fix build breakage on !DRM_EXYNOS_FIMD drm/exynos: mixer: Constify platform_device_id drm/exynos: mixer: cleanup pixelformat handling drm/exynos: mixer: also allow NV21 for the video processor drm/exynos: mixer: remove buffer count handling in vp_video_buffer() drm/exynos: plane: honor buffer offset for dma_addr drm/exynos: fb: use drm_format_num_planes to get buffer count drm/i915: fix screen flickering drm/msm: fix locking inconsistencies in gpu->destroy() drm/msm/dsi: Simplify the code to get the number of read byte drm/msm: Attach assigned encoder to eDP and DSI connectors ...
2015-05-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds51-105/+308
Pull networking fixes from David Miller: 1) Don't leak ipvs->sysctl_tbl, from Tommi Rentala. 2) Fix neighbour table entry leak in rocker driver, from Ying Xue. 3) Do not emit bonding notifications for unregistered interfaces, from Nicolas Dichtel. 4) Set ipv6 flow label properly when in TIME_WAIT state, from Florent Fourcot. 5) Fix regression in ipv6 multicast filter test, from Henning Rogge. 6) do_replace() in various footables netfilter modules is missing a check for 0 counters in the datastructure provided by the user. Fix from Dave Jones, and found with trinity. 7) Fix RCU bug in packet scheduler classifier module unloads, from Daniel Borkmann. 8) Avoid deadlock in tcp_get_info() by using u64_sync. From Eric Dumzaet. 9) Input packet processing can race with inetdev_destroy() teardown, fix potential OOPS in ip_error() by explicitly testing whether the inetdev is still attached. From Eric W Biederman. 10) MLDv2 parser in bridge multicast code breaks too early while parsing. Fix from Thadeu Lima de Souza Cascardo. 11) Asking for settings on non-zero PHYID doesn't work because we do not import the command structure from the user and use the PHYID provided there. Fix from Arun Parameswaran. 12) Fix UDP checksums with IPV6 RAW sockets, from Vlad Yasevich. 13) Missing NF_TABLES depends for TPROXY etc can cause build failures, fix from Florian Westphal. 14) Fix netfilter conntrack to handle RFC5961 challenge ACKs properly, from Jesper Dangaard Brouer. 15) If netlink autobind retry fails, we have to reset the sockets portid back to zero. From Herbert Xu. 16) VXLAN netns exit code unregisters using wrong device, from John W Linville. 17) Add some USB device IDs to ath3k and btusb bluetooth drivers, from Dmitry Tunin and Wen-chien Jesse Sung. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits) bridge: fix lockdep splat net: core: 'ethtool' issue with querying phy settings bridge: fix parsing of MLDv2 reports ARM: zynq: DT: Use the zynq binding with macb net: macb: Disable half duplex gigabit on Zynq net: macb: Document zynq gem dt binding ipv4: fill in table id when replacing a route cdc_ncm: Fix tx_bytes statistics ipv4: Avoid crashing in ip_error tcp: fix a potential deadlock in tcp_get_info() net: sched: fix call_rcu() race on classifier module unloads net: phy: Make sure phy_start() always re-enables the phy interrupts ipv6: fix ECMP route replacement ipv6: do not delete previously existing ECMP routes if add fails Revert "netfilter: bridge: query conntrack about skb dnat" netfilter: ensure number of counters is >0 in do_replace() netfilter: nfnetlink_{log,queue}: Register pernet in first place tcp: don't over-send F-RTO probes tcp: only undo on partial ACKs in CA_Loss net/ipv6/udp: Fix ipv6 multicast socket filter regression ...
2015-05-22Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds4-9/+6
Pull block fixes from Jens Axboe: "Three small fixes that have been picked up the last few weeks. Specifically: - Fix a memory corruption issue in NVMe with malignant user constructed request. From Christoph. - Kill (now) unused blk_queue_bio(), dm was changed to not need this anymore. From Mike Snitzer. - Always use blk_schedule_flush_plug() from the io_schedule() path when flushing a plug, fixing a !TASK_RUNNING warning with md. From Shaohua" * 'for-linus' of git://git.kernel.dk/linux-block: sched: always use blk_schedule_flush_plug in io_schedule_out nvme: fix kernel memory corruption with short INQUIRY buffers block: remove export for blk_queue_bio
2015-05-22Merge tag 'md/4.1-rc4-fixes' of git://neil.brown.name/mdLinus Torvalds3-3/+10
Pull md bugfixes from Neil Brown: "I have a few more raid5 bugfixes pending, but I want them to get a bit more review first. In the meantime: - one serious RAID0 data corruption - caused by recent bugfix that wasn't reviewed properly. - one raid5 fix in new code (a couple more of those to come). - one little fix to stop static analysis complaining about silly rcu annotation" * tag 'md/4.1-rc4-fixes' of git://neil.brown.name/md: md/bitmap: remove rcu annotation from pointer arithmetic. md/raid0: fix restore to sector variable in raid0_make_request raid5: fix broken async operation chain
2015-05-22Merge branch 'for-linus' of ↵Linus Torvalds6-4/+70
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "Updates for the input subsystem. The main change is that we tell joydev not to touch "absolute mice", such as VMware virtual mouse, as that produced bad result (cursor stuck in upper right corner) with games" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: smtpe-ts - wait 50mS until polling for pen-up Input: smtpe-ts - use msecs_to_jiffies() instead of HZ Input: joydev - don't classify the vmmouse as a joystick Input: vmmouse - do not reference non-existing version of X driver Input: alps - fix finger jumps on lifting 2 fingers on v7 touchpad Input: elantech - fix semi-mt protocol for v3 HW Input: sx8654 - fix memory allocation check
2015-05-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds1-12/+13
Pull another crypto fix from Herbert Xu: "Fix ICV corruption in s390/ghash when the same tfm is used by more than one thread" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: s390/ghash - Fix incorrect ghash icv buffer handling.
2015-05-22bridge: fix lockdep splatEric Dumazet1-0/+2
Following lockdep splat was reported : [ 29.382286] =============================== [ 29.382315] [ INFO: suspicious RCU usage. ] [ 29.382344] 4.1.0-0.rc0.git11.1.fc23.x86_64 #1 Not tainted [ 29.382380] ------------------------------- [ 29.382409] net/bridge/br_private.h:626 suspicious rcu_dereference_check() usage! [ 29.382455] other info that might help us debug this: [ 29.382507] rcu_scheduler_active = 1, debug_locks = 0 [ 29.382549] 2 locks held by swapper/0/0: [ 29.382576] #0: (((&p->forward_delay_timer))){+.-...}, at: [<ffffffff81139f75>] call_timer_fn+0x5/0x4f0 [ 29.382660] #1: (&(&br->lock)->rlock){+.-...}, at: [<ffffffffa0450dc1>] br_forward_delay_timer_expired+0x31/0x140 [bridge] [ 29.382754] stack backtrace: [ 29.382787] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.0-0.rc0.git11.1.fc23.x86_64 #1 [ 29.382838] Hardware name: LENOVO 422916G/LENOVO, BIOS A1KT53AUS 04/07/2015 [ 29.382882] 0000000000000000 3ebfc20364115825 ffff880666603c48 ffffffff81892d4b [ 29.382943] 0000000000000000 ffffffff81e124e0 ffff880666603c78 ffffffff8110bcd7 [ 29.383004] ffff8800785c9d00 ffff88065485ac58 ffff880c62002800 ffff880c5fc88ac0 [ 29.383065] Call Trace: [ 29.383084] <IRQ> [<ffffffff81892d4b>] dump_stack+0x4c/0x65 [ 29.383130] [<ffffffff8110bcd7>] lockdep_rcu_suspicious+0xe7/0x120 [ 29.383178] [<ffffffffa04520f9>] br_fill_ifinfo+0x4a9/0x6a0 [bridge] [ 29.383225] [<ffffffffa045266b>] br_ifinfo_notify+0x11b/0x4b0 [bridge] [ 29.383271] [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge] [ 29.383320] [<ffffffffa0450de8>] br_forward_delay_timer_expired+0x58/0x140 [bridge] [ 29.383371] [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge] [ 29.383416] [<ffffffff8113a033>] call_timer_fn+0xc3/0x4f0 [ 29.383454] [<ffffffff81139f75>] ? call_timer_fn+0x5/0x4f0 [ 29.383493] [<ffffffff8110a90f>] ? lock_release_holdtime.part.29+0xf/0x200 [ 29.383541] [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge] [ 29.383587] [<ffffffff8113a6a4>] run_timer_softirq+0x244/0x490 [ 29.383629] [<ffffffff810b68cc>] __do_softirq+0xec/0x670 [ 29.383666] [<ffffffff810b70d5>] irq_exit+0x145/0x150 [ 29.383703] [<ffffffff8189f506>] smp_apic_timer_interrupt+0x46/0x60 [ 29.383744] [<ffffffff8189d523>] apic_timer_interrupt+0x73/0x80 [ 29.383782] <EOI> [<ffffffff816f131f>] ? cpuidle_enter_state+0x5f/0x2f0 [ 29.383832] [<ffffffff816f131b>] ? cpuidle_enter_state+0x5b/0x2f0 Problem here is that br_forward_delay_timer_expired() is a timer handler, calling br_ifinfo_notify() which assumes either rcu_read_lock() or RTNL are held. Simplest fix seems to add rcu read lock section. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Josh Boyer <jwboyer@fedoraproject.org> Reported-by: Dominick Grift <dac.override@gmail.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-22net: core: 'ethtool' issue with querying phy settingsArun Parameswaran1-1/+9
When trying to configure the settings for PHY1, using commands like 'ethtool -s eth0 phyad 1 speed 100', the 'ethtool' seems to modify other settings apart from the speed of the PHY1, in the above case. The ethtool seems to query the settings for PHY0, and use this as the base to apply the new settings to the PHY1. This is causing the other settings of the PHY 1 to be wrongly configured. The issue is caused by the '_ethtool_get_settings()' API, which gets called because of the 'ETHTOOL_GSET' command, is clearing the 'cmd' pointer (of type 'struct ethtool_cmd') by calling memset. This clears all the parameters (if any) passed for the 'ETHTOOL_GSET' cmd. So the driver's callback is always invoked with 'cmd->phy_address' as '0'. The '_ethtool_get_settings()' is called from other files in the 'net/core'. So the fix is applied to the 'ethtool_get_settings()' which is only called in the context of the 'ethtool'. Signed-off-by: Arun Parameswaran <aparames@broadcom.com> Reviewed-by: Ray Jui <rjui@broadcom.com> Reviewed-by: Scott Branden <sbranden@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>