linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2018-08-13	Merge branches 'fixes', 'misc' and 'spectre' into for-linus	Russell King	15	-145/+101
	Conflicts: arch/arm/include/asm/uaccess.h Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-08-13	parisc: Fix and improve kernel stack unwinding	Helge Deller	10	-232/+88
	This patchset fixes and improves stack unwinding a lot: 1. Show backward stack traces with up to 30 callsites 2. Add callinfo to ENTRY_CFI() such that every assembler function will get an entry in the unwind table 3. Use constants instead of numbers in call_on_stack() 4. Do not depend on CONFIG_KALLSYMS to generate backtraces. 5. Speed up backtrace generation Make sure you have this patch to GNU as installed: https://sourceware.org/ml/binutils/2018-07/msg00474.html Without this patch, unwind info in the kernel is often wrong for various functions. Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-13	parisc: Remove unnecessary barriers from spinlock.h	John David Anglin	1	-6/+2
	Now that mb() is an instruction barrier, it will slow performance if we issue unnecessary barriers. The spinlock defines have a number of unnecessary barriers. The __ldcw() define is both a hardware and compiler barrier. The mb() barriers in the routines using __ldcw() serve no purpose. The only barrier needed is the one in arch_spin_unlock(). We need to ensure all accesses are complete prior to releasing the lock. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-13	parisc: Remove ordered stores from syscall.S	John David Anglin	1	-12/+12
	Now that we use a sync prior to releasing the locks in syscall.S, we don't need the PA 2.0 ordered stores used to release some locks. Using an ordered store, potentially slows the release and subsequent code. There are a number of other ordered stores and loads that serve no purpose. I have converted these to normal stores. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-13	parisc: prefer _THIS_IP_ and _RET_IP_ statement expressions	Nick Desaulniers	1	-2/+2
	As part of the effort to reduce the code duplication between _THIS_IP_ and current_text_addr(), let's consolidate callers of current_text_addr() to use _THIS_IP_. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-13	parisc: Add HAVE_REGS_AND_STACK_ACCESS_API feature	Helge Deller	3	-0/+112
	Some parts of the HAVE_REGS_AND_STACK_ACCESS_API feature is needed for the rseq syscall. This patch adds the most important parts, and as long as we don't support kprobes, we should be fine. Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-13	parisc: Drop architecture-specific ENOTSUP define	Helge Deller	1	-1/+0
	parisc is the only Linux architecture which has defined a value for ENOTSUP. All other architectures #define ENOTSUP as EOPNOTSUPP in their libc headers. Having an own value for ENOTSUP which is different than EOPNOTSUPP often gives problems with userspace programs which expect both to be the same. One such example is a build error in the libuv package, as can be seen in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900237. Since we dropped HP-UX support, there is no real benefit in keeping an own value for ENOTSUP. This patch drops the parisc value for ENOTSUP from the kernel sources. glibc needs no patch, it reuses the exported headers. Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-13	parisc: use generic dma_noncoherent_ops	Christoph Hellwig	4	-139/+16
	Switch to the generic noncoherent direct mapping implementation. Fix sync_single_for_cpu to do skip the cache flush unless the transfer is to the device to match the more tested unmap_single path which should have the same cache coherency implications. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-13	parisc: always use flush_kernel_dcache_range for DMA cache maintainance	Christoph Hellwig	1	-3/+3
	Current the S/G list based DMA ops use flush_kernel_vmap_range which contains a few UP optimizations, while the rest of the DMA operations uses flush_kernel_dcache_range. The single vs sg operations are supposed to have the same effect, so they should use the same routines. Use the more conservation version for now, but if people more familiar with parisc think the vmap version is generally fine for DMA we should switch all interfaces over to it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-13	parisc: merge pcx_dma_ops and pcxl_dma_ops	Christoph Hellwig	4	-59/+43
	The only difference is that pcxl supports dma coherent allocations, while pcx only supports non-consistent allocations and otherwise fails. But dma_alloc* is not in the fast path, and merging these two allows an easy migration path to the generic dma-noncoherent implementation, so do it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-11	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	David S. Miller	2	-2/+0

2018-08-10	lib/ubsan: remove null-pointer checks	Andrey Ryabinin	2	-2/+0
	With gcc-8 fsanitize=null become very noisy. GCC started to complain about things like &a->b, where 'a' is NULL pointer. There is no NULL dereference, we just calculate address to struct member. It's technically undefined behavior so UBSAN is correct to report it. But as long as there is no real NULL-dereference, I think, we should be fine. -fno-delete-null-pointer-checks compiler flag should protect us from any consequences. So let's just no use -fsanitize=null as it's not useful for us. If there is a real NULL-deref we will see crash. Even if userspace mapped something at NULL (root can do this), with things like SMAP should catch the issue. Link: http://lkml.kernel.org/r/20180802153209.813-1-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-10	x86/mm/pti: Move user W+X check into pti_finalize()	Joerg Roedel	3	-5/+10
	The user page-table gets the updated kernel mappings in pti_finalize(), which runs after the RO+X permissions got applied to the kernel page-table in mark_readonly(). But with CONFIG_DEBUG_WX enabled, the user page-table is already checked in mark_readonly() for insecure mappings. This causes false-positive warnings, because the user page-table did not get the updated mappings yet. Move the W+X check for the user page-table into pti_finalize() after it updated all required mappings. [ tglx: Folded !NX supported fix ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1533727000-9172-1-git-send-email-joro@8bytes.org
2018-08-10	x86/microcode: Allow late microcode loading with SMT disabled	Josh Poimboeuf	1	-4/+12
	The kernel unnecessarily prevents late microcode loading when SMT is disabled. It should be safe to allow it if all the primary threads are online. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2018-08-09	MIPS: Remove remnants of UASM_ISA	Paul Burton	2	-2/+0
	Commit 33679a50370d ("MIPS: uasm: Remove needless ISA abstraction") removed use of the MIPS_ISA preprocessor macro, but left a couple of unused definitions of it behind. Remove the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com>
2018-08-09	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	David S. Miller	13	-192/+141
	Overlapping changes in RXRPC, changing to ktime_get_seconds() whilst adding some tracepoints. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09	x86/relocs: Add __end_rodata_aligned to S_REL	Joerg Roedel	1	-0/+1
	This new symbol needs to be in the workaround-list for buggy binutils, otherwise the build with gcc-4.6 fails. Fixes: 39d668e04eda ('x86/mm/pti: Make pti_clone_kernel_text() compile on 32 bit') Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linux-Next Mailing List <linux-next@vger.kernel.org> Link: https://lkml.kernel.org/r/20180809094449.ddmnrkz7qkvo3j2x@suse.de
2018-08-09	Merge branch 'linus' of ↵	Linus Torvalds	8	-191/+101
	git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a performance regression in arm64 NEON crypto as well as a crash in x86 aegis/morus on unsupported CPUs" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: x86/aegis,morus - Fix and simplify CPUID checks crypto: arm64 - revert NEON yield for fast AEAD implementations
2018-08-09	ssb: Remove SSB_WARN_ON, SSB_BUG_ON and SSB_DEBUG	Michael Büsch	2	-2/+0
	Use the standard WARN_ON instead. If a small kernel is desired, WARN_ON can be disabled globally. Also remove SSB_DEBUG. Besides WARN_ON it only adds a tiny debug check. Include this check unconditionally. Signed-off-by: Michael Buesch <m@bues.ch> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-08-09	Merge branch 'asoc-4.19' into asoc-next	Mark Brown	6	-196/+123

2018-08-09	kbuild: remove deprecated host-progs variable	Masahiro Yamada	1	-2/+1
	The host-progs has been kept as an alias of hostprogs-y for a long time (at least since the beginning of Git era), with the clear prompt: Usage of host-progs is deprecated. Please replace with hostprogs-y! Enough time for the migration has passed. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Max Filippov <jcmvbkbc@gmail.com>
2018-08-09	arm64 / ACPI: clean the additional checks before calling ghes_notify_sea()	Dongjiu Geng	1	-6/+1
	In order to remove the additional check before calling the ghes_notify_sea(), make stub definition when !CONFIG_ACPI_APEI_SEA. After this cleanup, we can simply call the ghes_notify_sea() to let APEI driver handle the SEA notification. Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-09	s390/mm: fix addressing exception after suspend/resume	Gerald Schaefer	1	-1/+1
	Commit c9b5ad546e7d "s390/mm: tag normal pages vs pages used in page tables" accidentally changed the logic in arch_set_page_states(), which is used by the suspend/resume code. set_page_stable(page, order) was changed to set_page_stable_dat(page, 0). After this, only the first page of higher order pages will be set to stable, and a write to one of the unstable pages will result in an addressing exception. Fix this by using "order" again, instead of "0". Fixes: c9b5ad546e7d ("s390/mm: tag normal pages vs pages used in page tables") Cc: stable@vger.kernel.org # 4.14+ Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-08-08	x86/mm/kmmio: Make the tracer robust against L1TF	Andi Kleen	1	-10/+15
	The mmio tracer sets io mapping PTEs and PMDs to non present when enabled without inverting the address bits, which makes the PTE entry vulnerable for L1TF. Make it use the right low level macros to actually invert the address bits to protect against L1TF. In principle this could be avoided because MMIO tracing is not likely to be enabled on production machines, but the fix is straigt forward and for consistency sake it's better to get rid of the open coded PTE manipulation. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-08	parisc: Define mb() and add memory barriers to assembler unlock sequences	John David Anglin	4	-0/+39
	For years I thought all parisc machines executed loads and stores in order. However, Jeff Law recently indicated on gcc-patches that this is not correct. There are various degrees of out-of-order execution all the way back to the PA7xxx processor series (hit-under-miss). The PA8xxx series has full out-of-order execution for both integer operations, and loads and stores. This is described in the following article: http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml For this reason, we need to define mb() and to insert a memory barrier before the store unlocking spinlocks. This ensures that all memory accesses are complete prior to unlocking. The ldcw instruction performs the same function on entry. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-08	parisc: Enable CONFIG_MLONGCALLS by default	Helge Deller	1	-1/+1
	Enable the -mlong-calls compiler option by default, because otherwise in most cases linking the vmlinux binary fails due to truncations of R_PARISC_PCREL22F relocations. This fixes building the 64-bit defconfig. Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Helge Deller <deller@gmx.de>
2018-08-08	MIPS: netlogic: xlr: Remove erroneous check in nlm_fmn_send()	Paul Burton	1	-2/+0
	In nlm_fmn_send() we have a loop which attempts to send a message multiple times in order to handle the transient failure condition of a lack of available credit. When examining the status register to detect the failure we check for a condition that can never be true, which falls foul of gcc 8's -Wtautological-compare: In file included from arch/mips/netlogic/common/irq.c:65: ./arch/mips/include/asm/netlogic/xlr/fmn.h: In function 'nlm_fmn_send': ./arch/mips/include/asm/netlogic/xlr/fmn.h:304:22: error: bitwise comparison always evaluates to false [-Werror=tautological-compare] if ((status & 0x2) == 1) ^~ If the path taken if this condition were true all we do is print a message to the kernel console. Since failures seem somewhat expected here (making the console message questionable anyway) and the condition has clearly never evaluated true we simply remove it, rather than attempting to fix it to check status correctly. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20174/ Cc: Ganesan Ramalingam <ganesanr@broadcom.com> Cc: James Hogan <jhogan@kernel.org> Cc: Jayachandran C <jnair@caviumnetworks.com> Cc: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org
2018-08-08	arm64: alternative: Use true and false for boolean values	Gustavo A. R. Silva	1	-2/+2
	Return statements in functions returning bool should use true or false instead of an integer value. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-08-08	x86/mm/pat: Make set_memory_np() L1TF safe	Andi Kleen	1	-4/+4
	set_memory_np() is used to mark kernel mappings not present, but it has it's own open coded mechanism which does not have the L1TF protection of inverting the address bits. Replace the open coded PTE manipulation with the L1TF protecting low level PTE routines. Passes the CPA self test. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-08	x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert	Andi Kleen	1	-10/+12
	Some cases in THP like: - MADV_FREE - mprotect - split mark the PMD non present for temporarily to prevent races. The window for an L1TF attack in these contexts is very small, but it wants to be fixed for correctness sake. Use the proper low level functions for pmd/pud_mknotpresent() to address this. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-08	x86/speculation/l1tf: Invert all not present mappings	Andi Kleen	1	-1/+1
	For kernel mappings PAGE_PROTNONE is not necessarily set for a non present mapping, but the inversion logic explicitely checks for !PRESENT and PROT_NONE. Remove the PROT_NONE check and make the inversion unconditional for all not present mappings. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-07	MIPS: VDSO: Force link endianness	Paul Burton	1	-0/+1
	When building the VDSO with clang it appears to invoke ld without specifying endianness, even though clang itself was provided with a -EB or -EL flag. This results in the build failing due to a mismatch between the objects that are the input to ld, and the output it is attempting to create: VDSO arch/mips/vdso/vdso.so.dbg.raw mips-linux-ld: arch/mips/vdso/elf.o: compiled for a big endian system and target is little endian mips-linux-ld: arch/mips/vdso/elf.o: endianness incompatible with that of the selected emulation mips-linux-ld: failed to merge target specific data of file arch/mips/vdso/elf.o ... Work around this problem by explicitly specifying the link endianness using -Wl,-EB or -Wl,-EL when -EB or -EL are part of KBUILD_CFLAGS. This resolves the build failure when using clang, and doesn't have any negative effect on gcc. Signed-off-by: Paul Burton <paul.burton@mips.com>
2018-08-07	MIPS: Always specify -EB or -EL when using clang	Paul Burton	1	-0/+10
	When building using clang, always specify -EB or -EL in order to ensure we target the desired endianness. Since clang cross compiles using a single compiler build with multiple targets, our -dumpmachine tests which don't specify clang's --target argument check output based upon the build machine rather than the machine our build will target. This means our detection of whether to specify -EB fails miserably & we never do. Providing the endianness flag unconditionally for clang resolves this issue & simplifies the clang path somewhat. Signed-off-by: Paul Burton <paul.burton@mips.com>
2018-08-07	x86/mm/pti: Clone kernel-image on PTE level for 32 bit	Joerg Roedel	1	-41/+99
	On 32 bit the kernel sections are not huge-page aligned. When we clone them on PMD-level we unevitably map some areas that are normal kernel memory and may contain secrets to user-space. To prevent that we need to clone the kernel-image on PTE-level for 32 bit. Also make the page-table cloning code more general so that it can handle PMD and PTE level cloning. This can be generalized further in the future to also handle clones on the P4D-level. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1533637471-30953-4-git-send-email-joro@8bytes.org
2018-08-07	x86/mm/pti: Don't clear permissions in pti_clone_pmd()	Joerg Roedel	1	-6/+5
	The function sets the global-bit on cloned PMD entries, which only makes sense when the permissions are identical between the user and the kernel page-table. Further, only write-permissions are cleared for entry-text and kernel-text sections, which are not writeable at the end of the boot process. The reason why this RW clearing exists is that in the early PTI implementations the cloned kernel areas were set up during early boot before the kernel text is set to read only and not touched afterwards. This is not longer true. The cloned areas are still set up early to get the entry code working for interrupts and other things, but after the kernel text has been set RO the clone is repeated which copies the RO PMD/PTEs over to the user visible clone. That means the initial clearing of the writable bit can be avoided. [ tglx: Amended changelog ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1533637471-30953-3-git-send-email-joro@8bytes.org
2018-08-07	x86/paravirt: Fix spectre-v2 mitigations for paravirt guests	Peter Zijlstra	1	-4/+10
	Nadav reported that on guests we're failing to rewrite the indirect calls to CALLEE_SAVE paravirt functions. In particular the pv_queued_spin_unlock() call is left unpatched and that is all over the place. This obviously wrecks Spectre-v2 mitigation (for paravirt guests) which relies on not actually having indirect calls around. The reason is an incorrect clobber test in paravirt_patch_call(); this function rewrites an indirect call with a direct call to the _SAME_ function, there is no possible way the clobbers can be different because of this. Therefore remove this clobber check. Also put WARNs on the other patch failure case (not enough room for the instruction) which I've not seen trigger in my (limited) testing. Three live kernel image disassemblies for lock_sock_nested (as a small function that illustrates the problem nicely). PRE is the current situation for guests, POST is with this patch applied and NATIVE is with or without the patch for !guests. PRE: (gdb) disassemble lock_sock_nested Dump of assembler code for function lock_sock_nested: 0xffffffff817be970 <+0>: push %rbp 0xffffffff817be971 <+1>: mov %rdi,%rbp 0xffffffff817be974 <+4>: push %rbx 0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx 0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched> 0xffffffff817be981 <+17>: mov %rbx,%rdi 0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh> 0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax 0xffffffff817be98f <+31>: test %eax,%eax 0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74> 0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp) 0xffffffff817be99d <+45>: mov %rbx,%rdi 0xffffffff817be9a0 <+48>: callq *0xffffffff822299e8 0xffffffff817be9a7 <+55>: pop %rbx 0xffffffff817be9a8 <+56>: pop %rbp 0xffffffff817be9a9 <+57>: mov $0x200,%esi 0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi 0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063ae0 <__local_bh_enable_ip> 0xffffffff817be9ba <+74>: mov %rbp,%rdi 0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock> 0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35> End of assembler dump. POST: (gdb) disassemble lock_sock_nested Dump of assembler code for function lock_sock_nested: 0xffffffff817be970 <+0>: push %rbp 0xffffffff817be971 <+1>: mov %rdi,%rbp 0xffffffff817be974 <+4>: push %rbx 0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx 0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched> 0xffffffff817be981 <+17>: mov %rbx,%rdi 0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh> 0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax 0xffffffff817be98f <+31>: test %eax,%eax 0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74> 0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp) 0xffffffff817be99d <+45>: mov %rbx,%rdi 0xffffffff817be9a0 <+48>: callq 0xffffffff810a0c20 <__raw_callee_save___pv_queued_spin_unlock> 0xffffffff817be9a5 <+53>: xchg %ax,%ax 0xffffffff817be9a7 <+55>: pop %rbx 0xffffffff817be9a8 <+56>: pop %rbp 0xffffffff817be9a9 <+57>: mov $0x200,%esi 0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi 0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063aa0 <__local_bh_enable_ip> 0xffffffff817be9ba <+74>: mov %rbp,%rdi 0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock> 0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35> End of assembler dump. NATIVE: (gdb) disassemble lock_sock_nested Dump of assembler code for function lock_sock_nested: 0xffffffff817be970 <+0>: push %rbp 0xffffffff817be971 <+1>: mov %rdi,%rbp 0xffffffff817be974 <+4>: push %rbx 0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx 0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched> 0xffffffff817be981 <+17>: mov %rbx,%rdi 0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh> 0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax 0xffffffff817be98f <+31>: test %eax,%eax 0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74> 0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp) 0xffffffff817be99d <+45>: mov %rbx,%rdi 0xffffffff817be9a0 <+48>: movb $0x0,(%rdi) 0xffffffff817be9a3 <+51>: nopl 0x0(%rax) 0xffffffff817be9a7 <+55>: pop %rbx 0xffffffff817be9a8 <+56>: pop %rbp 0xffffffff817be9a9 <+57>: mov $0x200,%esi 0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi 0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063ae0 <__local_bh_enable_ip> 0xffffffff817be9ba <+74>: mov %rbp,%rdi 0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock> 0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35> End of assembler dump. Fixes: 63f70270ccd9 ("[PATCH] i386: PARAVIRT: add common patching machinery") Fixes: 3010a0663fd9 ("x86/paravirt, objtool: Annotate indirect calls") Reported-by: Nadav Amit <namit@vmware.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: stable@vger.kernel.org
2018-08-07	MIPS: Use dins to simplify __write_64bit_c0_split()	Paul Burton	1	-1/+10
	The code in __write_64bit_c0_split() is used by MIPS32 kernels running on MIPS64 CPUs to write a 64-bit value to a 64-bit coprocessor 0 register using a single 64-bit dmtc0 instruction. It does this by combining the 2x 32-bit registers used to hold the 64-bit value into a single register, which in the existing code involves three steps: 1) Zero extend register A which holds bits 31:0 of our data, since it may have previously held a sign-extended value. 2) Shift register B which holds bits 63:32 of our data in bits 31:0 left by 32 bits, such that the bits forming our data are in the position they'll be in the final 64-bit value & bits 31:0 of the register are zero. 3) Or the two registers together to form the 64-bit value in one 64-bit register. From MIPS r2 onwards we have a dins instruction which can effectively perform all 3 of those steps using a single instruction. Add a path for MIPS r2 & beyond which uses dins to take bits 31:0 from register B & insert them into bits 63:32 of register A, giving us our full 64-bit value in register A with one instruction. Since we know that MIPS r2 & above support the sel field for the dmtc0 instruction, we don't bother special casing sel==0. Omiting the sel field would assemble to exactly the same instruction as when we explicitly specify that it equals zero. Signed-off-by: Paul Burton <paul.burton@mips.com>
2018-08-07	MIPS: Use read-write output operand in __write_64bit_c0_split()	Paul Burton	1	-9/+7
	Commit c22c80431055 ("MIPS: Fix input modify in __write_64bit_c0_split()") modified __write_64bit_c0_split() constraints such that we have both an input & an output which we hope to assign to the same registers, and modify the output rather than incorrectly clobbering an input. The way in which we use both an output & an input parameter with the input constrained to share the output registers is a little convoluted & also problematic for clang, which complains if the input & output values have different widths. For example: In file included from kernel/fork.c:98: ./arch/mips/include/asm/mmu_context.h:149:19: error: unsupported inline asm: input with type 'unsigned long' matching output with type 'unsigned long long' write_c0_entryhi(cpu_asid(cpu, next)); ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~ ./arch/mips/include/asm/mmu_context.h:93:2: note: expanded from macro 'cpu_asid' (cpu_context((cpu), (mm)) & cpu_asid_mask(&cpu_data[cpu])) ^ ./arch/mips/include/asm/mipsregs.h:1617:65: note: expanded from macro 'write_c0_entryhi' #define write_c0_entryhi(val) __write_ulong_c0_register($10, 0, val) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~ ./arch/mips/include/asm/mipsregs.h:1430:39: note: expanded from macro '__write_ulong_c0_register' __write_64bit_c0_register(reg, sel, val); \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~ ./arch/mips/include/asm/mipsregs.h:1400:41: note: expanded from macro '__write_64bit_c0_register' __write_64bit_c0_split(register, sel, value); \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ ./arch/mips/include/asm/mipsregs.h:1498:13: note: expanded from macro '__write_64bit_c0_split' : "r,0" (val)); \ ^~~ We can both fix this build failure & simplify the code somewhat by assigning the __tmp variable with the input value in C prior to our inline assembly, and then using a single read-write output operand (ie. a constraint beginning with +) to provide this value to our assembly. Signed-off-by: Paul Burton <paul.burton@mips.com>
2018-08-07	x86/mm/pti: Fix 32 bit PCID check	Joerg Roedel	1	-1/+1
	The check uses the wrong operator and causes false positive warnings in the kernel log on some systems. Fixes: 5e8105950a8b3 ('x86/mm/pti: Add Warning when booting on a PCID capable CPU') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1533637471-30953-2-git-send-email-joro@8bytes.org
2018-08-07	xen: don't use privcmd_call() from xen_mc_flush()	Juergen Gross	2	-9/+22
	Using privcmd_call() for a singleton multicall seems to be wrong, as privcmd_call() is using stac()/clac() to enable hypervisor access to Linux user space. Even if currently not a problem (pv domains can't use SMAP while HVM and PVH domains can't use multicalls) things might change when PVH dom0 support is added to the kernel. Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-08-07	um: clean up archheaders recipe	Masahiro Yamada	1	-7/+1
	Now that '%asm-generic' is added to no-dot-config-targets, 'make asm-generic' does not include the kernel configuration. You can simply do 'make asm-generic' in the recursed top Makefile without bothering syncconfig. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Richard Weinberger <richard@nod.at>
2018-08-07	um: fix parallel building with O= option	Masahiro Yamada	1	-2/+1
	Randy Dunlap reports UML occasionally fails to build with -j<N> and O=<builddir> options. make[1]: Entering directory '/home/rdunlap/mmotm-2018-0802-1529/UM64' UPD include/generated/uapi/linux/version.h WRAP arch/x86/include/generated/asm/dma-contiguous.h WRAP arch/x86/include/generated/asm/export.h WRAP arch/x86/include/generated/asm/early_ioremap.h WRAP arch/x86/include/generated/asm/mcs_spinlock.h WRAP arch/x86/include/generated/asm/mm-arch-hooks.h WRAP arch/x86/include/generated/uapi/asm/bpf_perf_event.h WRAP arch/x86/include/generated/uapi/asm/poll.h GEN ./Makefile make[2]: * No rule to make target 'archheaders'. Stop. arch/um/Makefile:119: recipe for target 'archheaders' failed make[1]: * [archheaders] Error 2 make[1]: * Waiting for unfinished jobs.... UPD include/config/kernel.release make[1]: * wait: No child processes. Stop. Makefile:146: recipe for target 'sub-make' failed make: *** [sub-make] Error 2 The cause of the problem is the use of '$(MAKE) KBUILD_SRC=', which recurses to the top Makefile via the $(objtree)/Makefile generated by scripts/mkmakefile. When you run "make -j<N> O=<builddir> ARCH=um", Make can execute 'archheaders' and 'outputmakefile' targets simultaneously because there is no dependency between them. If it happens, $(Q)$(MAKE) KBUILD_SRC= ARCH=$(HEADER_ARCH) archheaders ... tries to run $(objtree)/Makefile that is being updated. The correct way for the recursion is $(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(HEADER_ARCH) archheaders ..., which does not rely on the generated Makefile. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Richard Weinberger <richard@nod.at>
2018-08-07	s390: fix br_r1_trampoline for machines without exrl	Martin Schwidefsky	1	-2/+0
	For machines without the exrl instruction the BFP jit generates code that uses an "br %r1" instruction located in the lowcore page. Unfortunately there is a cut & paste error that puts an additional "larl %r1,.+14" instruction in the code that clobbers the branch target address in %r1. Remove the larl instruction. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: de5cb6eb51 ("s390: use expoline thunks in the BPF JIT") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-08-07	s390/lib: use expoline for all bcr instructions	Martin Schwidefsky	1	-6/+10
	The memove, memset, memcpy, __memset16, __memset32 and __memset64 function have an additional indirect return branch in form of a "bzr" instruction. These need to use expolines as well. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: 97489e0663 ("s390/lib: use expoline for indirect branches") Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-08-07	cpu/hotplug: Fix SMT supported evaluation	Thomas Gleixner	1	-1/+1
	Josh reported that the late SMT evaluation in cpu_smt_state_init() sets cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied on the kernel command line as it cannot differentiate between SMT disabled by BIOS and SMT soft disable via 'nosmt'. That wreckages the state and makes the sysfs interface unusable. Rework this so that during bringup of the non boot CPUs the availability of SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a 'primary' thread then set the local cpu_smt_available marker and evaluate this explicitely right after the initial SMP bringup has finished. SMT evaulation on x86 is a trainwreck as the firmware has all the information _before_ booting the kernel, but there is no interface to query it. Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-07	crypto: arm64/ghash-ce - implement 4-way aggregation	Ard Biesheuvel	2	-51/+142
	Enhance the GHASH implementation that uses 64-bit polynomial multiplication by adding support for 4-way aggregation. This more than doubles the performance, from 2.4 cycles per byte to 1.1 cpb on Cortex-A53. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-08-07	crypto: arm64/ghash-ce - replace NEON yield check with block limit	Ard Biesheuvel	2	-32/+23
	Checking the TIF_NEED_RESCHED flag is disproportionately costly on cores with fast crypto instructions and comparatively slow memory accesses. On algorithms such as GHASH, which executes at ~1 cycle per byte on cores that implement support for 64 bit polynomial multiplication, there is really no need to check the TIF_NEED_RESCHED particularly often, and so we can remove the NEON yield check from the assembler routines. However, unlike the AEAD or skcipher APIs, the shash/ahash APIs take arbitrary input lengths, and so there needs to be some sanity check to ensure that we don't hog the CPU for excessive amounts of time. So let's simply cap the maximum input size that is processed in one go to 64 KB. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-08-07	crypto: x86/aegis,morus - Fix and simplify CPUID checks	Ondrej Mosnacek	6	-45/+21
	It turns out I had misunderstood how the x86_match_cpu() function works. It evaluates a logical OR of the matching conditions, not logical AND. This caused the CPU feature checks for AEGIS to pass even if only SSE2 (but not AES-NI) was supported (or vice versa), leading to potential crashes if something tried to use the registered algs. This patch switches the checks to a simpler method that is used e.g. in the Camellia x86 code. The patch also removes the MODULE_DEVICE_TABLE declarations which actually seem to cause the modules to be auto-loaded at boot, which is not desired. The crypto API on-demand module loading is sufficient. Fixes: 1d373d4e8e15 ("crypto: x86 - Add optimized AEGIS implementations") Fixes: 6ecc9d9ff91f ("crypto: x86 - Add optimized MORUS implementations") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-08-07	crypto: arm64/aes-ce-gcm - don't reload key schedule if avoidable	Ard Biesheuvel	2	-41/+49
	Squeeze out another 5% of performance by minimizing the number of invocations of kernel_neon_begin()/kernel_neon_end() on the common path, which also allows some reloads of the key schedule to be optimized away. The resulting code runs at 2.3 cycles per byte on a Cortex-A53. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-08-07	crypto: arm64/aes-ce-gcm - implement 2-way aggregation	Ard Biesheuvel	2	-68/+52
	Implement a faster version of the GHASH transform which amortizes the reduction modulo the characteristic polynomial across two input blocks at a time. On a Cortex-A53, the gcm(aes) performance increases 24%, from 3.0 cycles per byte to 2.4 cpb for large input sizes. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>