summaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2018-05-31ARM: bugs: prepare processor bug infrastructureRussell King3-2/+12
Prepare the processor bug infrastructure so that it can be expanded to check for per-processor bugs. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Boot-tested-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Tony Lindgren <tony@atomide.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-31ARM: add more CPU part numbers for Cortex and Brahma B15 CPUsRussell King1-0/+8
Add CPU part numbers for Cortex A53, A57, A72, A73, A75 and the Broadcom Brahma B15 CPU. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Boot-tested-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Tony Lindgren <tony@atomide.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-31Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-8/+17
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two fixlets" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/hwbp: Simplify the perf-hwbp code, fix documentation perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs
2018-03-31Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds5-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two UV platform fixes, and a kbuild fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/UV: Fix critical UV MMR address error x86/platform/uv/BAU: Add APIC idt entry x86/purgatory: Avoid creating stray .<pid>.d files, remove -MD from KBUILD_CFLAGS
2018-03-31Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PTI fixes from Ingo Molnar: "Two fixes: a relatively simple objtool fix that makes Clang built kernels work with ORC debug info, plus an alternatives macro fix" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternatives: Fixup alternative_call_2 objtool: Add Clang support
2018-03-30Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds3-24/+25
Pull KVM fixes from Radim Krčmář: "PPC: - Fix a bug causing occasional machine check exceptions on POWER8 hosts (introduced in 4.16-rc1) x86: - Fix a guest crashing regression with nested VMX and restricted guest (introduced in 4.16-rc1) - Fix dependency check for pv tlb flush (the wrong dependency that effectively disabled the feature was added in 4.16-rc4, the original feature in 4.16-rc1, so it got decent testing)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix pv tlb flush dependencies KVM: nVMX: sync vmcs02 segment regs prior to vmx_set_cr0 KVM: PPC: Book3S HV: Fix duplication of host SLB entries
2018-03-28Merge tag 'powerpc-4.16-6' of ↵Linus Torvalds13-90/+154
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 4.16. Apologies if this is a bit big at rc7, but they're all reasonably important fixes. None are actually for new code, so they aren't indicative of 4.16 being in bad shape from our point of view. - Fix missing AT_BASE_PLATFORM (in auxv) when we're using a new firmware interface for describing CPU features. - Fix lost pending interrupts due to a race in our interrupt soft-masking code. - A workaround for a nest MMU bug with TLB invalidations on Power9. - A workaround for broadcast TLB invalidations on Power9. - Fix a bug in our instruction SLB miss handler, when handling bad addresses (eg. >= TASK_SIZE), which could corrupt non-volatile user GPRs. Thanks to: Aneesh Kumar K.V, Balbir Singh, Benjamin Herrenschmidt, Nicholas Piggin" * tag 'powerpc-4.16-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRs powerpc/mm: Fixup tlbie vs store ordering issue on POWER9 powerpc/mm/radix: Move the functions that does the actual tlbie closer powerpc/mm/radix: Remove unused code powerpc/mm: Workaround Nest MMU bug with TLB invalidations powerpc/mm: Add tracking of the number of coprocessors using a context powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened powerpc/64s: Fix NULL AT_BASE_PLATFORM when using DT CPU features
2018-03-28Merge tag 'armsoc-fixes' of ↵Linus Torvalds12-39/+123
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "Here are are a couple of last-minute fixes for 4.16, mostly for regressions. As usual, the majory are device tree changes: - USB 3 support on rk3399 didn't work and is being reverted for now - One fix for an old suspend/resume bug on rk3399 - A few regulator related fixes on Banana Pi M2, and on imx7d-sdb - A boot regression fix for all Aspeed SoCs failing to find their memory - One more dtc warning fix The other changes are: - A few updates to the MAINTAINERS file - A revert for an incorrect orion5x cleanup - Two power management fixes for OMAP" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: OMAP: Fix SRAM W+X mapping ARM: dts: aspeed: Add default memory node mailmap: Update email address for Gregory CLEMENT ARM: davinci: fix the GPIO lookup for omapl138-hawk MAINTAINERS: Update Tegra IOMMU maintainer ARM: dts: imx7d-sdb: Fix regulator-usb-otg2-vbus node name ARM: ux500: Fix PMU IRQ regression ARM: dts: rockchip: Add missing #sound-dai-cells on rk3288 Revert "arm64: dts: rockchip: add usb3-phy otg-port support for rk3399" arm64: dts: rockchip: Fix rk3399-gru-* s2r (pinctrl hogs, wifi reset) ARM: OMAP: Fix dmtimer init for omap1 MAINTAINERS: update email address for Maxime Ripard ARM: dts: sun6i: a31s: bpi-m2: add missing regulators ARM: dts: sun6i: a31s: bpi-m2: improve pmic properties
2018-03-28x86/platform/UV: Fix critical UV MMR address errormike.travis@hpe.com1-1/+1
A critical error was found testing the fixed UV4 HUB in that an MMR address was found to be incorrect. This causes the virtual address space for accessing the MMIOH1 region to be allocated with the incorrect size. Fixes: 673aa20c55a1 ("x86/platform/UV: Update uv_mmrs.h to prepare for UV4A fixes") Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Andrew Banman <andrew.banman@hpe.com> Link: https://lkml.kernel.org/r/20180328174011.041801248@stormcage.americas.sgi.com
2018-03-28KVM: x86: Fix pv tlb flush dependenciesWanpeng Li1-2/+2
PV TLB FLUSH can only be turned on when steal time is enabled. The condition got reversed during conflict resolution. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Fixes: 4f2f61fc5071 ("KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled") [Rebased on top of kvm/master and reworded the commit message. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-28x86/platform/uv/BAU: Add APIC idt entryAndrew Banman3-2/+4
BAU uses the old alloc_initr_gate90 method to setup its interrupt. This fails silently as the BAU vector is in the range of APIC vectors that are registered to the spurious interrupt handler. As a consequence BAU broadcasts are not handled, and the broadcast source CPU hangs. Update BAU to use new idt structure. Fixes: dc20b2d52653 ("x86/idt: Move interrupt gate initialization to IDT code") Signed-off-by: Andrew Banman <abanman@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Mike Travis <mike.travis@hpe.com> Cc: Dimitri Sivanich <sivanich@hpe.com> Cc: Russ Anderson <rja@hpe.com> Cc: stable@vger.kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/1522188546-196177-1-git-send-email-abanman@hpe.com
2018-03-27Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds5-11/+12
Pull ARM fixes from Russell King: "A small number of small fixes for ARM, mostly for some build issues. One fix for a regression caused by the cpu hotplug conversion from a few kernel versions ago" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8750/1: deflate_xip_data.sh: minor fixes ARM: 8748/1: mm: Define vdso_start, vdso_end as array ARM: 8747/1: make CONFIG_DEBUG_WX depend on MMU ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[]
2018-03-27Merge tag 'sunxi-fixes-for-4.16' of ↵Arnd Bergmann1-3/+60
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes Pull "Allwinner Fixes for 4.16" from Maxime Ripard: The first and second patches fix the regulator support for the Bananapi M2 board. The last one updates my email address in MAINTAINERS. * tag 'sunxi-fixes-for-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: MAINTAINERS: update email address for Maxime Ripard ARM: dts: sun6i: a31s: bpi-m2: add missing regulators ARM: dts: sun6i: a31s: bpi-m2: improve pmic properties
2018-03-27Merge tag 'omap-for-v4.16/sram-fix-signed' of ↵Arnd Bergmann3-16/+38
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Pull "Two fixes for omap variants for v4.16-rc cycle" from Tony Lindgren: Fix insecure W+X mapping warning for SRAM for omaps that don't yet use drivers/misc/*sram*.c code. An earlier attempt at fixing this turned out to cause problems with PM on omap3, this version works with PM on omap3. Also fix dmtimer probe for omap16xx devices that was noticed with the pending dmtimer move to drivers. It seems this has been broken for a while and is a non-critical for booting. It is needed for PM on omap16xx though. * tag 'omap-for-v4.16/sram-fix-signed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP: Fix SRAM W+X mapping ARM: OMAP: Fix dmtimer init for omap1
2018-03-27x86/alternatives: Fixup alternative_call_2Alexey Dobriyan1-3/+1
The following pattern fails to compile while the same pattern with alternative_call() does: if (...) alternative_call_2(...); else alternative_call_2(...); as it expands into if (...) { }; <=== else { }; Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20180114120504.GA11368@avx2
2018-03-27perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUsStephane Eranian1-8/+17
this patch fix a bug in how the pebs->real_ip is handled in the PEBS handler. real_ip only exists in Haswell and later processor. It is actually the eventing IP, i.e., where the event occurred. As opposed to the pebs->ip which is the PEBS interrupt IP which is always off by one. The problem is that the real_ip just like the IP needs to be fixed up because PEBS does not record all the machine state registers, and in particular the code segement (cs). This is why we have the set_linear_ip() function. The problem was that set_linear_ip() was only used on the pebs->ip and not the pebs->real_ip. We have profiles which ran into invalid callstacks because of this. Here is an example: ..... 0: ffffffffffffff80 recent entry, marker kernel v ..... 1: 000000000040044d <= user address in kernel space! ..... 2: fffffffffffffe00 marker enter user v ..... 3: 000000000040044d ..... 4: 00000000004004b6 oldest entry Debugging output in get_perf_callchain(): [ 857.769909] CALLCHAIN: CPU8 ip=40044d regs->cs=10 user_mode(regs)=0 The problem is that the kernel entry in 1: points to a user level address. How can that be? The reason is that with PEBS sampling the instruction that caused the event to occur and the instruction where the CPU was when the interrupt was posted may be far apart. And sometime during that time window, the privilege level may change. This happens, for instance, when the PEBS sample is taken close to a kernel entry point. Here PEBS, eventing IP (real_ip) captured a user level instruction. But by the time the PMU interrupt fired, the processor had already entered kernel space. This is why the debug output shows a user address with user_mode() false. The problem comes from PEBS not recording the code segment (cs) register. The register is used in x86_64 to determine if executing in kernel vs user space. This is okay because the kernel has a software workaround called set_linear_ip(). But the issue in setup_pebs_sample_data() is that set_linear_ip() is never called on the real_ip value when it is available (Haswell and later) and precise_ip > 1. This patch fixes this problem and eliminates the callchain discrepancy. The patch restructures the code around set_linear_ip() to minimize the number of times the IP has to be set. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: kan.liang@intel.com Link: http://lkml.kernel.org/r/1521788507-10231-1-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-26powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRsNicholas Piggin1-1/+1
The SLB bad address handler's trap number fixup does not preserve the low bit that indicates nonvolatile GPRs have not been saved. This leads save_nvgprs to skip saving them, and subsequent functions and return from interrupt will think they are saved. This causes kernel branch-to-garbage debugging to not have correct registers, can also cause userspace to have its registers clobbered after a segfault. Fixes: f0f558b131db ("powerpc/mm: Preserve CFAR value on SLB miss caused by access to bogus address") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-25Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds13-95/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 and PTI fixes from Ingo Molnar: "Misc fixes: - fix EFI pagetables freeing - fix vsyscall pagetable setting on Xen PV guests - remove ancient CONFIG_X86_PPRO_FENCE=y - x86 is TSO again - fix two binutils (ld) development version related incompatibilities - clean up breakpoint handling - fix an x86 self-test" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry/64: Don't use IST entry for #BP stack x86/efi: Free efi_pgd with free_pages() x86/vsyscall/64: Use proper accessor to update P4D entry x86/cpu: Remove the CONFIG_X86_PPRO_FENCE=y quirk x86/boot/64: Verify alignment of the LOAD segment x86/build/64: Force the linker to use 2MB page size selftests/x86/ptrace_syscall: Fix for yet more glibc interference
2018-03-25Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds5-28/+33
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc kernel side fixes. Generic: - cgroup events counting fix x86: - Intel PMU truncated-parameter fix - RDPMC fix - API naming fix/rename - uncore driver big-hardware PCI enumeration fix - uncore driver filter constraint fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/cgroup: Fix child event counting bug perf/x86/intel/uncore: Fix multi-domain PCI CHA enumeration bug on Skylake servers perf/x86/intel: Rename confusing 'freerunning PEBS' API and implementation to 'large PEBS' perf/x86/intel/uncore: Add missing filter constraint for SKX CHA event perf/x86/intel: Don't accidentally clear high bits in bdw_limit_period() perf/x86/intel: Disable userspace RDPMC usage for large PEBS
2018-03-25x86/purgatory: Avoid creating stray .<pid>.d files, remove -MD from ↵Sven Wegener1-1/+1
KBUILD_CFLAGS The kernel build system already takes care of generating the dependency files. Having the additional -MD in KBUILD_CFLAGS leads to stray .<pid>.d files in the build directory when we call the cc-option macro. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Link: http://lkml.kernel.org/r/alpine.LNX.2.21.1803242219380.30139@titan.int.lan.stealer.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-24ARM: 8750/1: deflate_xip_data.sh: minor fixesNicolas Pitre1-3/+3
Send nm complaints about broken pipe (when sed exits early) to /dev/null. All errors should be printed to stderr. Don't trap on normal exit so the trap can return an error code. Signed-off-by: Nicolas Pitre <nico@linaro.org> Tested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-03-24ARM: 8748/1: mm: Define vdso_start, vdso_end as arrayJinbum Park2-7/+7
Define vdso_start, vdso_end as array to avoid compile-time analysis error for the case of built with CONFIG_FORTIFY_SOURCE. and, since vdso_start, vdso_end are used in vdso.c only, move extern-declaration from vdso.h to vdso.c. If kernel is built with CONFIG_FORTIFY_SOURCE, compile-time error happens at this code. - if (memcmp(&vdso_start, "177ELF", 4)) The size of "&vdso_start" is recognized as 1 byte, but n is 4, So that compile-time error is reported. Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jinbum Park <jinb.park7@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-03-24ARM: 8747/1: make CONFIG_DEBUG_WX depend on MMUArnd Bergmann1-0/+1
Without CONFIG_MMU, this results in a build failure: ./arch/arm/include/asm/memory.h:92:23: error: initializer element is not constant #define VECTORS_BASE vectors_base arch/arm/mm/dump.c:32:4: note: in expansion of macro 'VECTORS_BASE' { VECTORS_BASE, "Vectors" }, arch/arm/mm/dump.c:71:11: error: 'L_PTE_USER' undeclared here (not in a function); did you mean 'VTIME_USER'? .mask = L_PTE_USER, ^~~~~~~~~~ Obviously the feature only makes sense with an MMU, so let's add the dependency here. Fixes: a8e53c151fe7 ("ARM: 8737/1: mm: dump: add checking for writable and executable") Acked-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-03-24ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[]Fabio Estevam1-1/+1
Commit 384b38b66947 ("ARM: 7873/1: vfp: clear vfp_current_hw_state for dying cpu") fixed the cpu dying notifier by clearing vfp_current_hw_state[]. However commit e5b61bafe704 ("arm: Convert VFP hotplug notifiers to state machine") incorrectly used the original vfp_force_reload() function in the cpu dying notifier. Fix it by going back to clearing vfp_current_hw_state[]. Fixes: e5b61bafe704 ("arm: Convert VFP hotplug notifiers to state machine") Cc: linux-stable <stable@vger.kernel.org> Reported-by: Kohji Okuno <okuno.kohji@jp.panasonic.com> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-03-23x86/entry/64: Don't use IST entry for #BP stackAndy Lutomirski3-10/+9
There's nothing IST-worthy about #BP/int3. We don't allow kprobes in the small handful of places in the kernel that run at CPL0 with an invalid stack, and 32-bit kernels have used normal interrupt gates for #BP forever. Furthermore, we don't allow kprobes in places that have usergs while in kernel mode, so "paranoid" is also unnecessary. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
2018-03-23x86/efi: Free efi_pgd with free_pages()Waiman Long1-1/+1
The efi_pgd is allocated as PGD_ALLOCATION_ORDER pages and therefore must also be freed as PGD_ALLOCATION_ORDER pages with free_pages(). Fixes: d9e9a6418065 ("x86/mm/pti: Allocate a separate user PGD") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1521746333-19593-1-git-send-email-longman@redhat.com
2018-03-23Merge tag 'mips_fixes_4.16_5' of ↵Linus Torvalds4-30/+27
git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips Pull MIPS fixes from James Hogan: "Another miscellaneous pile of MIPS fixes for 4.16: - lantiq: fixes for clocks and Amazon SE (4.14) - ralink: fix booting on MT7621 (4.5) - ralink: fix halt (3.9)" * tag 'mips_fixes_4.16_5' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips: MIPS: ralink: Fix booting on MT7621 MIPS: ralink: Remove ralink_halt() MIPS: lantiq: ase: Enable MFD_SYSCON MIPS: lantiq: Enable AHB Bus for USB MIPS: lantiq: Fix Danube USB clock
2018-03-23KVM: nVMX: sync vmcs02 segment regs prior to vmx_set_cr0Sean Christopherson1-5/+5
Segment registers must be synchronized prior to any code that may trigger a call to emulation_required()/guest_state_valid(), e.g. vmx_set_cr0(). Because preparing vmcs02 writes segmentation fields directly, i.e. doesn't use vmx_set_segment(), emulation_required will not be re-evaluated when synchronizing the segment registers, which can result in L0 incorrectly starting emulation of L2. Fixes: 8665c3f97320 ("KVM: nVMX: initialize descriptor cache fields in prepare_vmcs02_full") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> [Move all of prepare_vmcs02_full earlier, not just segment registers. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-23Merge tag 'kvm-ppc-fixes-4.16-3' of ↵Paolo Bonzini1-17/+18
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master PPC KVM fix - Fix a bug causing occasional machine check exceptions on POWER8 hosts, introduced in 4.16-rc1.
2018-03-23powerpc/mm: Fixup tlbie vs store ordering issue on POWER9Aneesh Kumar K.V7-2/+50
On POWER9, under some circumstances, a broadcast TLB invalidation might complete before all previous stores have drained, potentially allowing stale stores from becoming visible after the invalidation. This works around it by doubling up those TLB invalidations which was verified by HW to be sufficient to close the risk window. This will be documented in a yet-to-be-published errata. Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Enable the feature in the DT CPU features code for all Power9, rename the feature to CPU_FTR_P9_TLBIE_BUG per benh.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23powerpc/mm/radix: Move the functions that does the actual tlbie closerAneesh Kumar K.V1-32/+32
No functionality change. Just code movement to ease code changes later Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23powerpc/mm/radix: Remove unused codeAneesh Kumar K.V2-43/+0
These function are not used in the code. Remove them. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23powerpc/mm: Workaround Nest MMU bug with TLB invalidationsBenjamin Herrenschmidt1-7/+43
On POWER9 the Nest MMU may fail to invalidate some translations when doing a tlbie "by PID" or "by LPID" that is targeted at the TLB only and not the page walk cache. This works around it by forcing such invalidations to escalate to RIC=2 (full invalidation of TLB *and* PWC) when a coprocessor is in use for the context. Fixes: 03b8abedf4f4 ("cxl: Enable global TLBIs for cxl contexts") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Balbir Singh <bsingharora@gmail.com> [balbirs: fixed spelling and coding style to quiesce checkpatch.pl] Tested-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23powerpc/mm: Add tracking of the number of coprocessors using a contextBenjamin Herrenschmidt3-5/+17
Currently, when using coprocessors (which use the Nest MMU), we simply increment the active_cpu count to force all TLB invalidations to be come broadcast. Unfortunately, due to an errata in POWER9, we will need to know more specifically that coprocessors are in use. This maintains a separate copros counter in the MMU context for that purpose. NB. The commit mentioned in the fixes tag below is not at fault for the bug we're fixing in this commit and the next, but this fix applies on top the infrastructure it introduced. Fixes: 03b8abedf4f4 ("cxl: Enable global TLBIs for cxl contexts") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23KVM: PPC: Book3S HV: Fix duplication of host SLB entriesPaul Mackerras1-17/+18
Since commit 6964e6a4e489 ("KVM: PPC: Book3S HV: Do SLB load/unload with guest LPCR value loaded", 2018-01-11), we have been seeing occasional machine check interrupts on POWER8 systems when running KVM guests, due to SLB multihit errors. This turns out to be due to the guest exit code reloading the host SLB entries from the SLB shadow buffer when the SLB was not previously cleared in the guest entry path. This can happen because the path which skips from the guest entry code to the guest exit code without entering the guest now does the skip before the SLB is cleared and loaded with guest values, but the host values are loaded after the point in the guest exit path that we skip to. To fix this, we move the code that reloads the host SLB values up so that it occurs just before the point in the guest exit code (the label guest_bypass:) where we skip to from the guest entry path. Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Fixes: 6964e6a4e489 ("KVM: PPC: Book3S HV: Do SLB load/unload with guest LPCR value loaded") Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-03-22Merge branch 'akpm' (patches from Andrew)Linus Torvalds3-1/+58
Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, thp: do not cause memcg oom for thp mm/vmscan: wake up flushers for legacy cgroups too Revert "mm: page_alloc: skip over regions of invalid pfns where possible" mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink() mm/thp: do not wait for lock_page() in deferred_split_scan() mm/khugepaged.c: convert VM_BUG_ON() to collapse fail x86/mm: implement free pmd/pte page interfaces mm/vmalloc: add interfaces to free unmapped page table h8300: remove extraneous __BIG_ENDIAN definition hugetlbfs: check for pgoff value overflow lockdep: fix fs_reclaim warning MAINTAINERS: update Mark Fasheh's e-mail mm/mempolicy.c: avoid use uninitialized preferred_node
2018-03-22Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds1-32/+28
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "Two regression fixes, two bug fixes for older issues, two fixes for new functionality added this cycle that have userspace ABI concerns, and a small cleanup. These have appeared in a linux-next release and have a build success report from the 0day robot. * The 4.16 rework of altmap handling led to some configurations leaking page table allocations due to freeing from the altmap reservation rather than the page allocator. The impact without the fix is leaked memory and a WARN() message when tearing down libnvdimm namespaces. The rework also missed a place where error handling code needed to be removed that can lead to a crash if devm_memremap_pages() fails. * acpi_map_pxm_to_node() had a latent bug whereby it could misidentify the closest online node to a given proximity domain. * Block integrity handling was reworked several kernels back to allow calling add_disk() after setting up the integrity profile. The nd_btt and nd_blk drivers are just now catching up to fix automatic partition detection at driver load time. * The new peristence_domain attribute, a platform indicator of whether cpu caches are powerfail protected for example, is meant to be a single value enum and not a set of flags. This oversight was caught while reviewing new userspace code in libndctl to communicate the attribute. Fix this new enabling up so that we are not stuck with an unwanted userspace ABI" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm, nfit: fix persistence domain reporting libnvdimm, region: hide persistence_domain when unknown acpi, numa: fix pxm to online numa node associations x86, memremap: fix altmap accounting at free libnvdimm: remove redundant assignment to pointer 'dev' libnvdimm, {btt, blk}: do integrity setup before add_disk() kernel/memremap: Remove stale devres_free() call
2018-03-22x86/mm: implement free pmd/pte page interfacesToshi Kani1-2/+26
Implement pud_free_pmd_page() and pmd_free_pte_page() on x86, which clear a given pud/pmd entry and free up lower level page table(s). The address range associated with the pud/pmd entry must have been purged by INVLPG. Link: http://lkml.kernel.org/r/20180314180155.19492-3-toshi.kani@hpe.com Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Reported-by: Lei Li <lious.lilei@hisilicon.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-22mm/vmalloc: add interfaces to free unmapped page tableToshi Kani2-0/+34
On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may create pud/pmd mappings. A kernel panic was observed on arm64 systems with Cortex-A75 in the following steps as described by Hanjun Guo. 1. ioremap a 4K size, valid page table will build, 2. iounmap it, pte0 will set to 0; 3. ioremap the same address with 2M size, pgd/pmd is unchanged, then set the a new value for pmd; 4. pte0 is leaked; 5. CPU may meet exception because the old pmd is still in TLB, which will lead to kernel panic. This panic is not reproducible on x86. INVLPG, called from iounmap, purges all levels of entries associated with purged address on x86. x86 still has memory leak. The patch changes the ioremap path to free unmapped page table(s) since doing so in the unmap path has the following issues: - The iounmap() path is shared with vunmap(). Since vmap() only supports pte mappings, making vunmap() to free a pte page is an overhead for regular vmap users as they do not need a pte page freed up. - Checking if all entries in a pte page are cleared in the unmap path is racy, and serializing this check is expensive. - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges. Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB purge. Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which clear a given pud/pmd entry and free up a page for the lower level entries. This patch implements their stub functions on x86 and arm64, which work as workaround. [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub] Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings") Reported-by: Lei Li <lious.lilei@hisilicon.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Wang Xuefeng <wxf.wang@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Chintan Pandya <cpandya@codeaurora.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-22h8300: remove extraneous __BIG_ENDIAN definitionArnd Bergmann1-1/+0
A bugfix I did earlier caused a build regression on h8300, which defines the __BIG_ENDIAN macro in a slightly different way than the generic code: arch/h8300/include/asm/byteorder.h:5:0: warning: "__BIG_ENDIAN" redefined We don't need to define it here, as the same macro is already provided by the linux/byteorder/big_endian.h, and that version does not conflict. While this is a v4.16 regression, my earlier patch also got backported to the 4.14 and 4.15 stable kernels, so we need the fixup there as well. Link: http://lkml.kernel.org/r/20180313120752.2645129-1-arnd@arndb.de Fixes: 101110f6271c ("Kbuild: always define endianess in kconfig.h") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-23powerpc/64s: Fix lost pending interrupt due to race causing lost update to ↵Nicholas Piggin1-0/+8
irq_happened force_external_irq_replay() can be called in the do_IRQ path with interrupts hard enabled and soft disabled if may_hard_irq_enable() set MSR[EE]=1. It updates local_paca->irq_happened with a load, modify, store sequence. If a maskable interrupt hits during this sequence, it will go to the masked handler to be marked pending in irq_happened. This update will be lost when the interrupt returns and the store instruction executes. This can result in unpredictable latencies, timeouts, lockups, etc. Fix this by ensuring hard interrupts are disabled before modifying irq_happened. This could cause any maskable asynchronous interrupt to get lost, but it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE, so very high external interrupt rate and high IPI rate. The hang was bisected down to enabling doorbell interrupts for IPIs. These provided an interrupt type that could run at high rates in the do_IRQ path, stressing the race. Fixes: 1d607bb3bd60 ("powerpc/irq: Add mechanism to force a replay of interrupts") Cc: stable@vger.kernel.org # v4.8+ Reported-by: Carol L. Soto <clsoto@us.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-1/+2
Pull networking fixes from David Miller: 1) Always validate XFRM esn replay attribute, from Florian Westphal. 2) Fix RCU read lock imbalance in xfrm_get_tos(), from Xin Long. 3) Don't try to get firmware dump if not loaded in iwlwifi, from Shaul Triebitz. 4) Fix BPF helpers to deal with SCTP GSO SKBs properly, from Daniel Axtens. 5) Fix some interrupt handling issues in e1000e driver, from Benjamin Poitier. 6) Use strlcpy() in several ethtool get_strings methods, from Florian Fainelli. 7) Fix rhlist dup insertion, from Paul Blakey. 8) Fix SKB leak in netem packet scheduler, from Alexey Kodanev. 9) Fix driver unload crash when link is up in smsc911x, from Jeremy Linton. 10) Purge out invalid socket types in l2tp_tunnel_create(), from Eric Dumazet. 11) Need to purge the write queue when TCP connections are aborted, otherwise userspace using MSG_ZEROCOPY can't close the fd. From Soheil Hassas Yeganeh. 12) Fix double free in error path of team driver, from Arkadi Sharshevsky. 13) Filter fixes for hv_netvsc driver, from Stephen Hemminger. 14) Fix non-linear packet access in ipv6 ndisc code, from Lorenzo Bianconi. 15) Properly filter out unsupported feature flags in macvlan driver, from Shannon Nelson. 16) Don't request loading the diag module for a protocol if the protocol itself is not even registered. From Xin Long. 17) If datagram connect fails in ipv6, make sure the socket state is consistent afterwards. From Paolo Abeni. 18) Use after free in qed driver, from Dan Carpenter. 19) If received ipv4 PMTU is less than the min pmtu, lock the mtu in the entry. From Sabrina Dubroca. 20) Fix sleep in atomic in tg3 driver, from Jonathan Toppins. 21) Fix vlan in vlan untagging in some situations, from Toshiaki Makita. 22) Fix double SKB free in genlmsg_mcast(). From Nicolas Dichtel. 23) Fix NULL derefs in error paths of tcf_*_init(), from Davide Caratti. 24) Unbalanced PM runtime calls in FEC driver, from Florian Fainelli. 25) Memory leak in gemini driver, from Igor Pylypiv. 26) IDR leaks in error paths of tcf_*_init() functions, from Davide Caratti. 27) Need to use GFP_ATOMIC in seg6_build_state(), from David Lebrun. 28) Missing dev_put() in error path of macsec_newlink(), from Dan Carpenter. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (201 commits) macsec: missing dev_put() on error in macsec_newlink() net: dsa: Fix functional dsa-loop dependency on FIXED_PHY hv_netvsc: common detach logic hv_netvsc: change GPAD teardown order on older versions hv_netvsc: use RCU to fix concurrent rx and queue changes hv_netvsc: disable NAPI before channel close net/ipv6: Handle onlink flag with multipath routes ppp: avoid loop in xmit recursion detection code ipv6: sr: fix NULL pointer dereference when setting encap source address ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state net: aquantia: driver version bump net: aquantia: Implement pci shutdown callback net: aquantia: Allow live mac address changes net: aquantia: Add tx clean budget and valid budget handling logic net: aquantia: Change inefficient wait loop on fw data reads net: aquantia: Fix a regression with reset on old firmware net: aquantia: Fix hardware reset when SPI may rarely hangup s390/qeth: on channel error, reject further cmd requests s390/qeth: lock read device while queueing next buffer s390/qeth: when thread completes, wake up all waiters ...
2018-03-22MIPS: ralink: Fix booting on MT7621NeilBrown1-20/+22
Since commit 3af5a67c86a3 ("MIPS: Fix early CM probing") the MT7621 has not been able to boot. This commit caused mips_cm_probe() to be called before mt7621.c::proc_soc_init(). prom_soc_init() has a comment explaining that mips_cm_probe() "wipes out the bootloader config" and means that configuration registers are no longer available. It has some code to re-enable this config. Before this re-enable code is run, the sysc register cannot be read, so when SYSC_REG_CHIP_NAME0 is read, a garbage value is returned and panic() is called. If we move the config-repair code to the top of prom_soc_init(), the registers can be read and boot can proceed. Very occasionally, the first register read after the reconfiguration returns garbage, so add a call to __sync(). Fixes: 3af5a67c86a3 ("MIPS: Fix early CM probing") Signed-off-by: NeilBrown <neil@brown.name> Reviewed-by: Matt Redfearn <matt.redfearn@mips.com> Cc: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.5+ Patchwork: https://patchwork.linux-mips.org/patch/18859/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: ralink: Remove ralink_halt()NeilBrown1-7/+0
ralink_halt() does nothing that machine_halt() doesn't already do, so it adds no value. It actually causes incorrect behaviour due to the "unreachable()" at the end. This tells the compiler that the end of the function will never be reached, which isn't true. The compiler responds by not adding a 'return' instruction, so control simply moves on to whatever bytes come afterwards in memory. In my tested, that was the ralink_restart() function. This means that an attempt to 'halt' the machine would actually cause a reboot. So remove ralink_halt() so that a 'halt' really does halt. Fixes: c06e836ada59 ("MIPS: ralink: adds reset code") Signed-off-by: NeilBrown <neil@brown.name> Cc: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.9+ Patchwork: https://patchwork.linux-mips.org/patch/18851/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: lantiq: ase: Enable MFD_SYSCONMathias Kresin1-0/+2
Enable syscon to use it for the RCU MFD on Amazon SE as well. The Amazon SE also has similar reset controller system as Danube and XWAY and use their drivers mostly. As these drivers now need syscon also activate the syscon subsystem for for Amazon SE. Fixes: 2b6639d4c794 ("MIPS: lantiq: Enable MFD_SYSCON to be able to use it for the RCU MFD") Signed-off-by: Mathias Kresin <dev@kresin.me> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: John Crispin <john@phrozen.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.14+ Patchwork: https://patchwork.linux-mips.org/patch/18817/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: lantiq: Enable AHB Bus for USBMathias Kresin1-3/+3
On Danube and AR9 the USB core is connected though a AHB bus to the main system cross bar, hence we need to enable the gating clock of the AHB Bus as well to make the USB controller work. Fixes: dea54fbad332 ("phy: Add an USB PHY driver for the Lantiq SoCs using the RCU module") Signed-off-by: Mathias Kresin <dev@kresin.me> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: John Crispin <john@phrozen.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.14+ Patchwork: https://patchwork.linux-mips.org/patch/18814/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: lantiq: Fix Danube USB clockMathias Kresin1-1/+1
On Danube the USB0 controller registers are at 1e101000 and the USB0 PHY register is at 1f203018 similar to all other lantiq SoCs. Activate the USB controller gating clock thorough the USB controller driver and not the PHY. This fixes a problem introduced in a previous commit. Fixes: dea54fbad332 ("phy: Add an USB PHY driver for the Lantiq SoCs using the RCU module") Signed-off-by: Mathias Kresin <dev@kresin.me> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: John Crispin <john@phrozen.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.14+ Patchwork: https://patchwork.linux-mips.org/patch/18816/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21ARM: OMAP: Fix SRAM W+X mappingTony Lindgren2-11/+36
We are still using custom SRAM code for some SoCs and are not marking the PM code mapped to SRAM as read-only and executable after we're done. With CONFIG_DEBUG_WX=y, we will get "Found insecure W+X mapping at address" warning. Let's fix this issue the same way as commit 728bbe75c82f ("misc: sram: Introduce support code for protect-exec sram type") is doing for drivers/misc/sram-exec.c. On omap3, we need to restore SRAM when returning from off mode after idle, so init time configuration is not enough. And as we no longer have users for omap_sram_push_address() we can make it static while at it. Note that eventually we should be using sram-exec.c for all SoCs. Cc: stable@vger.kernel.org # v4.12+ Cc: Dave Gerlach <d-gerlach@ti.com> Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Tony Lindgren <tony@atomide.com>
2018-03-20kvm/x86: fix icebp instruction handlingLinus Torvalds2-1/+9
The undocumented 'icebp' instruction (aka 'int1') works pretty much like 'int3' in the absense of in-circuit probing equipment (except, obviously, that it raises #DB instead of raising #BP), and is used by some validation test-suites as such. But Andy Lutomirski noticed that his test suite acted differently in kvm than on bare hardware. The reason is that kvm used an inexact test for the icebp instruction: it just assumed that an all-zero VM exit qualification value meant that the VM exit was due to icebp. That is not unlike the guess that do_debug() does for the actual exception handling case, but it's purely a heuristic, not an absolute rule. do_debug() does it because it wants to ascribe _some_ reasons to the #DB that happened, and an empty %dr6 value means that 'icebp' is the most likely casue and we have no better information. But kvm can just do it right, because unlike the do_debug() case, kvm actually sees the real reason for the #DB in the VM-exit interruption information field. So instead of relying on an inexact heuristic, just use the actual VM exit information that says "it was 'icebp'". Right now the 'icebp' instruction isn't technically documented by Intel, but that will hopefully change. The special "privileged software exception" information _is_ actually mentioned in the Intel SDM, even though the cause of it isn't enumerated. Reported-by: Andy Lutomirski <luto@kernel.org> Tested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-20x86/vsyscall/64: Use proper accessor to update P4D entryBoris Ostrovsky1-1/+1
Writing to it directly does not work for Xen PV guests. Fixes: 49275fef986a ("x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy") Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180319143154.3742-1-boris.ostrovsky@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>