summaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2018-05-25KVM: arm64: Save host SVE context as appropriateDave Martin5-2/+40
This patch adds SVE context saving to the hyp FPSIMD context switch path. This means that it is no longer necessary to save the host SVE state in advance of entering the guest, when in use. In order to avoid adding pointless complexity to the code, VHE is assumed if SVE is in use. VHE is an architectural prerequisite for SVE, so there is no good reason to turn CONFIG_ARM64_VHE off in kernels that support both SVE and KVM. Historically, software models exist that can expose the architecturally invalid configuration of SVE without VHE, so if this situation is detected at kvm_init() time then KVM will be disabled. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64/sve: Move sve_pffr() to fpsimd.h and make inlineDave Martin3-13/+24
In order to make sve_save_state()/sve_load_state() more easily reusable and to get rid of a potential branch on context switch critical paths, this patch makes sve_pffr() inline and moves it to fpsimd.h. <asm/processor.h> must be included in fpsimd.h in order to make this work, and this creates an #include cycle that is tricky to avoid without modifying core code, due to the way the PR_SVE_*() prctl helpers are included in the core prctl implementation. Instead of breaking the cycle, this patch defers inclusion of <asm/fpsimd.h> in <asm/processor.h> until the point where it is actually needed: i.e., immediately before the prctl definitions. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64/sve: Switch sve_pffr() argument from task to threadDave Martin1-5/+5
sve_pffr(), which is used to derive the base address used for low-level SVE save/restore routines, currently takes the relevant task_struct as an argument. The only accessed fields are actually part of thread_struct, so this patch changes the argument type accordingly. This is done in preparation for moving this function to a header, where we do not want to have to include <linux/sched.h> due to the consequent circular #include problems. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64/sve: Move read_zcr_features() out of cpufeature.hDave Martin5-29/+32
Having read_zcr_features() inline in cpufeature.h results in that header requiring #includes which make it hard to include <asm/fpsimd.h> elsewhere without triggering header inclusion cycles. This is not a hot-path function and arguably should not be in cpufeature.h in the first place, so this patch moves it to fpsimd.c, compiled conditionally if CONFIG_ARM64_SVE=y. This allows some SVE-related #includes to be dropped from cpufeature.h, which will ease future maintenance. A couple of missing #includes of <asm/fpsimd.h> are exposed by this change under arch/arm64/. This patch adds the missing #includes as necessary. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashingDave Martin8-31/+188
This patch refactors KVM to align the host and guest FPSIMD save/restore logic with each other for arm64. This reduces the number of redundant save/restore operations that must occur, and reduces the common-case IRQ blackout time during guest exit storms by saving the host state lazily and optimising away the need to restore the host state before returning to the run loop. Four hooks are defined in order to enable this: * kvm_arch_vcpu_run_map_fp(): Called on PID change to map necessary bits of current to Hyp. * kvm_arch_vcpu_load_fp(): Set up FP/SIMD for entering the KVM run loop (parse as "vcpu_load fp"). * kvm_arch_vcpu_ctxsync_fp(): Get FP/SIMD into a safe state for re-enabling interrupts after a guest exit back to the run loop. For arm64 specifically, this involves updating the host kernel's FPSIMD context tracking metadata so that kernel-mode NEON use will cause the vcpu's FPSIMD state to be saved back correctly into the vcpu struct. This must be done before re-enabling interrupts because kernel-mode NEON may be used by softirqs. * kvm_arch_vcpu_put_fp(): Save guest FP/SIMD state back to memory and dissociate from the CPU ("vcpu_put fp"). Also, the arm64 FPSIMD context switch code is updated to enable it to save back FPSIMD state for a vcpu, not just current. A few helpers drive this: * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp): mark this CPU as having context fp (which may belong to a vcpu) currently loaded in its registers. This is the non-task equivalent of the static function fpsimd_bind_to_cpu() in fpsimd.c. * task_fpsimd_save(): exported to allow KVM to save the guest's FPSIMD state back to memory on exit from the run loop. * fpsimd_flush_state(): invalidate any context's FPSIMD state that is currently loaded. Used to disassociate the vcpu from the CPU regs on run loop exit. These changes allow the run loop to enable interrupts (and thus softirqs that may use kernel-mode NEON) without having to save the guest's FPSIMD state eagerly. Some new vcpu_arch fields are added to make all this work. Because host FPSIMD state can now be saved back directly into current's thread_struct as appropriate, host_cpu_context is no longer used for preserving the FPSIMD state. However, it is still needed for preserving other things such as the host's system registers. To avoid ABI churn, the redundant storage space in host_cpu_context is not removed for now. arch/arm is not addressed by this patch and continues to use its current save/restore logic. It could provide implementations of the helpers later if desired. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Repurpose vcpu_arch.debug_flags for general-purpose flagsDave Martin6-19/+18
In struct vcpu_arch, the debug_flags field is used to store debug-related flags about the vcpu state. Since we are about to add some more flags related to FPSIMD and SVE, it makes sense to add them to the existing flags field rather than adding new fields. Since there is only one debug_flags flag defined so far, there is plenty of free space for expansion. In preparation for adding more flags, this patch renames the debug_flags field to simply "flags", and updates comments appropriately. The flag definitions are also moved to <asm/kvm_host.h>, since their presence in <asm/kvm_asm.h> was for purely historical reasons: these definitions are not used from asm any more, and not very likely to be as more Hyp asm is migrated to C. KVM_ARM64_DEBUG_DIRTY_SHIFT has not been used since commit 1ea66d27e7b0 ("arm64: KVM: Move away from the assembly version of the world switch"), so this patch gets rid of that too. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com> [maz: fixed minor conflict] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64/sve: Refactor user SVE trap maintenance for external useDave Martin1-15/+15
In preparation for optimising the way KVM manages switching the guest and host FPSIMD state, it is necessary to provide a means for code outside arch/arm64/kernel/fpsimd.c to restore the user trap configuration for SVE correctly for the current task. Rather than requiring external code to duplicate the maintenance explicitly, this patch moves the trap maintenenace to fpsimd_bind_to_cpu(), since it is logically part of the work of associating the current task with the cpu. Because fpsimd_bind_to_cpu() is rather a cryptic name to publish alongside fpsimd_bind_state_to_cpu(), the former function is renamed to fpsimd_bind_task_to_cpu() to make its purpose more explicit. This patch makes appropriate changes to ensure that fpsimd_bind_task_to_cpu() is always called alongside task_fpsimd_load(), so that the trap maintenance continues to be done in every situation where it was done prior to this patch. As a side-effect, the metadata updates done by fpsimd_bind_task_to_cpu() now change from conditional to unconditional in the "already bound" case of sigreturn. This is harmless, and a couple of extra stores on this slow path will not impact performance. I consider this a reasonable price to pay for a slightly cleaner interface. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64: fpsimd: Eliminate task->mm checksDave Martin2-25/+19
Currently the FPSIMD handling code uses the condition task->mm == NULL as a hint that task has no FPSIMD register context. The ->mm check is only there to filter out tasks that cannot possibly have FPSIMD context loaded, for optimisation purposes. Also, TIF_FOREIGN_FPSTATE must always be checked anyway before saving FPSIMD context back to memory. For these reasons, the ->mm checks are not useful, providing that TIF_FOREIGN_FPSTATE is maintained in a consistent way for all threads. The context switch logic is already deliberately optimised to defer reloads of the regs until ret_to_user (or sigreturn as a special case), and save them only if they have been previously loaded. These paths are the only places where the wrong_task and wrong_cpu conditions can be made false, by calling fpsimd_bind_task_to_cpu(). Kernel threads by definition never reach these paths. As a result, the wrong_task and wrong_cpu tests in fpsimd_thread_switch() will always yield true for kernel threads. This patch removes the redundant checks and special-case code, ensuring that TIF_FOREIGN_FPSTATE is set whenever a kernel thread is scheduled in, and ensures that this flag is set for the init task. The fpsimd_flush_task_state() call already present in copy_thread() ensures the same for any new task. With TIF_FOREIGN_FPSTATE always set for kernel threads, this patch ensures that no extra context save work is added for kernel threads, and eliminates the redundant context saving that may currently occur for kernel threads that have acquired an mm via use_mm(). Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64: fpsimd: Avoid FPSIMD context leakage for the init taskDave Martin1-6/+7
The init task is started with thread_flags equal to 0, which means that TIF_FOREIGN_FPSTATE is initially clear. It is theoretically possible (if unlikely) that the init task could reach userspace without ever being scheduled out. If this occurs, data left in the FPSIMD registers by the kernel could be exposed. This patch fixes this anomaly by ensuring that the init task's initial TIF_FOREIGN_FPSTATE is set. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Fixes: 005f78cd8849 ("arm64: defer reloading a task's FPSIMD state to userland resume") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64: fpsimd: Generalise context saving for non-task contextsDave Martin1-12/+14
In preparation for allowing non-task (i.e., KVM vcpu) FPSIMD contexts to be handled by the fpsimd common code, this patch adapts task_fpsimd_save() to save back the currently loaded context, removing the explicit dependency on current. The relevant storage to write back to in memory is now found by examining the fpsimd_last_state percpu struct. fpsimd_save() does nothing unless TIF_FOREIGN_FPSTATE is clear, and fpsimd_last_state is updated under local_bh_disable() or local_irq_disable() everywhere that TIF_FOREIGN_FPSTATE is cleared: thus, fpsimd_save() will write back to the correct storage for the loaded context. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Convert lazy FPSIMD context switch trap to CDave Martin2-35/+46
To make the lazy FPSIMD context switch trap code easier to hack on, this patch converts it to C. This is not amazingly efficient, but the trap should typically only be taken once per host context switch. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64: Use update{,_tsk}_thread_flag()Dave Martin1-10/+8
This patch uses the new update_thread_flag() helpers to simplify a couple of if () set; else clear; constructs. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64: fpsimd: Fix TIF_FOREIGN_FPSTATE after invalidating cpu regsDave Martin1-5/+2
fpsimd_last_state.st is set to NULL as a way of indicating that current's FPSIMD registers are no longer loaded in the cpu. In particular, this is done when the kernel temporarily uses or clobbers the FPSIMD registers for its own purposes, as in CPU PM or kernel-mode NEON, resulting in them being populated with garbage data not belonging to a task. Commit 17eed27b02da ("arm64/sve: KVM: Prevent guests from using SVE") factors this operation out as a new helper fpsimd_flush_cpu_state() to make it clearer what is being done here, and on SVE systems this helper is now used, via kvm_fpsimd_flush_cpu_state(), to invalidate the registers after KVM has run a vcpu. The reason for this is that KVM does not yet understand how to restore the full host SVE registers itself after loading the guest FPSIMD context into them. This exposes a particular problem: if fpsimd_last_state.st is set to NULL without also setting TIF_FOREIGN_FPSTATE, the kernel may continue to think that current's FPSIMD registers are live even though they have actually been clobbered. Prior to the aforementioned commit, the only path where fpsimd_last_state.st is set to NULL without setting TIF_FOREIGN_FPSTATE is when kernel_neon_begin() is called by a kernel thread (where current->mm can be NULL). This does not matter, because the only harm is that at context-switch time fpsimd_thread_switch() may unnecessarily save the FPSIMD registers back to current's thread_struct (even though kernel threads are not considered to have any FPSIMD context of their own and the registers will never be reloaded). Note that although CPU_PM_ENTER lacks the TIF_FOREIGN_FPSTATE setting, every CPU passing through that path must subsequently pass through CPU_PM_EXIT before it can re-enter the kernel proper. CPU_PM_EXIT sets the flag. The sve_flush_cpu_state() function added by commit 17eed27b02da also lacks the proper maintenance of TIF_FOREIGN_FPSTATE. This may cause the bits of a host task's SVE registers that do not alias the FPSIMD register file to spontaneously appear zeroed if a KVM vcpu runs in the same task in the meantime. Although this effect is hidden by the fact that the non-FPSIMD bits of the SVE registers are zeroed by a syscall anyway, it is doubtless a bad idea to rely on these different code paths interacting correctly under future maintenance. This patch makes TIF_FOREIGN_FPSTATE an unconditional side-effect of fpsimd_flush_cpu_state(), and removes the set_thread_flag() calls that become redundant as a result. This ensures that TIF_FOREIGN_FPSTATE cannot remain clear if the FPSIMD state in the FPSIMD registers is invalid. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-20arm64: KVM: Use lm_alias() for kvm_ksym_ref()Mark Rutland1-2/+5
For historical reasons, we open-code lm_alias() in kvm_ksym_ref(). Let's use lm_alias() to avoid duplication and make things clearer. As we have to pull this from <linux/mm.h> (which is not safe for inclusion in assembly), we may as well move the kvm_ksym_ref() definition into the existing !__ASSEMBLY__ block. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-06Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds3-23/+40
Pll KVM fixes from Radim Krčmář: "ARM: - Fix proxying of GICv2 CPU interface accesses - Fix crash when switching to BE - Track source vcpu git GICv2 SGIs - Fix an outdated bit of documentation x86: - Speed up injection of expired timers (for stable)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: remove APIC Timer periodic/oneshot spikes arm64: vgic-v2: Fix proxying of cpuif access KVM: arm/arm64: vgic_init: Cleanup reference to process_maintenance KVM: arm64: Fix order of vcpu_write_sys_reg() arguments KVM: arm/arm64: vgic: Fix source vcpu issues for GICv2 SGI
2018-05-06Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "Unbreak the CPUID CPUID_8000_0008_EBX reload which got dropped when the evaluation of physical and virtual bits which uses the same CPUID leaf was moved out of get_cpu_cap()" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Restore CPUID_8000_0008_EBX reload
2018-05-06Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-11/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull clocksource fixes from Thomas Gleixner: "The recent addition of the early TSC clocksource breaks on machines which have an unstable TSC because in case that TSC is disabled, then the clocksource selection logic falls back to the early TSC which is obviously bogus. That also unearthed a few robustness issues in the clocksource derating code which are addressed as well" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: Rework stale comment clocksource: Consistent de-rate when marking unstable x86/tsc: Fix mark_tsc_unstable() clocksource: Initialize cs->wd_list clocksource: Allow clocksource_mark_unstable() on unregistered clocksources x86/tsc: Always unregister clocksource_tsc_early
2018-05-05KVM: x86: remove APIC Timer periodic/oneshot spikesAnthoine Bourgeois1-17/+20
Since the commit "8003c9ae204e: add APIC Timer periodic/oneshot mode VMX preemption timer support", a Windows 10 guest has some erratic timer spikes. Here the results on a 150000 times 1ms timer without any load: Before 8003c9ae204e | After 8003c9ae204e Max 1834us | 86000us Mean 1100us | 1021us Deviation 59us | 149us Here the results on a 150000 times 1ms timer with a cpu-z stress test: Before 8003c9ae204e | After 8003c9ae204e Max 32000us | 140000us Mean 1006us | 1997us Deviation 140us | 11095us The root cause of the problem is starting hrtimer with an expiry time already in the past can take more than 20 milliseconds to trigger the timer function. It can be solved by forward such past timers immediately, rather than submitting them to hrtimer_start(). In case the timer is periodic, update the target expiration and call hrtimer_start with it. v2: Check if the tsc deadline is already expired. Thank you Mika. v3: Execute the past timers immediately rather than submitting them to hrtimer_start(). v4: Rearm the periodic timer with advance_periodic_target_expiration() a simpler version of set_target_expiration(). Thank you Paolo. Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@blade-group.com> 8003c9ae204e ("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-05-05Merge tag 'kvmarm-fixes-for-4.17-2' of ↵Radim Krčmář2-6/+20
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm KVM/arm fixes for 4.17, take #2 - Fix proxying of GICv2 CPU interface accesses - Fix crash when switching to BE - Track source vcpu git GICv2 SGIs - Fix an outdated bit of documentation
2018-05-04Merge tag 'for-linus-4.17-rc4-tag' of ↵Linus Torvalds1-55/+31
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen cleanup from Juergen Gross: "One cleanup to remove VLAs from the kernel" * tag 'for-linus-4.17-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: Remove use of VLAs
2018-05-04arm64: vgic-v2: Fix proxying of cpuif accessJames Morse1-5/+19
Proxying the cpuif accesses at EL2 makes use of vcpu_data_guest_to_host and co, which check the endianness, which call into vcpu_read_sys_reg... which isn't mapped at EL2 (it was inlined before, and got moved OoL with the VHE optimizations). The result is of course a nice panic. Let's add some specialized cruft to keep the broken platforms that require this hack alive. But, this code used vcpu_data_guest_to_host(), which expected us to write the value to host memory, instead we have trapped the guest's read or write to an mmio-device, and are about to replay it using the host's readl()/writel() which also perform swabbing based on the host endianness. This goes wrong when both host and guest are big-endian, as readl()/writel() will undo the guest's swabbing, causing the big-endian value to be written to device-memory. What needs doing? A big-endian guest will have pre-swabbed data before storing, undo this. If its necessary for the host, writel() will re-swab it. For a read a big-endian guest expects to swab the data after the load. The hosts's readl() will correct for host endianness, giving us the device-memory's value in the register. For a big-endian guest, swab it as if we'd only done the load. For a little-endian guest, nothing needs doing as readl()/writel() leave the correct device-memory value in registers. Tested on Juno with that rarest of things: a big-endian 64K host. Based on a patch from Marc Zyngier. Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com> Fixes: bf8feb39642b ("arm64: KVM: vgic-v2: Add GICV access from HYP") Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-04KVM: arm64: Fix order of vcpu_write_sys_reg() argumentsJames Morse1-1/+1
A typo in kvm_vcpu_set_be()'s call: | vcpu_write_sys_reg(vcpu, SCTLR_EL1, sctlr) causes us to use the 32bit register value as an index into the sys_reg[] array, and sail off the end of the linear map when we try to bring up big-endian secondaries. | Unable to handle kernel paging request at virtual address ffff80098b982c00 | Mem abort info: | ESR = 0x96000045 | Exception class = DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | Data abort info: | ISV = 0, ISS = 0x00000045 | CM = 0, WnR = 1 | swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000002ea0571a | [ffff80098b982c00] pgd=00000009ffff8803, pud=0000000000000000 | Internal error: Oops: 96000045 [#1] PREEMPT SMP | Modules linked in: | CPU: 2 PID: 1561 Comm: kvm-vcpu-0 Not tainted 4.17.0-rc3-00001-ga912e2261ca6-dirty #1323 | Hardware name: ARM Juno development board (r1) (DT) | pstate: 60000005 (nZCv daif -PAN -UAO) | pc : vcpu_write_sys_reg+0x50/0x134 | lr : vcpu_write_sys_reg+0x50/0x134 | Process kvm-vcpu-0 (pid: 1561, stack limit = 0x000000006df4728b) | Call trace: | vcpu_write_sys_reg+0x50/0x134 | kvm_psci_vcpu_on+0x14c/0x150 | kvm_psci_0_2_call+0x244/0x2a4 | kvm_hvc_call_handler+0x1cc/0x258 | handle_hvc+0x20/0x3c | handle_exit+0x130/0x1ec | kvm_arch_vcpu_ioctl_run+0x340/0x614 | kvm_vcpu_ioctl+0x4d0/0x840 | do_vfs_ioctl+0xc8/0x8d0 | ksys_ioctl+0x78/0xa8 | sys_ioctl+0xc/0x18 | el0_svc_naked+0x30/0x34 | Code: 73620291 604d00b0 00201891 1ab10194 (957a33f8) |---[ end trace 4b4a4f9628596602 ]--- Fix the order of the arguments. Fixes: 8d404c4c24613 ("KVM: arm64: Rewrite system register accessors to read/write functions") CC: Christoffer Dall <cdall@cs.columbia.edu> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-4/+14
Pull networking fixes from David Miller: 1) Various sockmap fixes from John Fastabend (pinned map handling, blocking in recvmsg, double page put, error handling during redirect failures, etc.) 2) Fix dead code handling in x86-64 JIT, from Gianluca Borello. 3) Missing device put in RDS IB code, from Dag Moxnes. 4) Don't process fast open during repair mode in TCP< from Yuchung Cheng. 5) Move address/port comparison fixes in SCTP, from Xin Long. 6) Handle add a bond slave's master into a bridge properly, from Hangbin Liu. 7) IPv6 multipath code can operate on unitialized memory due to an assumption that the icmp header is in the linear SKB area. Fix from Eric Dumazet. 8) Don't invoke do_tcp_sendpages() recursively via TLS, from Dave Watson. 9) Fix memory leaks in x86-64 JIT, from Daniel Borkmann. 10) RDS leaks kernel memory to userspace, from Eric Dumazet. 11) DCCP can invoke a tasklet on a freed socket, take a refcount. Also from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits) dccp: fix tasklet usage smc: fix sendpage() call net/smc: handle unregistered buffers net/smc: call consolidation qed: fix spelling mistake: "offloded" -> "offloaded" net/mlx5e: fix spelling mistake: "loobpack" -> "loopback" tcp: restore autocorking rds: do not leak kernel memory to user land qmi_wwan: do not steal interfaces from class drivers ipv4: fix fnhe usage by non-cached routes bpf: sockmap, fix error handling in redirect failures bpf: sockmap, zero sg_size on error when buffer is released bpf: sockmap, fix scatterlist update on error path in send with apply net_sched: fq: take care of throttled flows before reuse ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6" bpf, x64: fix memleak when not converging on calls bpf, x64: fix memleak when not converging after image net/smc: restrict non-blocking connect finish 8139too: Use disable_irq_nosync() in rtl8139_poll_controller() sctp: fix the issue that the cookie-ack with auth can't get processed ...
2018-05-03Merge branch 'parisc-4.17-4' of ↵Linus Torvalds6-6/+21
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "Fix two section mismatches, convert to read_persistent_clock64(), add further documentation regarding the HPMC crash handler and make bzImage the default build target" * 'parisc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix section mismatches parisc: drivers.c: Fix section mismatches parisc: time: Convert read_persistent_clock() to read_persistent_clock64() parisc: Document rules regarding checksum of HPMC handler parisc: Make bzImage default build target
2018-05-02parisc: Fix section mismatchesHelge Deller2-2/+2
Fix three section mismatches: 1) Section mismatch in reference from the function ioread8() to the function .init.text:pcibios_init_bridge() 2) Section mismatch in reference from the function free_initmem() to the function .init.text:map_pages() 3) Section mismatch in reference from the function ccio_ioc_init() to the function .init.text:count_parisc_driver() Signed-off-by: Helge Deller <deller@gmx.de>
2018-05-02parisc: drivers.c: Fix section mismatchesHelge Deller1-3/+4
Fix two section mismatches in drivers.c: 1) Section mismatch in reference from the function alloc_tree_node() to the function .init.text:create_tree_node(). 2) Section mismatch in reference from the function walk_native_bus() to the function .init.text:alloc_pa_dev(). Signed-off-by: Helge Deller <deller@gmx.de>
2018-05-02bpf, x64: fix memleak when not converging on callsDaniel Borkmann1-1/+1
The JIT logic in jit_subprogs() is as follows: for all subprogs we allocate a bpf_prog_alloc(), populate it (prog->is_func = 1 here), and pass it to bpf_int_jit_compile(). If a failure occurred during JIT and prog->jited is not set, then we bail out from attempting to JIT the whole program, and punt to the interpreter instead. In case JITing went successful, we fixup BPF call offsets and do another pass to bpf_int_jit_compile() (extra_pass is true at that point) to complete JITing calls. Given that requires to pass JIT context around addrs and jit_data from x86 JIT are freed in the extra_pass in bpf_int_jit_compile() when calls are involved (if not, they can be freed immediately). However, if in the original pass, the JIT image didn't converge then we leak addrs and jit_data since image itself is NULL, the prog->is_func is set and extra_pass is false in that case, meaning both will become unreachable and are never cleaned up, therefore we need to free as well on !image. Only x64 JIT is affected. Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-02bpf, x64: fix memleak when not converging after imageDaniel Borkmann1-2/+2
While reviewing x64 JIT code, I noticed that we leak the prior allocated JIT image in the case where proglen != oldproglen during the JIT passes. Prior to the commit e0ee9c12157d ("x86: bpf_jit: fix two bugs in eBPF JIT compiler") we would just break out of the loop, and using the image as the JITed prog since it could only shrink in size anyway. After e0ee9c12157d, we would bail out to out_addrs label where we free addrs and jit_data but not the image coming from bpf_jit_binary_alloc(). Fixes: e0ee9c12157d ("x86: bpf_jit: fix two bugs in eBPF JIT compiler") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-02x86/cpu: Restore CPUID_8000_0008_EBX reloadThomas Gleixner1-1/+5
The recent commt which addresses the x86_phys_bits corruption with encrypted memory on CPUID reload after a microcode update lost the reload of CPUID_8000_0008_EBX as well. As a consequence IBRS and IBRS_FW are not longer detected Restore the behaviour by bringing the reload of CPUID_8000_0008_EBX back. This restore has a twist due to the convoluted way the cpuid analysis works: CPUID_8000_0008_EBX is used by AMD to enumerate IBRB, IBRS, STIBP. On Intel EBX is not used. But the speculation control code sets the AMD bits when running on Intel depending on the Intel specific speculation control bits. This was done to use the same bits for alternatives. The change which moved the 8000_0008 evaluation out of get_cpu_cap() broke this nasty scheme due to ordering. So that on Intel the store to CPUID_8000_0008_EBX clears the IBRB, IBRS, STIBP bits which had been set before by software. So the actual CPUID_8000_0008_EBX needs to go back to the place where it was and the phys/virt address space calculation cannot touch it. In hindsight this should have used completely synthetic bits for IBRB, IBRS, STIBP instead of reusing the AMD bits, but that's for 4.18. /me needs to find time to cleanup that steaming pile of ... Fixes: d94a155c59c9 ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption") Reported-by: Jörg Otte <jrg.otte@gmail.com> Reported-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jörg Otte <jrg.otte@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: kirill.shutemov@linux.intel.com Cc: Borislav Petkov <bp@alien8.de Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1805021043510.1668@nanos.tec.linutronix.de
2018-05-02x86/tsc: Fix mark_tsc_unstable()Peter Zijlstra1-7/+5
mark_tsc_unstable() also needs to affect tsc_early, Now that clocksource_mark_unstable() can be used on a clocksource irrespective of its registration state, use it on both tsc_early and tsc. This does however require cs->list to be initialized empty, otherwise it cannot tell the registation state before registation. Fixes: aa83c45762a2 ("x86/tsc: Introduce early tsc clocksource") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Diego Viola <diego.viola@gmail.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: len.brown@intel.com Cc: rjw@rjwysocki.net Cc: rui.zhang@intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180430100344.533326547@infradead.org
2018-05-02x86/tsc: Always unregister clocksource_tsc_earlyPeter Zijlstra1-4/+6
Don't leave the tsc-early clocksource registered if it errors out early. This was reported by Diego, who on his Core2 era machine got TSC invalidated while it was running with tsc-early (due to C-states). This results in keeping tsc-early with very bad effects. Reported-and-Tested-by: Diego Viola <diego.viola@gmail.com> Fixes: aa83c45762a2 ("x86/tsc: Introduce early tsc clocksource") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: len.brown@intel.com Cc: rjw@rjwysocki.net Cc: diego.viola@gmail.com Cc: rui.zhang@intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180430100344.350507853@infradead.org
2018-05-01Merge branch 'for-linus' of ↵Linus Torvalds2-0/+7
git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel Pull hexagon fixes from Richard Kuo: "Some small fixes for module compilation" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel: hexagon: export csum_partial_copy_nocheck hexagon: add memset_io() helper
2018-05-01hexagon: export csum_partial_copy_nocheckArnd Bergmann1-0/+1
This is needed to link ipv6 as a loadable module, which in turn happens in allmodconfig. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
2018-05-01hexagon: add memset_io() helperArnd Bergmann1-0/+6
We already have memcpy_toio(), but not memset_io(), so let's add the obvious version to allow building an allmodconfig kernel without errors like drivers/gpu/drm/ttm/ttm_bo_util.c: In function 'ttm_bo_move_memcpy': drivers/gpu/drm/ttm/ttm_bo_util.c:390:3: error: implicit declaration of function 'memset_io' [-Werror=implicit-function-declaration] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
2018-04-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds2-2/+2
- Fixup license text for oradax driver, from Rob Gardner. - Release device object with put_device() instead of straight kfree(), from Arvind Yadav. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: vio: use put_device() instead of kfree() sparc64: Fix mistake in oradax license text
2018-04-30sparc: vio: use put_device() instead of kfree()Arvind Yadav1-1/+1
Never directly free @dev after calling device_register(), even if it returned an error. Always use put_device() to give up the reference initialized. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-30sparc64: Fix mistake in oradax license textRob Gardner1-1/+1
The license text in both oradax files mistakenly specifies "version 3" of the GNU General Public License. This is corrected to specify "version 2". Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Jonathan Helman <jonathan.helman@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-29Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds12-19/+93
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Another set of x86 related updates: - Fix the long broken x32 version of the IPC user space headers which was noticed by Arnd Bergman in course of his ongoing y2038 work. GLIBC seems to have non broken private copies of these headers so this went unnoticed. - Two microcode fixlets which address some more fallout from the recent modifications in that area: - Unconditionally save the microcode patch, which was only saved when CPU_HOTPLUG was enabled causing failures in the late loading mechanism - Make the later loader synchronization finally work under all circumstances. It was exiting early and causing timeout failures due to a missing synchronization point. - Do not use mwait_play_dead() on AMD systems to prevent excessive power consumption as the CPU cannot go into deep power states from there. - Address an annoying sparse warning due to lost type qualifiers of the vmemmap and vmalloc base address constants. - Prevent reserving crash kernel region on Xen PV as this leads to the wrong perception that crash kernels actually work there which is not the case. Xen PV has its own crash mechanism handled by the hypervisor. - Add missing TLB cpuid values to the table to make the printout on certain machines correct. - Enumerate the new CLDEMOTE instruction - Fix an incorrect SPDX identifier - Remove stale macros" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ipc: Fix x32 version of shmid64_ds and msqid64_ds x86/setup: Do not reserve a crash kernel region if booted on Xen PV x86/cpu/intel: Add missing TLB cpuid values x86/smpboot: Don't use mwait_play_dead() on AMD systems x86/mm: Make vmemmap and vmalloc base address constants unsigned long x86/vector: Remove the unused macro FPU_IRQ x86/vector: Remove the macro VECTOR_OFFSET_START x86/cpufeatures: Enumerate cldemote instruction x86/microcode: Do not exit early from __reload_late() x86/microcode/intel: Save microcode patch unconditionally x86/jailhouse: Fix incorrect SPDX identifier
2018-04-29Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds5-19/+68
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti fixes from Thomas Gleixner: "A set of updates for the x86/pti related code: - Preserve r8-r11 in int $0x80. r8-r11 need to be preserved, but the int$80 entry code removed that quite some time ago. Make it correct again. - A set of fixes for the Global Bit work which went into 4.17 and caused a bunch of interesting regressions: - Triggering a BUG in the page attribute code due to a missing check for early boot stage - Warnings in the page attribute code about holes in the kernel text mapping which are caused by the freeing of the init code. Handle such holes gracefully. - Reduce the amount of kernel memory which is set global to the actual text and do not incidentally overlap with data. - Disable the global bit when RANDSTRUCT is enabled as it partially defeats the hardening. - Make the page protection setup correct for vma->page_prot population again. The adjustment of the protections fell through the crack during the Global bit rework and triggers warnings on machines which do not support certain features, e.g. NX" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry/64/compat: Preserve r8-r11 in int $0x80 x86/pti: Filter at vma->vm_page_prot population x86/pti: Disallow global kernel text with RANDSTRUCT x86/pti: Reduce amount of kernel text allowed to be Global x86/pti: Fix boot warning from Global-bit setting x86/pti: Fix boot problems from Global-bit setting
2018-04-29Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "The perf update contains the following bits: x86: - Prevent setting freeze_on_smi on PerfMon V1 CPUs to avoid #GP perf stat: - Keep the '/' event modifier separator in fallback, for example when fallbacking from 'cpu/cpu-cycles/' to user level only, where it should become 'cpu/cpu-cycles/u' and not 'cpu/cpu-cycles/:u' (Jiri Olsa) - Fix PMU events parsing rule, improving error reporting for invalid events (Jiri Olsa) - Disable write_backward and other event attributes for !group events in a group, fixing, for instance this group: '{cycles,msr/aperf/}:S' that has leader sampling (:S) and where just the 'cycles', the leader event, should have the write_backward attribute set, in this case it all fails because the PMU where 'msr/aperf/' lives doesn't accepts write_backward style sampling (Jiri Olsa) - Only fall back group read for leader (Kan Liang) - Fix core PMU alias list for x86 platform (Kan Liang) - Print out hint for mixed PMU group error (Kan Liang) - Fix duplicate PMU name for interval print (Kan Liang) Core: - Set main kernel end address properly when reading kernel and module maps (Namhyung Kim) perf mem: - Fix incorrect entries and add missing man options (Sangwon Hong) s/390: - Remove s390 specific strcmp_cpuid_cmp function (Thomas Richter) - Adapt 'perf test' case record+probe_libc_inet_pton.sh for s390 - Fix s390 undefined record__auxtrace_init() return value in 'perf record' (Thomas Richter)" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Don't enable freeze-on-smi for PerfMon V1 perf stat: Fix duplicate PMU name for interval print perf evsel: Only fall back group read for leader perf stat: Print out hint for mixed PMU group error perf pmu: Fix core PMU alias list for X86 platform perf record: Fix s390 undefined record__auxtrace_init() return value perf mem: Document incorrect and missing options perf evsel: Disable write_backward for leader sampling group events perf pmu: Fix pmu events parsing rule perf stat: Keep the / modifier separator in fallback perf test: Adapt test case record+probe_libc_inet_pton.sh for s390 perf list: Remove s390 specific strcmp_cpuid_cmp function perf machine: Set main kernel end address properly
2018-04-28Merge tag 'powerpc-4.17-4' of ↵Linus Torvalds8-48/+132
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A bunch of fixes, mostly for existing code and going to stable. Our memory hot-unplug path wasn't flushing the cache before removing memory. That is a problem now that we are doing memory hotplug on bare metal. Three fixes for the NPU code that supports devices connected via NVLink (ie. GPUs). The main one tweaks the TLB flush algorithm to avoid soft lockups for large flushes. A fix for our memory error handling where we would loop infinitely, returning back to the bad access and hard lockup the CPU. Fixes for the OPAL RTC driver, which wasn't handling some error cases correctly. A fix for a hardlockup in the powernv cpufreq driver. And finally two fixes to our smp_send_stop(), required due to a recent change to use it on shutdown. Thanks to: Alistair Popple, Balbir Singh, Laurentiu Tudor, Mahesh Salgaonkar, Mark Hairgrove, Nicholas Piggin, Rashmica Gupta, Shilpasri G Bhat" * tag 'powerpc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/kvm/booke: Fix altivec related build break powerpc: Fix deadlock with multiple calls to smp_send_stop cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer interrupt powerpc: Fix smp_send_stop NMI IPI handling rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops powerpc/mce: Fix a bug where mce loops on memory UE. powerpc/powernv/npu: Do a PID GPU TLB flush when invalidating a large address range powerpc/powernv/npu: Prevent overwriting of pnv_npu2_init_contex() callback parameters powerpc/powernv/npu: Add lock to prevent race in concurrent context init/destroy powerpc/powernv/memtrace: Let the arch hotunplug code flush cache powerpc/mm: Flush cache on memory hot(un)plug
2018-04-27rMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds9-22/+50
Pull KVM fixes from Radim Krčmář: "ARM: - PSCI selection API, a leftover from 4.16 (for stable) - Kick vcpu on active interrupt affinity change - Plug a VMID allocation race on oversubscribed systems - Silence debug messages - Update Christoffer's email address (linaro -> arm) x86: - Expose userspace-relevant bits of a newly added feature - Fix TLB flushing on VMX with VPID, but without EPT" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: x86/headers/UAPI: Move DISABLE_EXITS KVM capability bits to the UAPI kvm: apic: Flush TLB after APIC mode/address change if VPIDs are in use arm/arm64: KVM: Add PSCI version selection API KVM: arm/arm64: vgic: Kick new VCPU on interrupt migration arm64: KVM: Demote SVE and LORegion warnings to debug only MAINTAINERS: Update e-mail address for Christoffer Dall KVM: arm/arm64: Close VMID generation race
2018-04-27Merge tag 'arm64-fixes' of ↵Linus Torvalds10-17/+27
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Nothing too bad, but the spectre updates to smatch identified a few places that may need sanitising so we've got those covered. Details: - Close some potential spectre-v1 vulnerabilities found by smatch - Add missing list sentinel for CPUs that don't require KPTI - Removal of unused 'addr' parameter for I/D cache coherency - Removal of redundant set_fs(KERNEL_DS) calls in ptrace - Fix single-stepping state machine handling in response to kernel traps - Clang support for 128-bit integers - Avoid instrumenting our out-of-line atomics in preparation for enabling LSE atomics by default in 4.18" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: avoid instrumenting atomic_ll_sc.o KVM: arm/arm64: vgic: fix possible spectre-v1 in vgic_mmio_read_apr() KVM: arm/arm64: vgic: fix possible spectre-v1 in vgic_get_irq() arm64: fix possible spectre-v1 in ptrace_hbp_get_event() arm64: support __int128 with clang arm64: only advance singlestep for user instruction traps arm64/kernel: rename module_emit_adrp_veneer->module_emit_veneer_for_adrp arm64: ptrace: remove addr_limit manipulation arm64: mm: drop addr parameter from sync icache and dcache arm64: add sentinel to kpti_safe_list
2018-04-27x86/headers/UAPI: Move DISABLE_EXITS KVM capability bits to the UAPIKarimAllah Ahmed1-7/+0
Move DISABLE_EXITS KVM capability bits to the UAPI just like the rest of capabilities. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-04-27Merge tag 'armsoc-fixes' of ↵Linus Torvalds18-71/+196
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "This round of fixes has two larger changes that came in last week: - a couple of patches all intended to finally turn on USB support on various Amlogic SoC based boards. The respective driver were not finalized until very late before the merge window and the DT portion is the last bit now. - a defconfig update for gemini that had repeatedly missed the cut but that is required to actually boot any real machines with the default build. The rest are the usual small changes: - a fix for a nasty build regression on the OMAP memory drivers - a fix for a boot problem on Intel/Altera SocFPGA - a MAINTAINER file update - a couple of fixes for issues found by automated testing (kernelci, coverity, sparse, ...) - a few incorrect DT entries are updated to match the hardware" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: defconfig: Update Gemini defconfig ARM: s3c24xx: jive: Fix some GPIO names HISI LPC: Add Kconfig MFD_CORE dependency ARM: dts: Fix NAS4220B pin config MAINTAINERS: Remove myself as maintainer arm64: dts: correct SATA addresses for Stingray ARM64: dts: meson-gxm-khadas-vim2: enable the USB controller ARM64: dts: meson-gxl-nexbox-a95x: enable the USB controller ARM64: dts: meson-gxl-s905x-libretech-cc: enable the USB controller ARM64: dts: meson-gx-p23x-q20x: enable the USB controller ARM64: dts: meson-gxl-s905x-p212: enable the USB controller ARM64: dts: meson-gxm: add GXM specific USB host configuration ARM64: dts: meson-gxl: add USB host support ARM: OMAP2+: Fix build when using split object directories soc: bcm2835: Make !RASPBERRYPI_FIRMWARE dummies return failure soc: bcm: raspberrypi-power: Fix use of __packed ARM: dts: Fix cm2 and prm sizes for omap4 ARM: socfpga_defconfig: Remove QSPI Sector 4K size force firmware: arm_scmi: remove redundant null check on array arm64: dts: juno: drop unnecessary address-cells and size-cells properties
2018-04-27kvm: apic: Flush TLB after APIC mode/address change if VPIDs are in useJunaid Shahid1-10/+4
Currently, KVM flushes the TLB after a change to the APIC access page address or the APIC mode when EPT mode is enabled. However, even in shadow paging mode, a TLB flush is needed if VPIDs are being used, as specified in the Intel SDM Section 29.4.5. So replace vmx_flush_tlb_ept_only() with vmx_flush_tlb(), which will flush if either EPT or VPIDs are in use. Signed-off-by: Junaid Shahid <junaids@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-04-27x86/entry/64/compat: Preserve r8-r11 in int $0x80Andy Lutomirski1-4/+4
32-bit user code that uses int $80 doesn't care about r8-r11. There is, however, some 64-bit user code that intentionally uses int $0x80 to invoke 32-bit system calls. From what I've seen, basically all such code assumes that r8-r15 are all preserved, but the kernel clobbers r8-r11. Since I doubt that there's any code that depends on int $0x80 zeroing r8-r11, change the kernel to preserve them. I suspect that very little user code is broken by the old clobber, since r8-r11 are only rarely allocated by gcc, and they're clobbered by function calls, so they only way we'd see a problem is if the same function that invokes int $0x80 also spills something important to one of these registers. The current behavior seems to date back to the historical commit "[PATCH] x86-64 merge for 2.6.4". Before that, all regs were preserved. I can't find any explanation of why this change was made. Update the test_syscall_vdso_32 testcase as well to verify the new behavior, and it strengthens the test to make sure that the kernel doesn't accidentally permute r8..r15. Suggested-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Link: https://lkml.kernel.org/r/d4c4d9985fbe64f8c9e19291886453914b48caee.1523975710.git.luto@kernel.org
2018-04-27x86/ipc: Fix x32 version of shmid64_ds and msqid64_dsArnd Bergmann2-0/+73
A bugfix broke the x32 shmid64_ds and msqid64_ds data structure layout (as seen from user space) a few years ago: Originally, __BITS_PER_LONG was defined as 64 on x32, so we did not have padding after the 64-bit __kernel_time_t fields, After __BITS_PER_LONG got changed to 32, applications would observe extra padding. In other parts of the uapi headers we seem to have a mix of those expecting either 32 or 64 on x32 applications, so we can't easily revert the path that broke these two structures. Instead, this patch decouples x32 from the other architectures and moves it back into arch specific headers, partially reverting the even older commit 73a2d096fdf2 ("x86: remove all now-duplicate header files"). It's not clear whether this ever made any difference, since at least glibc carries its own (correct) copy of both of these header files, so possibly no application has ever observed the definitions here. Based on a suggestion from H.J. Lu, I tried out the tool from https://github.com/hjl-tools/linux-header to find other such bugs, which pointed out the same bug in statfs(), which also has a separate (correct) copy in glibc. Fixes: f4b4aae18288 ("x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H . J . Lu" <hjl.tools@gmail.com> Cc: Jeffrey Walton <noloader@gmail.com> Cc: stable@vger.kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/20180424212013.3967461-1-arnd@arndb.de
2018-04-27x86/setup: Do not reserve a crash kernel region if booted on Xen PVPetr Tesarik1-0/+6
Xen PV domains cannot shut down and start a crash kernel. Instead, the crashing kernel makes a SCHEDOP_shutdown hypercall with the reason code SHUTDOWN_crash, cf. xen_crash_shutdown() machine op in arch/x86/xen/enlighten_pv.c. A crash kernel reservation is merely a waste of RAM in this case. It may also confuse users of kexec_load(2) and/or kexec_file_load(2). When flags include KEXEC_ON_CRASH or KEXEC_FILE_ON_CRASH, respectively, these syscalls return success, which is technically correct, but the crash kexec image will never be actually used. Signed-off-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: xen-devel@lists.xenproject.org Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jean Delvare <jdelvare@suse.de> Link: https://lkml.kernel.org/r/20180425120835.23cef60c@ezekiel.suse.cz
2018-04-27arm64: avoid instrumenting atomic_ll_sc.oMark Rutland1-0/+4
Our out-of-line atomics are built with a special calling convention, preventing pointless stack spilling, and allowing us to patch call sites with ARMv8.1 atomic instructions. Instrumentation inserted by the compiler may result in calls to functions not following this special calling convention, resulting in registers being unexpectedly clobbered, and various problems resulting from this. For example, if a kernel is built with KCOV and ARM64_LSE_ATOMICS, the compiler inserts calls to __sanitizer_cov_trace_pc in the prologues of the atomic functions. This has been observed to result in spurious cmpxchg failures, leading to a hang early on in the boot process. This patch avoids such issues by preventing instrumentation of our out-of-line atomics. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>