summaryrefslogtreecommitdiffstats
path: root/arch/arm64/include/asm/kvm_emulate.h
AgeCommit message (Collapse)AuthorFilesLines
2020-12-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-56/+14
Pull KVM updates from Paolo Bonzini: "Much x86 work was pushed out to 5.12, but ARM more than made up for it. ARM: - PSCI relay at EL2 when "protected KVM" is enabled - New exception injection code - Simplification of AArch32 system register handling - Fix PMU accesses when no PMU is enabled - Expose CSV3 on non-Meltdown hosts - Cache hierarchy discovery fixes - PV steal-time cleanups - Allow function pointers at EL2 - Various host EL2 entry cleanups - Simplification of the EL2 vector allocation s390: - memcg accouting for s390 specific parts of kvm and gmap - selftest for diag318 - new kvm_stat for when async_pf falls back to sync x86: - Tracepoints for the new pagetable code from 5.10 - Catch VFIO and KVM irqfd events before userspace - Reporting dirty pages to userspace with a ring buffer - SEV-ES host support - Nested VMX support for wait-for-SIPI activity state - New feature flag (AVX512 FP16) - New system ioctl to report Hyper-V-compatible paravirtualization features Generic: - Selftest improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: SVM: fix 32-bit compilation KVM: SVM: Add AP_JUMP_TABLE support in prep for AP booting KVM: SVM: Provide support to launch and run an SEV-ES guest KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests KVM: SVM: Provide support for SEV-ES vCPU loading KVM: SVM: Provide support for SEV-ES vCPU creation/loading KVM: SVM: Update ASID allocation to support SEV-ES guests KVM: SVM: Set the encryption mask for the SVM host save area KVM: SVM: Add NMI support for an SEV-ES guest KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest KVM: SVM: Do not report support for SMM for an SEV-ES guest KVM: x86: Update __get_sregs() / __set_sregs() to support SEV-ES KVM: SVM: Add support for CR8 write traps for an SEV-ES guest KVM: SVM: Add support for CR4 write traps for an SEV-ES guest KVM: SVM: Add support for CR0 write traps for an SEV-ES guest KVM: SVM: Add support for EFER write traps for an SEV-ES guest KVM: SVM: Support string IO operations for an SEV-ES guest KVM: SVM: Support MMIO for an SEV-ES guest KVM: SVM: Create trace events for VMGEXIT MSR protocol processing KVM: SVM: Create trace events for VMGEXIT processing ...
2020-12-02KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort()Yanan Wang1-0/+5
If we get a FSC_PERM fault, just using (logging_active && writable) to determine calling kvm_pgtable_stage2_map(). There will be two more cases we should consider. (1) After logging_active is configged back to false from true. When we get a FSC_PERM fault with write_fault and adjustment of hugepage is needed, we should merge tables back to a block entry. This case is ignored by still calling kvm_pgtable_stage2_relax_perms(), which will lead to an endless loop and guest panic due to soft lockup. (2) We use (FSC_PERM && logging_active && writable) to determine collapsing a block entry into a table by calling kvm_pgtable_stage2_map(). But sometimes we may only need to relax permissions when trying to write to a page other than a block. In this condition,using kvm_pgtable_stage2_relax_perms() will be fine. The ISS filed bit[1:0] in ESR_EL2 regesiter indicates the stage2 lookup level at which a D-abort or I-abort occurred. By comparing granule of the fault lookup level with vma_pagesize, we can strictly distinguish conditions of calling kvm_pgtable_stage2_relax_perms() or kvm_pgtable_stage2_map(), and the above two cases will be well considered. Suggested-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201201201034.116760-4-wangyanan55@huawei.com
2020-11-10KVM: arm64: Get rid of the AArch32 register mapping codeMarc Zyngier1-2/+0
The only use of the register mapping code was for the sake of the LR mapping, which we trivially solved in a previous patch. Get rid of the whole thing now. Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10KVM: arm64: Consolidate exception injectionMarc Zyngier1-3/+0
Move the AArch32 exception injection code back into the inject_fault.c file, removing the need for a few non-static functions now that AArch32 host support is a thing of the past. Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10KVM: arm64: Remove SPSR manipulation primitivesMarc Zyngier1-26/+0
The SPSR setting code is now completely unused, including that dealing with banked AArch32 SPSRs. Cleanup time. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10KVM: arm64: Inject AArch64 exceptions from HYPMarc Zyngier1-0/+12
Move the AArch64 exception injection code from EL1 to HYP, leaving only the ESR_EL1 updates to EL1. In order to come with the differences between VHE and nVHE, two set of system register accessors are provided. SPSR, ELR, PC and PSTATE are now completely handled in the hypervisor. Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10KVM: arm64: Make kvm_skip_instr() and co private to HYPMarc Zyngier1-25/+2
In an effort to remove the vcpu PC manipulations from EL1 on nVHE systems, move kvm_skip_instr() to be HYP-specific. EL1's intent to increment PC post emulation is now signalled via a flag in the vcpu structure. Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10KVM: arm64: Move kvm_vcpu_trap_il_is32bit into kvm_skip_instr32()Marc Zyngier1-4/+4
There is no need to feed the result of kvm_vcpu_trap_il_is32bit() to kvm_skip_instr(), as only AArch32 has a variable length ISA, and this helper can equally be called from kvm_skip_instr32(), reducing the complexity at all the call sites. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-10-12Merge tag 'arm64-upstream' of ↵Linus Torvalds1-14/+0
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "There's quite a lot of code here, but much of it is due to the addition of a new PMU driver as well as some arm64-specific selftests which is an area where we've traditionally been lagging a bit. In terms of exciting features, this includes support for the Memory Tagging Extension which narrowly missed 5.9, hopefully allowing userspace to run with use-after-free detection in production on CPUs that support it. Work is ongoing to integrate the feature with KASAN for 5.11. Another change that I'm excited about (assuming they get the hardware right) is preparing the ASID allocator for sharing the CPU page-table with the SMMU. Those changes will also come in via Joerg with the IOMMU pull. We do stray outside of our usual directories in a few places, mostly due to core changes required by MTE. Although much of this has been Acked, there were a couple of places where we unfortunately didn't get any review feedback. Other than that, we ran into a handful of minor conflicts in -next, but nothing that should post any issues. Summary: - Userspace support for the Memory Tagging Extension introduced by Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11. - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context switching. - Fix and subsequent rewrite of our Spectre mitigations, including the addition of support for PR_SPEC_DISABLE_NOEXEC. - Support for the Armv8.3 Pointer Authentication enhancements. - Support for ASID pinning, which is required when sharing page-tables with the SMMU. - MM updates, including treating flush_tlb_fix_spurious_fault() as a no-op. - Perf/PMU driver updates, including addition of the ARM CMN PMU driver and also support to handle CPU PMU IRQs as NMIs. - Allow prefetchable PCI BARs to be exposed to userspace using normal non-cacheable mappings. - Implementation of ARCH_STACKWALK for unwinding. - Improve reporting of unexpected kernel traps due to BPF JIT failure. - Improve robustness of user-visible HWCAP strings and their corresponding numerical constants. - Removal of TEXT_OFFSET. - Removal of some unused functions, parameters and prototypes. - Removal of MPIDR-based topology detection in favour of firmware description. - Cleanups to handling of SVE and FPSIMD register state in preparation for potential future optimisation of handling across syscalls. - Cleanups to the SDEI driver in preparation for support in KVM. - Miscellaneous cleanups and refactoring work" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits) Revert "arm64: initialize per-cpu offsets earlier" arm64: random: Remove no longer needed prototypes arm64: initialize per-cpu offsets earlier kselftest/arm64: Check mte tagged user address in kernel kselftest/arm64: Verify KSM page merge for MTE pages kselftest/arm64: Verify all different mmap MTE options kselftest/arm64: Check forked child mte memory accessibility kselftest/arm64: Verify mte tag inclusion via prctl kselftest/arm64: Add utilities and a test to validate mte memory perf: arm-cmn: Fix conversion specifiers for node type perf: arm-cmn: Fix unsigned comparison to less than zero arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option arm64: Pull in task_stack_page() to Spectre-v4 mitigation code KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled arm64: Get rid of arm64_ssbd_state KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state() KVM: arm64: Get rid of kvm_arm_have_ssbd() KVM: arm64: Simplify handling of ARCH_WORKAROUND_2 ...
2020-09-29KVM: arm64: Simplify handling of ARCH_WORKAROUND_2Marc Zyngier1-14/+0
Owing to the fact that the host kernel is always mitigated, we can drastically simplify the WA2 handling by keeping the mitigation state ON when entering the guest. This means the guest is either unaffected or not mitigated. This results in a nice simplification of the mitigation space, and the removal of a lot of code that was never really used anyway. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-18KVM: arm64: Remove S1PTW check from kvm_vcpu_dabt_iswrite()Marc Zyngier1-2/+2
Now that kvm_vcpu_trap_is_write_fault() checks for S1PTW, there is no need for kvm_vcpu_dabt_iswrite() to do the same thing, as we already check for this condition on all existing paths. Drop the check and add a comment instead. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200915104218.1284701-3-maz@kernel.org
2020-09-18KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetchMarc Zyngier1-2/+10
KVM currently assumes that an instruction abort can never be a write. This is in general true, except when the abort is triggered by a S1PTW on instruction fetch that tries to update the S1 page tables (to set AF, for example). This can happen if the page tables have been paged out and brought back in without seeing a direct write to them (they are thus marked read only), and the fault handling code will make the PT executable(!) instead of writable. The guest gets stuck forever. In these conditions, the permission fault must be considered as a write so that the Stage-1 update can take place. This is essentially the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM: Take S1 walks into account when determining S2 write faults"). Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce kvm_vcpu_trap_is_exec_fault() that only return true when no faulting on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't specific to data abort. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200915104218.1284701-2-maz@kernel.org
2020-07-30Merge branch 'kvm-arm64/misc-5.9' into kvmarm-master/nextMarc Zyngier1-1/+1
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-30KVM: arm64: Rename kvm_vcpu_dabt_isextabt()Will Deacon1-1/+1
kvm_vcpu_dabt_isextabt() is not specific to data aborts and, unlike kvm_vcpu_dabt_issext(), has nothing to do with sign extension. Rename it to 'kvm_vcpu_abt_issea()'. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200729102821.23392-2-will@kernel.org
2020-07-28Merge branch 'kvm-arm64/misc-5.9' into kvmarm-master/next-WIPMarc Zyngier1-17/+17
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07KVM: arm64: Move SPSR_EL1 to the system register arrayMarc Zyngier1-2/+2
SPSR_EL1 being a VNCR-capable register with ARMv8.4-NV, move it to the sysregs array and update the accessors. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07KVM: arm64: Disintegrate SPSR arrayMarc Zyngier1-2/+2
As we're about to move SPSR_EL1 into the VNCR page, we need to disassociate it from the rest of the 32bit cruft. Let's break the array into individual fields. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07KVM: arm64: Move ELR_EL1 to the system register arrayMarc Zyngier1-21/+0
As ELR-EL1 is a VNCR-capable register with ARMv8.4-NV, let's move it to the sys_regs array and repaint the accessors. While we're at it, let's kill the now useless accessors used only on the fault injection path. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07KVM: arm64: Make struct kvm_regs userspace-onlyMarc Zyngier1-9/+9
struct kvm_regs is used by userspace to indicate which register gets accessed by the {GET,SET}_ONE_REG API. But as we're about to refactor the layout of the in-kernel register structures, we need the kernel to move away from it. Let's make kvm_regs userspace only, and let the kernel map it to its own internal representation. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-05KVM: arm64: Rename HSR to ESRGavin Shan1-17/+17
kvm/arm32 isn't supported since commit 541ad0150ca4 ("arm: Remove 32bit KVM host support"). So HSR isn't meaningful since then. This renames HSR to ESR accordingly. This shouldn't cause any functional changes: * Rename kvm_vcpu_get_hsr() to kvm_vcpu_get_esr() to make the function names self-explanatory. * Rename variables from @hsr to @esr to make them self-explanatory. Note that the renaming on uapi and tracepoint will cause ABI changes, which we should avoid. Specificly, there are 4 related source files in this regard: * arch/arm64/include/uapi/asm/kvm.h (struct kvm_debug_exit_arch::hsr) * arch/arm64/kvm/handle_exit.c (struct kvm_debug_exit_arch::hsr) * arch/arm64/kvm/trace_arm.h (tracepoints) * arch/arm64/kvm/trace_handle_exit.h (tracepoints) Signed-off-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Andrew Scull <ascull@google.com> Link: https://lore.kernel.org/r/20200630015705.103366-1-gshan@redhat.com
2020-07-05KVM: arm64: Remove __hyp_text macro, use build rules insteadDavid Brazdil1-1/+1
With nVHE code now fully separated from the rest of the kernel, the effects of the __hyp_text macro (which had to be applied on all nVHE code) can be achieved with build rules instead. The macro used to: (a) move code to .hyp.text ELF section, now done by renaming .text using `objcopy`, and (b) `notrace` and `__noscs` would negate effects of CC_FLAGS_FTRACE and CC_FLAGS_SCS, respectivelly, now those flags are erased from KBUILD_CFLAGS (same way as in EFI stub). Note that by removing __hyp_text from code shared with VHE, all VHE code is now compiled into .text and without `notrace` and `__noscs`. Use of '.pushsection .hyp.text' removed from assembly files as this is now also covered by the build rules. For MAINTAINERS: if needed to re-run, uses of macro were removed with the following command. Formatting was fixed up manually. find arch/arm64/kvm/hyp -type f -name '*.c' -o -name '*.h' \ -exec sed -i 's/ __hyp_text//g' {} + Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-15-dbrazdil@google.com
2020-06-11Merge tag 'kvmarm-fixes-5.8-1' of ↵Paolo Bonzini1-6/+0
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for Linux 5.8, take #1 * 32bit VM fixes: - Fix embarassing mapping issue between AArch32 CSSELR and AArch64 ACTLR - Add ACTLR2 support for AArch32 - Get rid of the useless ACTLR_EL1 save/restore - Fix CP14/15 accesses for AArch32 guests on BE hosts - Ensure that we don't loose any state when injecting a 32bit exception when running on a VHE host * 64bit VM fixes: - Fix PtrAuth host saving happening in preemptible contexts - Optimize PtrAuth lazy enable - Drop vcpu to cpu context pointer - Fix sparse warnings for HYP per-CPU accesses
2020-06-09KVM: arm64: Save the host's PtrAuth keys in non-preemptible contextMarc Zyngier1-6/+0
When using the PtrAuth feature in a guest, we need to save the host's keys before allowing the guest to program them. For that, we dump them in a per-CPU data structure (the so called host context). But both call sites that do this are in preemptible context, which may end up in disaster should the vcpu thread get preempted before reentering the guest. Instead, save the keys eagerly on each vcpu_load(). This has an increased overhead, but is at least safe. Cc: stable@vger.kernel.org Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-05Merge branch 'for-next/bti-user' into for-next/btiWill Deacon1-2/+4
Merge in user support for Branch Target Identification, which narrowly missed the cut for 5.7 after a late ABI concern. * for-next/bti-user: arm64: bti: Document behaviour for dynamically linked binaries arm64: elf: Fix allnoconfig kernel build with !ARCH_USE_GNU_PROPERTY arm64: BTI: Add Kconfig entry for userspace BTI mm: smaps: Report arm64 guarded pages in smaps arm64: mm: Display guarded pages in ptdump KVM: arm64: BTI: Reset BTYPE when skipping emulated instructions arm64: BTI: Reset BTYPE when skipping emulated instructions arm64: traps: Shuffle code to eliminate forward declarations arm64: unify native/compat instruction skipping arm64: BTI: Decode BYTPE bits when printing PSTATE arm64: elf: Enable BTI at exec based on ELF program properties elf: Allow arch to tweak initial mmap prot flags arm64: Basic Branch Target Identification support ELF: Add ELF program property parsing support ELF: UAPI and Kconfig additions for ELF program properties
2020-03-24KVM: arm64: GICv4.1: Allow non-trapping WFI when using HW SGIsMarc Zyngier1-1/+2
Just like for VLPIs, it is beneficial to avoid trapping on WFI when the vcpu is using the GICv4.1 SGIs. Add such a check to vcpu_clear_wfx_traps(). Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-23-maz@kernel.org
2020-03-16KVM: arm64: BTI: Reset BTYPE when skipping emulated instructionsDave Martin1-2/+4
Since normal execution of any non-branch instruction resets the PSTATE BTYPE field to 0, so do the same thing when emulating a trapped instruction. Branches don't trap directly, so we should never need to assign a non-zero value to BTYPE here. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-02-22KVM: arm64: Ask the compiler to __always_inline functions used at HYPJames Morse1-24/+24
On non VHE CPUs, KVM's __hyp_text contains code run at EL2 while the rest of the kernel runs at EL1. This code lives in its own section with start and end markers so we can map it to EL2. The compiler may decide not to inline static-inline functions from the header file. It may also decide not to put these out-of-line functions in the same section, meaning they aren't mapped when called at EL2. Clang-9 does exactly this with __kern_hyp_va() and a few others when x18 is reserved for the shadow call stack. Add the additional __always_ hint to all the static-inlines that are called from a hyp file. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200220165839.256881-2-james.morse@arm.com ---- kvm_get_hyp_vector() pulls in all the regular per-cpu accessors and this_cpu_has_cap(), fortunately its only called for VHE.
2020-01-23KVM: arm/arm64: Cleanup MMIO handlingMarc Zyngier1-2/+1
Our MMIO handling is a bit odd, in the sense that it uses an intermediate per-vcpu structure to store the various decoded information that describe the access. But the same information is readily available in the HSR/ESR_EL2 field, and we actually use this field to populate the structure. Let's simplify the whole thing by getting rid of the superfluous structure and save a (tiny) bit of space in the vcpu structure. [32bit fix courtesy of Olof Johansson <olof@lixom.net>] Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-01-19KVM: arm/arm64: Correct AArch32 SPSR on exception entryMark Rutland1-0/+32
Confusingly, there are three SPSR layouts that a kernel may need to deal with: (1) An AArch64 SPSR_ELx view of an AArch64 pstate (2) An AArch64 SPSR_ELx view of an AArch32 pstate (3) An AArch32 SPSR_* view of an AArch32 pstate When the KVM AArch32 support code deals with SPSR_{EL2,HYP}, it's either dealing with #2 or #3 consistently. On arm64 the PSR_AA32_* definitions match the AArch64 SPSR_ELx view, and on arm the PSR_AA32_* definitions match the AArch32 SPSR_* view. However, when we inject an exception into an AArch32 guest, we have to synthesize the AArch32 SPSR_* that the guest will see. Thus, an AArch64 host needs to synthesize layout #3 from layout #2. This patch adds a new host_spsr_to_spsr32() helper for this, and makes use of it in the KVM AArch32 support code. For arm64 we need to shuffle the DIT bit around, and remove the SS bit, while for arm we can use the value as-is. I've open-coded the bit manipulation for now to avoid having to rework the existing PSR_* definitions into PSR64_AA32_* and PSR32_AA32_* definitions. I hope to perform a more thorough refactoring in future so that we can handle pstate view manipulation more consistently across the kernel tree. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200108134324.46500-4-mark.rutland@arm.com
2020-01-19KVM: arm64: Only sign-extend MMIO up to register widthChristoffer Dall1-0/+5
On AArch64 you can do a sign-extended load to either a 32-bit or 64-bit register, and we should only sign extend the register up to the width of the register as specified in the operation (by using the 32-bit Wn or 64-bit Xn register specifier). As it turns out, the architecture provides this decoding information in the SF ("Sixty-Four" -- how cute...) bit. Let's take advantage of this with the usual 32-bit/64-bit header file dance and do the right thing on AArch64 hosts. Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191212195055.5541-1-christoffer.dall@arm.com
2019-11-08Merge remote-tracking branch 'kvmarm/misc-5.5' into kvmarm/nextMarc Zyngier1-3/+18
2019-11-08KVM: arm64: Opportunistically turn off WFI trapping when using direct LPI ↵Marc Zyngier1-2/+7
injection Just like we do for WFE trapping, it can be useful to turn off WFI trapping when the physical CPU is not oversubscribed (that is, the vcpu is the only runnable process on this CPU) *and* that we're using direct injection of interrupts. The conditions are reevaluated on each vcpu_load(), ensuring that we don't switch to this mode on a busy system. On a GICv4 system, this has the effect of reducing the generation of doorbell interrupts to zero when the right conditions are met, which is a huge improvement over the current situation (where the doorbells are screaming if the CPU ever hits a blocking WFI). Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Link: https://lore.kernel.org/r/20191107160412.30301-3-maz@kernel.org
2019-10-28KVM: arm64: Don't set HCR_EL2.TVM when S2FWB is supportedChristoffer Dall1-1/+11
On CPUs that support S2FWB (Armv8.4+), KVM configures the stage 2 page tables to override the memory attributes of memory accesses, regardless of the stage 1 page table configurations, and also when the stage 1 MMU is turned off. This results in all memory accesses to RAM being cacheable, including during early boot of the guest. On CPUs without this feature, memory accesses were non-cacheable during boot until the guest turned on the stage 1 MMU, and we had to detect when the guest turned on the MMU, such that we could invalidate all cache entries and ensure a consistent view of memory with the MMU turned on. When the guest turned on the caches, we would call stage2_flush_vm() from kvm_toggle_cache(). However, stage2_flush_vm() walks all the stage 2 tables, and calls __kvm_flush-dcache_pte, which on a system with S2FWB does ... absolutely nothing. We can avoid that whole song and dance, and simply not set TVM when creating a VM on a system that has S2FWB. Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20191028130541.30536-1-christoffer.dall@arm.com
2019-10-21KVM: arm/arm64: Allow reporting non-ISV data aborts to userspaceChristoffer Dall1-0/+5
For a long time, if a guest accessed memory outside of a memslot using any of the load/store instructions in the architecture which doesn't supply decoding information in the ESR_EL2 (the ISV bit is not set), the kernel would print the following message and terminate the VM as a result of returning -ENOSYS to userspace: load/store instruction decoding not implemented The reason behind this message is that KVM assumes that all accesses outside a memslot is an MMIO access which should be handled by userspace, and we originally expected to eventually implement some sort of decoding of load/store instructions where the ISV bit was not set. However, it turns out that many of the instructions which don't provide decoding information on abort are not safe to use for MMIO accesses, and the remaining few that would potentially make sense to use on MMIO accesses, such as those with register writeback, are not used in practice. It also turns out that fetching an instruction from guest memory can be a pretty horrible affair, involving stopping all CPUs on SMP systems, handling multiple corner cases of address translation in software, and more. It doesn't appear likely that we'll ever implement this in the kernel. What is much more common is that a user has misconfigured his/her guest and is actually not accessing an MMIO region, but just hitting some random hole in the IPA space. In this scenario, the error message above is almost misleading and has led to a great deal of confusion over the years. It is, nevertheless, ABI to userspace, and we therefore need to introduce a new capability that userspace explicitly enables to change behavior. This patch introduces KVM_CAP_ARM_NISV_TO_USER (NISV meaning Non-ISV) which does exactly that, and introduces a new exit reason to report the event to userspace. User space can then emulate an exception to the guest, restart the guest, suspend the guest, or take any other appropriate action as per the policy of the running system. Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alexander Graf <graf@amazon.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-07-05KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_sDave Martin1-8/+8
Currently, the {read,write}_sysreg_el*() accessors for accessing particular ELs' sysregs in the presence of VHE rely on some local hacks and define their system register encodings in a way that is inconsistent with the core definitions in <asm/sysreg.h>. As a result, it is necessary to add duplicate definitions for any system register that already needs a definition in sysreg.h for other reasons. This is a bit of a maintenance headache, and the reasons for the _el*() accessors working the way they do is a bit historical. This patch gets rid of the shadow sysreg definitions in <asm/kvm_hyp.h>, converts the _el*() accessors to use the core __msr_s/__mrs_s interface, and converts all call sites to use the standard sysreg #define names (i.e., upper case, with SYS_ prefix). This patch will conflict heavily anyway, so the opportunity to clean up some bad whitespace in the context of the changes is taken. The change exposes a few system registers that have no sysreg.h definition, due to msr_s/mrs_s being used in place of msr/mrs: additions are made in order to fill in the gaps. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Link: https://www.spinics.net/lists/kvm-arm/msg31717.html [Rebased to v4.21-rc1] Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> [Rebased to v5.2-rc5, changelog updates] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-07-05KVM: arm/arm64: Add save/restore support for firmware workaround stateAndre Przywara1-0/+14
KVM implements the firmware interface for mitigating cache speculation vulnerabilities. Guests may use this interface to ensure mitigation is active. If we want to migrate such a guest to a host with a different support level for those workarounds, migration might need to fail, to ensure that critical guests don't loose their protection. Introduce a way for userland to save and restore the workarounds state. On restoring we do checks that make sure we don't downgrade our mitigation level. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234Thomas Gleixner1-12/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 503 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-24KVM: arm/arm64: Context-switch ptrauth registersMark Rutland1-0/+16
When pointer authentication is supported, a guest may wish to use it. This patch adds the necessary KVM infrastructure for this to work, with a semi-lazy context switch of the pointer auth state. Pointer authentication feature is only enabled when VHE is built in the kernel and present in the CPU implementation so only VHE code paths are modified. When we schedule a vcpu, we disable guest usage of pointer authentication instructions and accesses to the keys. While these are disabled, we avoid context-switching the keys. When we trap the guest trying to use pointer authentication functionality, we change to eagerly context-switching the keys, and enable the feature. The next time the vcpu is scheduled out/in, we start again. However the host key save is optimized and implemented inside ptrauth instruction/register access trap. Pointer authentication consists of address authentication and generic authentication, and CPUs in a system might have varied support for either. Where support for either feature is not uniform, it is hidden from guests via ID register emulation, as a result of the cpufeature framework in the host. Unfortunately, address authentication and generic authentication cannot be trapped separately, as the architecture provides a single EL2 trap covering both. If we wish to expose one without the other, we cannot prevent a (badly-written) guest from intermittently using a feature which is not uniformly supported (when scheduled on a physical CPU which supports the relevant feature). Hence, this patch expects both type of authentication to be present in a cpu. This switch of key is done from guest enter/exit assembly as preparation for the upcoming in-kernel pointer authentication support. Hence, these key switching routines are not implemented in C code as they may cause pointer authentication key signing error in some situations. Signed-off-by: Mark Rutland <mark.rutland@arm.com> [Only VHE, key switch in full assembly, vcpu_has_ptrauth checks , save host key in ptrauth exception trap] Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: kvmarm@lists.cs.columbia.edu [maz: various fixups] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-19arm64: KVM: Describe data or unified caches as having 1 set and 1 wayArd Biesheuvel1-1/+2
On SMP ARM systems, cache maintenance by set/way should only ever be done in the context of onlining or offlining CPUs, which is typically done by bare metal firmware and never in a virtual machine. For this reason, we trap set/way cache maintenance operations and replace them with conditional flushing of the entire guest address space. Due to this trapping, the set/way arguments passed into the set/way ops are completely ignored, and thus irrelevant. This also means that the set/way geometry is equally irrelevant, and we can simply report it as 1 set and 1 way, so that legacy 32-bit ARM system software (i.e., the kind that only receives odd fixes) doesn't take a performance hit due to the trapping when iterating over the cachelines. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-19arm64: KVM: Expose sanitised cache type register to guestArd Biesheuvel1-0/+3
We currently permit CPUs in the same system to deviate in the exact topology of the caches, and we subsequently hide this fact from user space by exposing a sanitised value of the cache type register CTR_EL0. However, guests running under KVM see the bare value of CTR_EL0, which could potentially result in issues with, e.g., JITs or other pieces of code that are sensitive to misreported cache line sizes. So let's start trapping cache ID instructions if there is a mismatch, and expose the sanitised version of CTR_EL0 to guests. Note that CTR_EL0 is treated as an invariant to KVM user space, so update that part as well. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-19KVM: arm/arm64: Move kvm_is_write_fault to header fileChristoffer Dall1-0/+8
Move this little function to the header files for arm/arm64 so other code can make use of it directly. Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18arm64: KVM: Consistently advance singlestep when emulating instructionsMark Rutland1-8/+27
When we emulate a guest instruction, we don't advance the hardware singlestep state machine, and thus the guest will receive a software step exception after a next instruction which is not emulated by the host. We bodge around this in an ad-hoc fashion. Sometimes we explicitly check whether userspace requested a single step, and fake a debug exception from within the kernel. Other times, we advance the HW singlestep state rely on the HW to generate the exception for us. Thus, the observed step behaviour differs for host and guest. Let's make this simpler and consistent by always advancing the HW singlestep state machine when we skip an instruction. Thus we can rely on the hardware to generate the singlestep exception for us, and never need to explicitly check for an active-pending step, nor do we need to fake a debug exception from the guest. Cc: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-09-21arm64/cpufeatures: Introduce ESR_ELx_SYS64_ISS_RT()Anshuman Khandual1-1/+1
Extracting target register from ESR.ISS encoding has already been required at multiple instances. Just make it a macro definition and replace all the existing use cases. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-08-22Merge tag 'kvmarm-for-v4.19' of ↵Paolo Bonzini1-0/+17
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm updates for 4.19 - Support for Group0 interrupts in guests - Cache management optimizations for ARMv8.4 systems - Userspace interface for RAS, allowing error retrival and injection - Fault path optimization - Emulated physical timer fixes - Random cleanups
2018-07-21arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTSDongjiu Geng1-0/+5
For the migrating VMs, user space may need to know the exception state. For example, in the machine A, KVM make an SError pending, when migrate to B, KVM also needs to pend an SError. This new IOCTL exports user-invisible states related to SError. Together with appropriate user space changes, user space can get/set the SError exception state to do migrate/snapshot/suspend. Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> [expanded documentation wording] Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-09KVM: arm/arm64: Enable adaptative WFE trappingMarc Zyngier1-0/+10
Trapping blocking WFE is extremely beneficial in situations where the system is oversubscribed, as it allows another thread to run while being blocked. In a non-oversubscribed environment, this is the complete opposite, and trapping WFE is just unnecessary overhead. Let's only enable WFE trapping if the CPU has more than a single task to run (that is, more than just the vcpu thread). Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-09arm64: KVM: Add support for Stage-2 control of memory types and cacheabilityMarc Zyngier1-0/+2
Up to ARMv8.3, the combinaison of Stage-1 and Stage-2 attributes results in the strongest attribute of the two stages. This means that the hypervisor has to perform quite a lot of cache maintenance just in case the guest has some non-cacheable mappings around. ARMv8.4 solves this problem by offering a different mode (FWB) where Stage-2 has total control over the memory attribute (this is limited to systems where both I/O and instruction fetches are coherent with the dcache). This is achieved by having a different set of memory attributes in the page tables, and a new bit set in HCR_EL2. On such a system, we can then safely sidestep any form of dcache management. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-05kvm/arm: use PSR_AA32 definitionsMark Rutland1-5/+5
Some code cares about the SPSR_ELx format for exceptions taken from AArch32 to inspect or manipulate the SPSR_ELx value, which is already in the SPSR_ELx format, and not in the AArch32 PSR format. To separate these from cases where we care about the AArch32 PSR format, migrate these cases to use the PSR_AA32_* definitions rather than COMPAT_PSR_*. There should be no functional change as a result of this patch. Note that arm64 KVM does not support a compat KVM API, and always uses the SPSR_ELx format, even for AArch32 guests. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-04KVM: arm64: Fix order of vcpu_write_sys_reg() argumentsJames Morse1-1/+1
A typo in kvm_vcpu_set_be()'s call: | vcpu_write_sys_reg(vcpu, SCTLR_EL1, sctlr) causes us to use the 32bit register value as an index into the sys_reg[] array, and sail off the end of the linear map when we try to bring up big-endian secondaries. | Unable to handle kernel paging request at virtual address ffff80098b982c00 | Mem abort info: | ESR = 0x96000045 | Exception class = DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | Data abort info: | ISV = 0, ISS = 0x00000045 | CM = 0, WnR = 1 | swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000002ea0571a | [ffff80098b982c00] pgd=00000009ffff8803, pud=0000000000000000 | Internal error: Oops: 96000045 [#1] PREEMPT SMP | Modules linked in: | CPU: 2 PID: 1561 Comm: kvm-vcpu-0 Not tainted 4.17.0-rc3-00001-ga912e2261ca6-dirty #1323 | Hardware name: ARM Juno development board (r1) (DT) | pstate: 60000005 (nZCv daif -PAN -UAO) | pc : vcpu_write_sys_reg+0x50/0x134 | lr : vcpu_write_sys_reg+0x50/0x134 | Process kvm-vcpu-0 (pid: 1561, stack limit = 0x000000006df4728b) | Call trace: | vcpu_write_sys_reg+0x50/0x134 | kvm_psci_vcpu_on+0x14c/0x150 | kvm_psci_0_2_call+0x244/0x2a4 | kvm_hvc_call_handler+0x1cc/0x258 | handle_hvc+0x20/0x3c | handle_exit+0x130/0x1ec | kvm_arch_vcpu_ioctl_run+0x340/0x614 | kvm_vcpu_ioctl+0x4d0/0x840 | do_vfs_ioctl+0xc8/0x8d0 | ksys_ioctl+0x78/0xa8 | sys_ioctl+0xc/0x18 | el0_svc_naked+0x30/0x34 | Code: 73620291 604d00b0 00201891 1ab10194 (957a33f8) |---[ end trace 4b4a4f9628596602 ]--- Fix the order of the arguments. Fixes: 8d404c4c24613 ("KVM: arm64: Rewrite system register accessors to read/write functions") CC: Christoffer Dall <cdall@cs.columbia.edu> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm64: Prepare to handle deferred save/restore of 32-bit registersChristoffer Dall1-23/+9
32-bit registers are not used by a 64-bit host kernel and can be deferred, but we need to rework the accesses to these register to access the latest values depending on whether or not guest system registers are loaded on the CPU or only reside in memory. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>