path: root/arch/arm64/kvm
2021-09-20  KVM: arm64: Fix PMU probe ordering  (Marc Zyngier; 2 files, -4/+8)

Russell reported that since 5.13, KVM's probing of the PMU has started to fail on his HW. As it turns out, there is an implicit ordering dependency between the architectural PMU probing code and KVM's own probing. If, due to probe ordering reasons, KVM probes before the PMU driver, it will fail to detect the PMU and prevent it from being advertised to guests as well as the VMM.

Obviously, this is one probing too many, and we should be able to deal with any ordering. Add a callback from the PMU code into KVM to advertise the registration of a host CPU PMU, allowing for any probing order.

Fixes: 5421db1be3b1 ("KVM: arm64: Divorce the perf code from oprofile helpers")
Reported-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/YUYRKVflRtUytzy5@shell.armlinux.org.uk
Cc: stable@vger.kernel.org
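A minimal sketch of what such a registration callback could look like; the function name, the pmuver field check and the static key are assumptions for illustration, not the literal patch:

    /*
     * Sketch: the PMU driver calls this when it registers a host PMU,
     * so KVM learns about it regardless of probe order (names assumed).
     */
    void kvm_host_pmu_init(struct arm_pmu *pmu)
    {
    	/* Only advertise a real (non-IMPDEF) PMU to guests and the VMM. */
    	if (pmu->pmuver && pmu->pmuver != ID_AA64DFR0_PMUVER_IMP_DEF)
    		static_branch_enable(&kvm_arm_pmu_available);
    }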
2021-09-20  KVM: arm64: nvhe: Fix missing FORCE for hyp-reloc.S build rule  (Zenghui Yu; 1 file, -1/+1)

Add FORCE so that if_changed can detect the command line change. We'll otherwise see a compilation warning since commit e1f86d7b4b2a ("kbuild: warn if FORCE is missing for if_changed(_dep,_rule) and filechk"):

    arch/arm64/kvm/hyp/nvhe/Makefile:58: FORCE prerequisite is missing

Cc: David Brazdil <dbrazdil@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210907052137.1059-1-yuzenghui@huawei.com
2021-09-07  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm  (Linus Torvalds; 35 files, -527/+894)

Pull KVM updates from Paolo Bonzini:

 "ARM:
  - Page ownership tracking between host EL1 and EL2
  - Rely on userspace page tables to create large stage-2 mappings
  - Fix incompatibility between pKVM and kmemleak
  - Fix the PMU reset state, and improve the performance of the virtual PMU
  - Move over to the generic KVM entry code
  - Address PSCI reset issues w.r.t. save/restore
  - Preliminary rework for the upcoming pKVM fixed feature
  - A bunch of MM cleanups
  - a vGIC fix for timer spurious interrupts
  - Various cleanups

 s390:
  - enable interpretation of specification exceptions
  - fix a vcpu_idx vs vcpu_id mixup

 x86:
  - fast (lockless) page fault support for the new MMU
  - new MMU now the default
  - increased maximum allowed VCPU count
  - allow inhibit IRQs on KVM_RUN while debugging guests
  - let Hyper-V-enabled guests run with virtualized LAPIC as long as they do not enable the Hyper-V "AutoEOI" feature
  - fixes and optimizations for the toggling of AMD AVIC (virtualized LAPIC)
  - tuning for the case when two-dimensional paging (EPT/NPT) is disabled
  - bugfixes and cleanups, especially with respect to vCPU reset and choosing a paging mode based on CR0/CR4/EFER
  - support for 5-level page table on AMD processors

 Generic:
  - MMU notifier invalidation callbacks do not take mmu_lock unless necessary
  - improved caching of LRU kvm_memory_slot
  - support for histogram statistics
  - add statistics for halt polling and remote TLB flush requests"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (210 commits)
  KVM: Drop unused kvm_dirty_gfn_invalid()
  KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted
  KVM: MMU: mark role_regs and role accessors as maybe unused
  KVM: MIPS: Remove a "set but not used" variable
  x86/kvm: Don't enable IRQ when IRQ enabled in kvm_wait
  KVM: stats: Add VM stat for remote tlb flush requests
  KVM: Remove unnecessary export of kvm_{inc,dec}_notifier_count()
  KVM: x86/mmu: Move lpage_disallowed_link further "down" in kvm_mmu_page
  KVM: x86/mmu: Relocate kvm_mmu_page.tdp_mmu_page for better cache locality
  Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()"
  KVM: x86/mmu: Remove unused field mmio_cached in struct kvm_mmu_page
  kvm: x86: Increase KVM_SOFT_MAX_VCPUS to 710
  kvm: x86: Increase MAX_VCPUS to 1024
  kvm: x86: Set KVM_MAX_VCPU_ID to 4*KVM_MAX_VCPUS
  KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulation
  KVM: x86/mmu: Don't freak out if pml5_root is NULL on 4-level host
  KVM: s390: index kvm->arch.idle_mask by vcpu_idx
  KVM: s390: Enable specification exception interpretation
  KVM: arm64: Trim guest debug exception handling
  KVM: SVM: Add 5-level page table support for SVM
  ...
2021-09-06  Merge tag 'kvmarm-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD  (Paolo Bonzini; 35 files, -527/+901)

KVM/arm64 updates for 5.15

- Page ownership tracking between host EL1 and EL2
- Rely on userspace page tables to create large stage-2 mappings
- Fix incompatibility between pKVM and kmemleak
- Fix the PMU reset state, and improve the performance of the virtual PMU
- Move over to the generic KVM entry code
- Address PSCI reset issues w.r.t. save/restore
- Preliminary rework for the upcoming pKVM fixed feature
- A bunch of MM cleanups
- a vGIC fix for timer spurious interrupts
- Various cleanups
2021-09-06  KVM: stats: Add VM stat for remote tlb flush requests  (Jing Zhang; 1 file, -0/+1)

Add a new stat that counts the number of times a remote TLB flush is requested, regardless of whether it kicks vCPUs out of guest mode. This allows us to look at how often flushes are initiated.

Unlike remote_tlb_flush, this one applies to ARM's instruction-set-based TLB flush implementation, so apply it there too.

Original-by: David Matlack <dmatlack@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210817002639.3856694-1-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
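Roughly, the counting point in the generic flush path could look like this sketch (field names assumed from the stats series described here):

    void kvm_flush_remote_tlbs(struct kvm *kvm)
    {
    	/* Count the request even if no vCPU ends up being kicked. */
    	++kvm->stat.generic.remote_tlb_flush_requests;
    	if (kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
    		++kvm->stat.generic.remote_tlb_flush;
    }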
2021-09-03  memblock: make memblock_find_in_range method private  (Mike Rapoport; 1 file, -6/+3)

There are a lot of uses of memblock_find_in_range() along with memblock_reserve() from the times memblock allocation APIs did not exist.

memblock_find_in_range() is the very core of memblock allocations, so any future changes to its internal behaviour would mandate updates of all the users outside memblock.

Replace the calls to memblock_find_in_range() with equivalent calls to memblock_phys_alloc() and memblock_phys_alloc_range(), and make memblock_find_in_range() a private method of memblock. This simplifies the callers, ensures that (unlikely) errors in memblock_reserve() are handled and improves maintainability of memblock_find_in_range().

Link: https://lkml.kernel.org/r/20210816122622.30279-1-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Acked-by: Kirill A. Shutemov <kirill.shtuemov@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [ACPI]
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Nick Kossifidis <mick@ics.forth.gr> [riscv]
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
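A caller-side before/after sketch of the conversion (size and alignment values are placeholders):

    /* Before: two-step find + reserve; a memblock_reserve() failure
     * could easily go unhandled. */
    phys_addr_t base = memblock_find_in_range(start, end, size, SZ_2M);
    if (base)
    	memblock_reserve(base, size);

    /* After: a single call that both finds and reserves the range,
     * and reports failure with a zero return. */
    phys_addr_t base = memblock_phys_alloc_range(size, SZ_2M, start, end);
    if (!base)
    	pr_err("memblock allocation of %llu bytes failed\n", (u64)size);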
2021-08-26  Merge branch kvm-arm64/misc-5.15 into kvmarm-master/next  (Marc Zyngier; 1 file, -17/+3)

* kvm-arm64/misc-5.15:
  : Misc improvements for 5.15:
  :
  : - Account the number of VMID-wide TLB invalidations as remote TLB flushes
  : - Fix comments in the VGIC code
  : - Cleanup the PMU IMPDEF identification
  : - Streamline the TGRAN2 usage
  : - Avoid advertising a 52bit IPA range for non-64KB configs
  : - Avoid spurious signalling when a HW-mapped interrupt is in the A+P state on entry, and in the P state on exit, but the physical line is not pending anymore
  : - Bunch of minor cleanups
  KVM: arm64: Trim guest debug exception handling

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-26  KVM: arm64: Trim guest debug exception handling  (Raghavendra Rao Ananta; 1 file, -17/+3)

The switch-case for handling guest debug exceptions covers all the debug exception classes but, functionally, does nothing with them other than ESR_ELx_EC_WATCHPT_LOW. Moreover, even though handled well, the 'default' case could be confusing from a security point of view, stating that the guests' actions can potentially flood the syslog. But in reality, the code is unreachable.

Hence, trim down the function to handle only the ESR_ELx_EC_WATCHPT_LOW case with a simple 'if' check.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210823223940.1878930-1-rananta@google.com
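After the trim, the handler reduces to something like this sketch (field names follow the usual arm64 KVM conventions but are assumptions here):

    /* Only watchpoints need extra state (the faulting address). */
    if (ESR_ELx_EC(esr) == ESR_ELx_EC_WATCHPT_LOW)
    	run->debug.arch.far = vcpu->arch.fault.far_el2;

    run->exit_reason = KVM_EXIT_DEBUG;
    run->debug.arch.hsr = esr;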
2021-08-20  KVM: stats: Support linear and logarithmic histogram statistics  (Jing Zhang; 1 file, -4/+0)

Add new types of KVM stats, linear and logarithmic histogram. Histograms are very useful for observing the value distribution of time- or size-related stats.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210802165633.1866976-2-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
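Bucket selection for the two histogram types could look like this sketch (helper names and the clamping policy are assumptions):

    /* Linear: bucket i covers [i * bucket_size, (i + 1) * bucket_size). */
    static void stats_linear_hist_update(u64 *bucket, size_t size,
    				     u64 value, size_t bucket_size)
    {
    	size_t index = value / bucket_size;

    	if (index >= size)	/* clamp overflow into the last bucket */
    		index = size - 1;
    	++bucket[index];
    }

    /* Logarithmic: fls64() picks bucket i for values in [2^(i-1), 2^i). */
    static void stats_log_hist_update(u64 *bucket, size_t size, u64 value)
    {
    	size_t index = fls64(value);

    	if (index >= size)
    		index = size - 1;
    	++bucket[index];
    }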
2021-08-20  Merge branch kvm-arm64/pkvm-fixed-features-prologue into kvmarm-master/next  (Marc Zyngier; 11 files, -74/+71)

* kvm-arm64/pkvm-fixed-features-prologue:
  : Rework a bunch of common infrastructure as a prologue to Fuad Tabba's protected VM fixed feature series.
  KVM: arm64: Upgrade trace_kvm_arm_set_dreg32() to 64bit
  KVM: arm64: Add config register bit definitions
  KVM: arm64: Add feature register flag definitions
  KVM: arm64: Track value of cptr_el2 in struct kvm_vcpu_arch
  KVM: arm64: Keep mdcr_el2's value as set by __init_el2_debug
  KVM: arm64: Restore mdcr_el2 from vcpu
  KVM: arm64: Refactor sys_regs.h,c for nVHE reuse
  KVM: arm64: Fix names of config register fields
  KVM: arm64: MDCR_EL2 is a 64-bit register
  KVM: arm64: Remove trailing whitespace in comment
  KVM: arm64: placeholder to check if VM is protected

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20  Merge branch kvm-arm64/mmu/vmid-cleanups into kvmarm-master/next  (Marc Zyngier; 8 files, -14/+16)

* kvm-arm64/mmu/vmid-cleanups:
  : Cleanup the stage-2 configuration by providing a single helper, and tidy up some of the ordering requirements for the VMID allocator.
  KVM: arm64: Upgrade VMID accesses to {READ,WRITE}_ONCE
  KVM: arm64: Unify stage-2 programming behind __load_stage2()
  KVM: arm64: Move kern_hyp_va() usage in __load_guest_stage2() into the callers

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20  Merge branch kvm-arm64/generic-entry into kvmarm-master/next  (Marc Zyngier; 3 files, -27/+46)

Switch KVM/arm64 to the generic entry code, courtesy of Oliver Upton

* kvm-arm64/generic-entry:
  KVM: arm64: Use generic KVM xfer to guest work function
  entry: KVM: Allow use of generic KVM entry w/o full generic support
  KVM: arm64: Record number of signal exits as a vCPU stat

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20  Merge branch kvm-arm64/psci/cpu_on into kvmarm-master/next  (Marc Zyngier; 3 files, -9/+30)

PSCI fixes from Oliver Upton:

- Plug race on reset
- Ensure that a pending reset is applied before userspace accesses
- Reject PSCI requests with illegal affinity bits

* kvm-arm64/psci/cpu_on:
  selftests: KVM: Introduce psci_cpu_on_test
  KVM: arm64: Enforce reserved bits for PSCI target affinities
  KVM: arm64: Handle PSCI resets before userspace touches vCPU state
  KVM: arm64: Fix read-side race on updates to vcpu reset state

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20  Merge branch kvm-arm64/mmu/el2-tracking into kvmarm-master/next  (Marc Zyngier; 13 files, -256/+513)

* kvm-arm64/mmu/el2-tracking: (25 commits)
  : Enable tracking of page sharing between host EL1 and EL2
  KVM: arm64: Minor optimization of range_is_memory
  KVM: arm64: Make hyp_panic() more robust when protected mode is enabled
  KVM: arm64: Return -EPERM from __pkvm_host_share_hyp()
  KVM: arm64: Make __pkvm_create_mappings static
  KVM: arm64: Restrict EL2 stage-1 changes in protected mode
  KVM: arm64: Refactor protected nVHE stage-1 locking
  KVM: arm64: Remove __pkvm_mark_hyp
  KVM: arm64: Mark host bss and rodata section as shared
  KVM: arm64: Enable retrieving protections attributes of PTEs
  KVM: arm64: Introduce addr_is_memory()
  KVM: arm64: Expose pkvm_hyp_id
  KVM: arm64: Expose host stage-2 manipulation helpers
  KVM: arm64: Add helpers to tag shared pages in SW bits
  KVM: arm64: Allow populating software bits
  KVM: arm64: Enable forcing page-level stage-2 mappings
  KVM: arm64: Tolerate re-creating hyp mappings to set software bits
  KVM: arm64: Don't overwrite software bits with owner id
  KVM: arm64: Rename KVM_PTE_LEAF_ATTR_S2_IGNORED
  KVM: arm64: Optimize host memory aborts
  KVM: arm64: Expose page-table helpers
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20  Merge branch kvm-arm64/mmu/kmemleak-pkvm into kvmarm-master/next  (Marc Zyngier; 1 file, -0/+7)

Prevent kmemleak from peeking into the HYP data, which is fatal in protected mode.

* kvm-arm64/mmu/kmemleak-pkvm:
  KVM: arm64: Unregister HYP sections from kmemleak in protected mode
  arm64: Move .hyp.rodata outside of the _sdata.._edata range

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20  Merge branch kvm-arm64/misc-5.15 into kvmarm-master/next  (Marc Zyngier; 11 files, -112/+72)

* kvm-arm64/misc-5.15:
  : Misc improvements for 5.15:
  :
  : - Account the number of VMID-wide TLB invalidations as remote TLB flushes
  : - Fix comments in the VGIC code
  : - Cleanup the PMU IMPDEF identification
  : - Streamline the TGRAN2 usage
  : - Avoid advertising a 52bit IPA range for non-64KB configs
  : - Avoid spurious signalling when a HW-mapped interrupt is in the A+P state on entry, and in the P state on exit, but the physical line is not pending anymore
  : - Bunch of minor cleanups
  KVM: arm64: vgic: Resample HW pending state on deactivation
  KVM: arm64: vgic: Drop WARN from vgic_get_irq
  KVM: arm64: Drop unused REQUIRES_VIRT
  KVM: arm64: Drop check_kvm_target_cpu() based percpu probe
  KVM: arm64: Drop init_common_resources()
  KVM: arm64: Use ARM64_MIN_PARANGE_BITS as the minimum supported IPA
  arm64/mm: Add remaining ID_AA64MMFR0_PARANGE_ macros
  KVM: arm64: Restrict IPA size to maximum 48 bits on 4K and 16K page size
  arm64/mm: Define ID_AA64MMFR0_TGRAN_2_SHIFT
  KVM: arm64: perf: Replace '0xf' instances with ID_AA64DFR0_PMUVER_IMP_DEF
  KVM: arm64: Fix comments related to GICv2 PMR reporting
  KVM: arm64: Count VMID-wide TLB invalidations
  arm64/kexec: Test page size support with new TGRAN range values

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20  Merge branch kvm-arm64/mmu/mapping-levels into kvmarm-master/next  (Marc Zyngier; 2 files, -7/+77)

Revamp the KVM/arm64 THP code by parsing the userspace page tables instead of relying on an infrastructure that is about to disappear (we are the last user).

* kvm-arm64/mmu/mapping-levels:
  KVM: Get rid of kvm_get_pfn()
  KVM: arm64: Use get_page() instead of kvm_get_pfn()
  KVM: Remove kvm_is_transparent_hugepage() and PageTransCompoundMap()
  KVM: arm64: Avoid mapping size adjustment on permission fault
  KVM: arm64: Walk userspace page tables to compute the THP mapping size
  KVM: arm64: Introduce helper to retrieve a PTE and its level

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20  KVM: arm64: Minor optimization of range_is_memory  (David Brazdil; 1 file, -5/+8)

Currently range_is_memory finds the corresponding struct memblock_region for both the lower and upper bounds of the given address range with two rounds of binary search, and then checks that the two memblocks are the same. Simplify this by only doing binary search on the lower bound and then checking that the upper bound is in the same memblock.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Reviewed-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210728153232.1018911-3-dbrazdil@google.com
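A sketch of the simplified shape; the helper names are assumptions standing in for the actual lookup and containment checks:

    /* One binary search for 'start', then verify 'end' lands in the
     * same memblock region instead of searching a second time. */
    static bool range_is_memory(u64 start, u64 end)
    {
    	struct memblock_region *reg = find_mem_region(start);	/* bsearch */

    	if (!reg || !addr_is_in_region(start, reg))
    		return false;
    	return addr_is_in_region(end - 1, reg);
    }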
2021-08-20  Merge tag 'kvmarm-fixes-5.14-2' into kvm-arm64/mmu/el2-tracking  (Marc Zyngier; 2 files, -5/+9)

KVM/arm64 fixes for 5.14, take #2

- Plug race between enabling MTE and creating vcpus
- Fix off-by-one bug when checking whether an address range is RAM

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20  KVM: arm64: Upgrade trace_kvm_arm_set_dreg32() to 64bit  (Marc Zyngier; 1 file, -3/+7)

A number of registers passed to trace_kvm_arm_set_dreg32() are actually 64bit. Upgrade the tracepoint to take a 64bit value, despite the name...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20  KVM: arm64: Track value of cptr_el2 in struct kvm_vcpu_arch  (Fuad Tabba; 2 files, -1/+2)

Track the baseline guest value for cptr_el2 in struct kvm_vcpu_arch, similar to the other registers that control traps. Use this value when setting cptr_el2 for the guest.

Currently this value is unchanged (CPTR_EL2_DEFAULT), but future patches will set trapping bits based on features supported for the guest.

No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-9-tabba@google.com
2021-08-20  KVM: arm64: Keep mdcr_el2's value as set by __init_el2_debug  (Fuad Tabba; 2 files, -8/+0)

__init_el2_debug configures mdcr_el2 at initialization based on, among other things, available hardware support. Trap deactivation doesn't check that, so keep the initial value.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-8-tabba@google.com
2021-08-20  KVM: arm64: Restore mdcr_el2 from vcpu  (Fuad Tabba; 4 files, -19/+16)

On deactivating traps, restore the value of mdcr_el2 from the host value newly created and preserved in the vcpu context, rather than directly reading the hardware register. Up until and including this patch the two values are the same, i.e., the hardware register and the vcpu one.

A future patch will be changing the value of mdcr_el2 on activating traps, and this ensures that its value will be restored.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-7-tabba@google.com
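A sketch of the save/restore pairing this sets up (the mdcr_el2_host field name is an assumption):

    static void __activate_traps_common(struct kvm_vcpu *vcpu)
    {
    	/* Preserve the host value before installing the guest's. */
    	vcpu->arch.mdcr_el2_host = read_sysreg(mdcr_el2);
    	write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2);
    }

    static void __deactivate_traps_common(struct kvm_vcpu *vcpu)
    {
    	/* Restore the preserved copy instead of re-reading hardware. */
    	write_sysreg(vcpu->arch.mdcr_el2_host, mdcr_el2);
    }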
2021-08-20  KVM: arm64: Refactor sys_regs.h,c for nVHE reuse  (Fuad Tabba; 2 files, -44/+47)

Refactor sys_regs.h and sys_regs.c to make it easier to reuse common code. It will be used in nVHE in a later patch.

Note that the refactored code uses __inline_bsearch for find_reg instead of bsearch to avoid copying the bsearch code for nVHE.

No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-6-tabba@google.com
2021-08-20  KVM: arm64: MDCR_EL2 is a 64-bit register  (Fuad Tabba; 3 files, -3/+3)

Fix the places in KVM that treat MDCR_EL2 as a 32-bit register. More recent features (e.g., FEAT_SPEv1p2) use bits above 31.

No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-4-tabba@google.com
2021-08-20  KVM: arm64: Remove trailing whitespace in comment  (Fuad Tabba; 1 file, -2/+2)

Remove trailing whitespace from comment in trap_dbgauthstatus_el1().

No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-3-tabba@google.com
2021-08-20  KVM: arm64: Upgrade VMID accesses to {READ,WRITE}_ONCE  (Marc Zyngier; 3 files, -4/+4)

Since TLB invalidation can run in parallel with VMID allocation, we need to be careful and avoid any sort of load/store tearing. Use {READ,WRITE}_ONCE consistently to avoid any surprise.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jade Alglave <jade.alglave@arm.com>
Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20210806113109.2475-6-will@kernel.org
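In outline, the annotated accesses look like this sketch (the exact field path is an assumption):

    /* Writer, in the VMID allocator: the compiler may no longer tear
     * or fuse this store. */
    WRITE_ONCE(kvm->arch.mmu.vmid.vmid, new_vmid);

    /* Lockless reader, e.g. while programming stage-2 or invalidating
     * TLBs concurrently with allocation. */
    u64 vmid = READ_ONCE(kvm->arch.mmu.vmid.vmid);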
2021-08-20  KVM: arm64: Unify stage-2 programming behind __load_stage2()  (Marc Zyngier; 6 files, -10/+10)

The protected mode relies on a separate helper to load the S2 context. Move over to the __load_guest_stage2() helper instead, and rename it to __load_stage2() to present a unified interface.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jade Alglave <jade.alglave@arm.com>
Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210806113109.2475-5-will@kernel.org
2021-08-20  KVM: arm64: Move kern_hyp_va() usage in __load_guest_stage2() into the callers  (Marc Zyngier; 4 files, -4/+6)

It is a bit awkward to use kern_hyp_va() in __load_guest_stage2(), especially as the helper is shared between VHE and nVHE. Move the use of kern_hyp_va() into the nVHE code instead, and pass a pointer to the kvm->arch structure. Although this may look a bit awkward, it allows for some further simplification.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jade Alglave <jade.alglave@arm.com>
Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210806113109.2475-4-will@kernel.org
2021-08-20  KVM: arm64: vgic: Resample HW pending state on deactivation  (Marc Zyngier; 4 files, -62/+50)

When a mapped level interrupt (a timer, for example) is deactivated by the guest, the corresponding host interrupt is equally deactivated. However, the fate of the pending state still needs to be dealt with in SW.

This is especially true when the interrupt was in the active+pending state in the virtual distributor at the point where the guest was entered. On exit, the pending state is potentially stale (the guest may have put the interrupt in a non-pending state).

If we don't do anything, the interrupt will be spuriously injected in the guest. Although this shouldn't have any ill effect (spurious interrupts are always possible), we can improve the emulation by detecting the deactivation-while-pending case and resampling the interrupt.

While we're at it, move the logic into a common helper that can be shared between the two GIC implementations.

Fixes: e40cc57bac79 ("KVM: arm/arm64: vgic: Support level-triggered mapped interrupts")
Reported-by: Raghavendra Rao Ananta <rananta@google.com>
Tested-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210819180305.1670525-1-maz@kernel.org
2021-08-19  KVM: arm64: vgic: Drop WARN from vgic_get_irq  (Ricardo Koller; 1 file, -1/+0)

vgic_get_irq(intid) is used all over the vgic code in order to get a reference to a struct irq. It warns whenever intid is not a valid number (like when it's a reserved IRQ number). The issue is that this warning can be triggered from userspace (e.g., KVM_IRQ_LINE for intid 1020).

Drop the WARN call from vgic_get_irq.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210818213205.598471-1-ricarkol@google.com
2021-08-19  KVM: arm64: Use generic KVM xfer to guest work function  (Oliver Upton; 2 files, -28/+45)

Clean up handling of checks for pending work by switching to the generic infrastructure to do so.

We pick up handling for TIF_NOTIFY_RESUME from this switch, meaning that task work will be correctly handled.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210802192809.1851010-4-oupton@google.com
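A run-loop sketch of the generic-entry usage, assuming the entry-kvm helpers of this era take the vcpu as an argument:

    /* Replaces open-coded signal/need_resched checks; also services
     * TIF_NOTIFY_RESUME task work before entering the guest. */
    if (xfer_to_guest_mode_work_pending()) {
    	ret = xfer_to_guest_mode_handle_work(vcpu);
    	if (ret)
    		return ret;	/* e.g. -EINTR for a pending signal */
    }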
2021-08-19  KVM: arm64: Record number of signal exits as a vCPU stat  (Oliver Upton; 2 files, -0/+2)

Most other architectures that implement KVM record a statistic indicating the number of times a vCPU has exited due to a pending signal. Add support for that stat to arm64.

Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210802192809.1851010-2-oupton@google.com
2021-08-19  KVM: arm64: Enforce reserved bits for PSCI target affinities  (Oliver Upton; 1 file, -3/+12)

According to the PSCI specification, ARM DEN 0022D, 5.1.4 "CPU_ON", the CPU_ON function takes a target_cpu argument that is bit-compatible with the affinity fields in MPIDR_EL1. All other bits in the argument are RES0. Note that the same constraints apply to the target_affinity argument for the AFFINITY_INFO call.

Enforce the spec by returning INVALID_PARAMS if a guest incorrectly sets a RES0 bit.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210818202133.1106786-4-oupton@google.com
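A sketch of the check (the helper name is an assumption; MPIDR_HWID_BITMASK covers exactly the MPIDR_EL1 affinity fields):

    /* Any bit outside the MPIDR_EL1 affinity fields is RES0. */
    static bool kvm_psci_valid_affinity(unsigned long affinity)
    {
    	return !(affinity & ~MPIDR_HWID_BITMASK);
    }

    /* In the CPU_ON / AFFINITY_INFO handlers: */
    if (!kvm_psci_valid_affinity(target_affinity))
    	return PSCI_RET_INVALID_PARAMS;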
2021-08-19  KVM: arm64: Handle PSCI resets before userspace touches vCPU state  (Oliver Upton; 1 file, -0/+8)

The CPU_ON PSCI call takes a payload that KVM uses to configure a destination vCPU to run. This payload is non-architectural state and not exposed through any existing UAPI. Effectively, we have a race between CPU_ON and userspace saving/restoring a guest: if the target vCPU isn't run again before the VMM saves its state, the requested PC and context ID are lost. When restored, the target vCPU will be runnable and start executing at its old PC.

We can avoid this race by making sure the reset payload is serviced before userspace can access a vCPU's state.

Fixes: 358b28f09f0a ("arm/arm64: KVM: Allow a VCPU to fully reset itself")
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210818202133.1106786-3-oupton@google.com
2021-08-19  KVM: arm64: Fix read-side race on updates to vcpu reset state  (Oliver Upton; 1 file, -6/+10)

KVM correctly serializes writes to a vCPU's reset state; however, since we do not take the KVM lock on the read side, it is entirely possible to read state from two different reset requests.

Cure the race for now by taking the KVM lock when reading the reset_state structure.

Fixes: 358b28f09f0a ("arm/arm64: KVM: Allow a VCPU to fully reset itself")
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210818202133.1106786-2-oupton@google.com
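The read side could take the shape of this sketch, snapshotting under the same lock the writers hold so a reader cannot mix two requests (exact lock and field names assumed):

    /* In the vCPU reset path: copy the request atomically w.r.t. writers. */
    mutex_lock(&vcpu->kvm->lock);
    reset_state = vcpu->arch.reset_state;
    WRITE_ONCE(vcpu->arch.reset_state.reset, false);
    mutex_unlock(&vcpu->kvm->lock);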
2021-08-18  KVM: arm64: Make hyp_panic() more robust when protected mode is enabled  (Will Deacon; 2 files, -13/+31)

When protected mode is enabled, the host is unable to access most parts of the EL2 hypervisor image, including 'hyp_physvirt_offset' and the contents of the hypervisor's '.rodata.str' section. Unfortunately, nvhe_hyp_panic_handler() tries to read from both of these locations when handling a BUG() triggered at EL2; the former for converting the ELR to a physical address and the latter for displaying the name of the source file where the BUG() occurred.

Hack the EL2 panic asm to pass both physical and virtual ELR values to the host and utilise the newly introduced CONFIG_NVHE_EL2_DEBUG so that we disable stage-2 protection for the host before returning to the EL1 panic handler. If the debug option is not enabled, display the address instead of the source file:line information.

Cc: Andrew Scull <ascull@google.com>
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210813130336.8139-1-will@kernel.org
2021-08-18  KVM: arm64: Drop unused REQUIRES_VIRT  (Anshuman Khandual; 1 file, -4/+0)

This seems like a residue from the past. REQUIRES_VIRT is no longer available, so it can just be dropped along with the related code section.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1628744994-16623-6-git-send-email-anshuman.khandual@arm.com
2021-08-18  KVM: arm64: Drop check_kvm_target_cpu() based percpu probe  (Anshuman Khandual; 2 files, -17/+3)

kvm_target_cpu() never returns a negative error code, so check_kvm_target_cpu() would never have 'ret' filled with a negative error code. Hence the percpu probe via check_kvm_target_cpu() does not make sense, as it is never going to find an unsupported CPU and force kvm_arch_init() to exit early. Let's just drop this percpu probe (and check_kvm_target_cpu()) altogether.

While here, also change the kvm_target_cpu() return type to a u32, making it explicit that an error code will not be returned from this function.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-kernel@vger.kernel.org
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1628744994-16623-5-git-send-email-anshuman.khandual@arm.com
2021-08-18  KVM: arm64: Drop init_common_resources()  (Anshuman Khandual; 1 file, -6/+1)

We could do without this additional indirection via init_common_resources() by just calling kvm_set_ipa_limit() directly instead.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1628744994-16623-4-git-send-email-anshuman.khandual@arm.com
2021-08-18  KVM: arm64: Use ARM64_MIN_PARANGE_BITS as the minimum supported IPA  (Anshuman Khandual; 1 file, -1/+1)

Drop the hard-coded value for the minimum supported IPA range bits (i.e. 32). Instead use ARM64_MIN_PARANGE_BITS, which improves code readability.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1628744994-16623-3-git-send-email-anshuman.khandual@arm.com
2021-08-11  KVM: arm64: Return -EPERM from __pkvm_host_share_hyp()  (Quentin Perret; 1 file, -1/+1)

Fix the error code returned by __pkvm_host_share_hyp() when the host attempts to share with EL2 a page that has already been shared with another entity.

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210811173630.2536721-1-qperret@google.com
2021-08-11  KVM: arm64: Restrict IPA size to maximum 48 bits on 4K and 16K page size  (Anshuman Khandual; 1 file, -0/+8)

Even though ID_AA64MMFR0.PARANGE reports 52-bit PA size support, it cannot be enabled as the guest IPA size on 4K or 16K page size configurations. Hence kvm_ipa_limit must be restricted to 48 bits. This change achieves the required IPA capping.

Before commit c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA size based on host configuration"), the problem here would have been just latent via PHYS_MASK_SHIFT (which earlier in turn capped kvm_ipa_limit), which remains capped at 48 bits on 4K and 16K configs.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-kernel@vger.kernel.org
Fixes: c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA size based on host configuration")
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1628680275-16578-1-git-send-email-anshuman.khandual@arm.com
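The capping amounts to something like this sketch inside kvm_set_ipa_limit() (placement assumed):

    /* 52-bit PAs are only usable at stage-2 with 64K pages, so cap the
     * reported PARANGE on 4K and 16K configurations. */
    if (PAGE_SIZE != SZ_64K)
    	parange = min(parange, (unsigned int)ID_AA64MMFR0_PARANGE_48);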
2021-08-11  arm64/mm: Define ID_AA64MMFR0_TGRAN_2_SHIFT  (Anshuman Khandual; 1 file, -15/+2)

Streamline the Stage-2 TGRAN value extraction from the ID_AA64MMFR0 register by adding a page size agnostic ID_AA64MMFR0_TGRAN_2_SHIFT. This is similar to the existing Stage-1 TGRAN shift, i.e. ID_AA64MMFR0_TGRAN_SHIFT.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1628569782-30213-1-git-send-email-anshuman.khandual@arm.com
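A sketch of the page-size-agnostic shift, selected at build time from the per-granule definitions:

    /* One name for the stage-2 granule field, whatever the page size,
     * mirroring the stage-1 ID_AA64MMFR0_TGRAN_SHIFT. */
    #ifdef CONFIG_ARM64_4K_PAGES
    #define ID_AA64MMFR0_TGRAN_2_SHIFT	ID_AA64MMFR0_TGRAN4_2_SHIFT
    #elif defined(CONFIG_ARM64_16K_PAGES)
    #define ID_AA64MMFR0_TGRAN_2_SHIFT	ID_AA64MMFR0_TGRAN16_2_SHIFT
    #elif defined(CONFIG_ARM64_64K_PAGES)
    #define ID_AA64MMFR0_TGRAN_2_SHIFT	ID_AA64MMFR0_TGRAN64_2_SHIFT
    #endif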
2021-08-11  KVM: arm64: perf: Replace '0xf' instances with ID_AA64DFR0_PMUVER_IMP_DEF  (Anshuman Khandual; 2 files, -4/+4)

ID_AA64DFR0_PMUVER_IMP_DEF, which indicates an implementation-defined PMU, never actually gets used, although there are '0xf' instances scattered all around. Just do the macro replacement to improve readability.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: linux-perf-users@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-11  KVM: arm64: Make __pkvm_create_mappings static  (Quentin Perret; 2 files, -4/+2)

The __pkvm_create_mappings() function is no longer used outside of nvhe/mm.c, make it static.

Signed-off-by: Quentin Perret <qperret@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210809152448.1810400-22-qperret@google.com
2021-08-11  KVM: arm64: Restrict EL2 stage-1 changes in protected mode  (Quentin Perret; 4 files, -11/+117)

The host kernel is currently able to change EL2 stage-1 mappings without restrictions thanks to the __pkvm_create_mappings() hypercall. But in a world where the host is no longer part of the TCB, this clearly poses a problem.

To fix this, introduce a new hypercall to allow the host to share a physical memory page with the hypervisor, and remove the __pkvm_create_mappings() variant. The new hypercall implements ownership and permission checks before allowing the sharing operation, and it annotates the shared page in the hypervisor stage-1 and host stage-2 page-tables.

Signed-off-by: Quentin Perret <qperret@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210809152448.1810400-21-qperret@google.com
2021-08-11  KVM: arm64: Refactor protected nVHE stage-1 locking  (Quentin Perret; 2 files, -2/+17)

Refactor the hypervisor stage-1 locking in nVHE protected mode to expose a new pkvm_create_mappings_locked() function. This will be used in later patches to allow walking and changing the hypervisor stage-1 without releasing the lock.

Signed-off-by: Quentin Perret <qperret@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210809152448.1810400-20-qperret@google.com
2021-08-11  KVM: arm64: Remove __pkvm_mark_hyp  (Quentin Perret; 4 files, -75/+0)

Now that we mark memory owned by the hypervisor in the host stage-2 during __pkvm_init(), we no longer need to rely on the host to explicitly mark the hyp sections later on. Remove the __pkvm_mark_hyp() hypercall altogether.

Signed-off-by: Quentin Perret <qperret@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210809152448.1810400-19-qperret@google.com
2021-08-11  KVM: arm64: Mark host bss and rodata section as shared  (Quentin Perret; 1 file, -8/+74)

As the hypervisor maps the host's .bss and .rodata sections in its stage-1, make sure to tag them as shared in hyp and host page-tables.

But since the hypervisor relies on the presence of these mappings, we cannot leave the host in complete control of the memory regions; it must not unshare or donate them to another entity, for example. To prevent this, let's transfer the ownership of those ranges to the hypervisor itself, and share the pages back with the host.

Signed-off-by: Quentin Perret <qperret@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210809152448.1810400-18-qperret@google.com