summaryrefslogtreecommitdiffstats
path: root/arch/arm64/include/asm/kvm_host.h
AgeCommit message (Collapse)AuthorFilesLines
2022-12-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-2/+74
Pull kvm updates from Paolo Bonzini: "ARM64: - Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. - Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a97d: "Fix a number of issues with MTE, such as races on the tags being initialised vs the PG_mte_tagged flag as well as the lack of support for VM_SHARED when KVM is involved. Patches from Catalin Marinas and Peter Collingbourne"). - Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. - Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. - Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. - Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. s390: - Second batch of the lazy destroy patches - First batch of KVM changes for kernel virtual != physical address support - Removal of a unused function x86: - Allow compiling out SMM support - Cleanup and documentation of SMM state save area format - Preserve interrupt shadow in SMM state save area - Respond to generic signals during slow page faults - Fixes and optimizations for the non-executable huge page errata fix. - Reprogram all performance counters on PMU filter change - Cleanups to Hyper-V emulation and tests - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest running on top of a L1 Hyper-V hypervisor) - Advertise several new Intel features - x86 Xen-for-KVM: - Allow the Xen runstate information to cross a page boundary - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured - Add support for 32-bit guests in SCHEDOP_poll - Notable x86 fixes and cleanups: - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. - Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. - Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. - Advertise (on AMD) that the SMM_CTL MSR is not supported - Remove unnecessary exports Generic: - Support for responding to signals during page faults; introduces new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks Selftests: - Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. - Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. - Introduce actual atomics for clear/set_bit() in selftests - Add support for pinning vCPUs in dirty_log_perf_test. - Rename the so called "perf_util" framework to "memstress". - Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the memstress tests. - Add a common ucall implementation; code dedup and pre-work for running SEV (and beyond) guests in selftests. - Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). - A bunch of added/enabled/fixed selftests for ARM64, covering memslots, breakpoints, stage-2 faults and access tracking. - x86-specific selftest changes: - Clean up x86's page table management. - Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. - Clean up the nEPT support checks. - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values. - Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). Documentation: - Remove deleted ioctls from documentation - Clean up the docs for the x86 MSR filter. - Various fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits) KVM: x86: Add proper ReST tables for userspace MSR exits/flags KVM: selftests: Allocate ucall pool from MEM_REGION_DATA KVM: arm64: selftests: Align VA space allocator with TTBR0 KVM: arm64: Fix benign bug with incorrect use of VA_BITS KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow KVM: x86: Advertise that the SMM_CTL MSR is not supported KVM: x86: remove unnecessary exports KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic" tools: KVM: selftests: Convert clear/set_bit() to actual atomics tools: Drop "atomic_" prefix from atomic test_and_set_bit() tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests perf tools: Use dedicated non-atomic clear/set bit helpers tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers KVM: arm64: selftests: Enable single-step without a "full" ucall() KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself KVM: Remove stale comment about KVM_REQ_UNHALT KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR KVM: Reference to kvm_userspace_memory_region in doc and comments KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl ...
2022-12-05Merge branch kvm-arm64/pmu-unchained into kvmarm-master/nextMarc Zyngier1-0/+4
* kvm-arm64/pmu-unchained: : . : PMUv3 fixes and improvements: : : - Make the CHAIN event handling strictly follow the architecture : : - Add support for PMUv3p5 (64bit counters all the way) : : - Various fixes and cleanups : . KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run KVM: arm64: PMU: Simplify PMCR_EL0 reset handling KVM: arm64: PMU: Replace version number '0' with ID_AA64DFR0_EL1_PMUVer_NI KVM: arm64: PMU: Make kvm_pmc the main data structure KVM: arm64: PMU: Simplify vcpu computation on perf overflow notification KVM: arm64: PMU: Allow PMUv3p5 to be exposed to the guest KVM: arm64: PMU: Implement PMUv3p5 long counter support KVM: arm64: PMU: Allow ID_DFR0_EL1.PerfMon to be set from userspace KVM: arm64: PMU: Allow ID_AA64DFR0_EL1.PMUver to be set from userspace KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation KVM: arm64: PMU: Do not let AArch32 change the counters' top 32 bits KVM: arm64: PMU: Simplify setting a counter to a specific value KVM: arm64: PMU: Add counter_index_to_*reg() helpers KVM: arm64: PMU: Only narrow counters that are not 64bit wide KVM: arm64: PMU: Narrow the overflow checking when required KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow KVM: arm64: PMU: Always advertise the CHAIN event KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode arm64: Add ID_DFR0_EL1.PerfMon values for PMUv3p7 and IMP_DEF Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-11-29arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVEMark Brown1-1/+11
When we save the state for the floating point registers this can be done in the form visible through either the FPSIMD V registers or the SVE Z and P registers. At present we track which format is currently used based on TIF_SVE and the SME streaming mode state but particularly in the SVE case this limits our options for optimising things, especially around syscalls. Introduce a new enum which we place together with saved floating point state in both thread_struct and the KVM guest state which explicitly states which format is active and keep it up to date when we change it. At present we do not use this state except to verify that it has the expected value when loading the state, future patches will introduce functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221115094640.112848-3-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-11-19KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creationMarc Zyngier1-0/+4
As further patches will enable the selection of a PMU revision from userspace, sample the supported PMU revision at VM creation time, rather than building each time the ID_AA64DFR0_EL1 register is accessed. This shouldn't result in any change in behaviour. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221113163832.3154370-11-maz@kernel.org
2022-11-11KVM: arm64: Return guest memory from EL2 via dedicated teardown memcacheQuentin Perret1-6/+1
Rather than relying on the host to free the previously-donated pKVM hypervisor VM pages explicitly on teardown, introduce a dedicated teardown memcache which allows the host to reclaim guest memory resources without having to keep track of all of the allocations made by the pKVM hypervisor at EL2. Tested-by: Vincent Donnefort <vdonnefort@google.com> Co-developed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> [maz: dropped __maybe_unused from unmap_donated_memory_noclear()] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-21-will@kernel.org
2022-11-11KVM: arm64: Consolidate stage-2 initialisation into a single functionQuentin Perret1-2/+0
The initialisation of guest stage-2 page-tables is currently split across two functions: kvm_init_stage2_mmu() and kvm_arm_setup_stage2(). That is presumably for historical reasons as kvm_arm_setup_stage2() originates from the (now defunct) KVM port for 32-bit Arm. Simplify this code path by merging both functions into one, taking care to map the 'struct kvm' into the hypervisor stage-1 early on in order to simplify the failure path. Tested-by: Vincent Donnefort <vdonnefort@google.com> Co-developed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-19-will@kernel.org
2022-11-11KVM: arm64: Add generic hyp_memcache helpersQuentin Perret1-0/+57
The host at EL1 and the pKVM hypervisor at EL2 will soon need to exchange memory pages dynamically for creating and destroying VM state. Indeed, the hypervisor will rely on the host to donate memory pages it can use to create guest stage-2 page-tables and to store VM and vCPU metadata. In order to ease this process, introduce a 'struct hyp_memcache' which is essentially a linked list of available pages, indexed by physical addresses so that it can be passed meaningfully between the different virtual address spaces configured at EL1 and EL2. Tested-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-18-will@kernel.org
2022-11-11KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1Fuad Tabba1-2/+12
With the pKVM hypervisor at EL2 now offering hypercalls to the host for creating and destroying VM and vCPU structures, plumb these in to the existing arm64 KVM backend to ensure that the hypervisor data structures are allocated and initialised on first vCPU run for a pKVM guest. In the host, 'struct kvm_protected_vm' is introduced to hold the handle of the pKVM VM instance as well as to track references to the memory donated to the hypervisor so that it can be freed back to the host allocator following VM teardown. The stage-2 page-table, hypervisor VM and vCPU structures are allocated separately so as to avoid the need for a large physically-contiguous allocation in the host at run-time. Tested-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-14-will@kernel.org
2022-11-11KVM: arm64: Add infrastructure to create and track pKVM instances at EL2Fuad Tabba1-0/+8
Introduce a global table (and lock) to track pKVM instances at EL2, and provide hypercalls that can be used by the untrusted host to create and destroy pKVM VMs and their vCPUs. pKVM VM/vCPU state is directly accessible only by the trusted hypervisor (EL2). Each pKVM VM is directly associated with an untrusted host KVM instance, and is referenced by the host using an opaque handle. Future patches will provide hypercalls to allow the host to initialize/set/get pKVM VM/vCPU state using the opaque handle. Tested-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Co-developed-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> [maz: silence warning on unmap_donated_memory_noclear()] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-13-will@kernel.org
2022-09-19KVM: arm64: Clear PSTATE.SS when the Software Step state was Active-pendingReiji Watanabe1-0/+3
While userspace enables single-step, if the Software Step state at the last guest exit was "Active-pending", clear PSTATE.SS on guest entry to restore the state. Currently, KVM sets PSTATE.SS to 1 on every guest entry while userspace enables single-step for the vCPU (with KVM_GUESTDBG_SINGLESTEP). It means KVM always makes the vCPU's Software Step state "Active-not-pending" on the guest entry, which lets the VCPU perform single-step (then Software Step exception is taken). This could cause extra single-step (without returning to userspace) if the Software Step state at the last guest exit was "Active-pending" (i.e. the last exit was triggered by an asynchronous exception after the single-step is performed, but before the Software Step exception is taken. See "Figure D2-3 Software step state machine" and "D2.12.7 Behavior in the active-pending state" in ARM DDI 0487I.a for more info about this behavior). Fix this by clearing PSTATE.SS on guest entry if the Software Step state at the last exit was "Active-pending" so that KVM restore the state (and the exception is taken before further single-step is performed). Fixes: 337b99bf7edf ("KVM: arm64: guest debug, add support for single-step") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220917010600.532642-3-reijiw@google.com
2022-09-19KVM: arm64: Preserve PSTATE.SS for the guest while single-step is enabledReiji Watanabe1-0/+1
Preserve the PSTATE.SS value for the guest while userspace enables single-step (i.e. while KVM manipulates the PSTATE.SS) for the vCPU. Currently, while userspace enables single-step for the vCPU (with KVM_GUESTDBG_SINGLESTEP), KVM sets PSTATE.SS to 1 on every guest entry, not saving its original value. When userspace disables single-step, KVM doesn't restore the original value for the subsequent guest entry (use the current value instead). Exception return instructions copy PSTATE.SS from SPSR_ELx.SS only in certain cases when single-step is enabled (and set it to 0 in other cases). So, the value matters only when the guest enables single-step (and when the guest's Software step state isn't affected by single-step enabled by userspace, practically), though. Fix this by preserving the original PSTATE.SS value while userspace enables single-step, and restoring the value once it is disabled. This fix modifies the behavior of GET_ONE_REG/SET_ONE_REG for the PSTATE.SS while single-step is enabled by userspace. Presently, GET_ONE_REG/SET_ONE_REG gets/sets the current PSTATE.SS value, which KVM will override on the next guest entry (i.e. the value userspace gets/sets is not used for the next guest entry). With this patch, GET_ONE_REG/SET_ONE_REG will get/set the guest's preserved value, which KVM will preserve and try to restore after single-step is disabled. Fixes: 337b99bf7edf ("KVM: arm64: guest debug, add support for single-step") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220917010600.532642-2-reijiw@google.com
2022-08-17KVM: arm64: Treat PMCR_EL1.LC as RES1 on asymmetric systemsOliver Upton1-0/+4
KVM does not support AArch32 on asymmetric systems. To that end, enforce AArch64-only behavior on PMCR_EL1.LC when on an asymmetric system. Fixes: 2122a833316f ("arm64: Allow mismatched 32-bit EL0 support") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220816192554.1455559-2-oliver.upton@linux.dev
2022-07-17Merge branch kvm-arm64/sysreg-cleanup-5.20 into kvmarm-master/nextMarc Zyngier1-2/+0
* kvm-arm64/sysreg-cleanup-5.20: : . : Long overdue cleanup of the sysreg userspace access, : with extra scrubbing on the vgic side of things. : From the cover letter: : : "Schspa Shi recently reported[1] that some of the vgic code interacting : with userspace was reading uninitialised stack memory, and although : that read wasn't used any further, it prompted me to revisit this part : of the code. : : Needless to say, this area of the kernel is pretty crufty, and shows a : bunch of issues in other parts of the KVM/arm64 infrastructure. This : series tries to remedy a bunch of them: : : - Sanitise the way we deal with sysregs from userspace: at the moment, : each and every .set_user/.get_user callback has to implement its own : userspace accesses (directly or indirectly). It'd be much better if : that was centralised so that we can reason about it. : : - Enforce that all AArch64 sysregs are 64bit. Always. This was sort of : implied by the code, but it took some effort to convince myself that : this was actually the case. : : - Move the vgic-v3 sysreg userspace accessors to the userspace : callbacks instead of hijacking the vcpu trap callback. This allows : us to reuse the sysreg infrastructure. : : - Consolidate userspace accesses for both GICv2, GICv3 and common code : as much as possible. : : - Cleanup a bunch of not-very-useful helpers, tidy up some of the code : as we touch it. : : [1] https://lore.kernel.org/r/m2h740zz1i.fsf@gmail.com" : . KVM: arm64: Get rid or outdated comments KVM: arm64: Descope kvm_arm_sys_reg_{get,set}_reg() KVM: arm64: Get rid of find_reg_by_id() KVM: arm64: vgic: Tidy-up calls to vgic_{get,set}_common_attr() KVM: arm64: vgic: Consolidate userspace access for base address setting KVM: arm64: vgic-v2: Add helper for legacy dist/cpuif base address setting KVM: arm64: vgic: Use {get,put}_user() instead of copy_{from.to}_user KVM: arm64: vgic-v2: Consolidate userspace access for MMIO registers KVM: arm64: vgic-v3: Consolidate userspace access for MMIO registers KVM: arm64: vgic-v3: Use u32 to manage the line level from userspace KVM: arm64: vgic-v3: Convert userspace accessors over to FIELD_GET/FIELD_PREP KVM: arm64: vgic-v3: Make the userspace accessors use sysreg API KVM: arm64: vgic-v3: Push user access into vgic_v3_cpu_sysregs_uaccess() KVM: arm64: vgic-v3: Simplify vgic_v3_has_cpu_sysregs_attr() KVM: arm64: Get rid of reg_from/to_user() KVM: arm64: Consolidate sysreg userspace accesses KVM: arm64: Rely on index_to_param() for size checks on userspace access KVM: arm64: Introduce generic get_user/set_user helpers for system registers KVM: arm64: Reorder handling of invariant sysregs from userspace KVM: arm64: Add get_reg_by_id() as a sys_reg_desc retrieving helper Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17KVM: arm64: Descope kvm_arm_sys_reg_{get,set}_reg()Marc Zyngier1-2/+0
Having kvm_arm_sys_reg_get_reg and co in kvm_host.h gives the impression that these functions are free to be called from anywhere. Not quite. They really are tied to out internal sysreg handling, and they would be better off in the sys_regs.h header, which is private. kvm_host.h could also get a bit of a diet, so let's just do that. Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29Merge branch kvm-arm64/burn-the-flags into kvmarm-master/nextMarc Zyngier1-55/+148
* kvm-arm64/burn-the-flags: : . : Rework the per-vcpu flags to make them more manageable, : splitting them in different sets that have specific : uses: : : - configuration flags : - input to the world-switch : - state bookkeeping for the kernel itself : : The FP tracking is also simplified and tracked outside : of the flags as a separate state. : . KVM: arm64: Move the handling of !FP outside of the fast path KVM: arm64: Document why pause cannot be turned into a flag KVM: arm64: Reduce the size of the vcpu flag members KVM: arm64: Add build-time sanity checks for flags KVM: arm64: Warn when PENDING_EXCEPTION and INCREMENT_PC are set together KVM: arm64: Convert vcpu sysregs_loaded_on_cpu to a state flag KVM: arm64: Kill unused vcpu flags field KVM: arm64: Move vcpu WFIT flag to the state flag set KVM: arm64: Move vcpu ON_UNSUPPORTED_CPU flag to the state flag set KVM: arm64: Move vcpu SVE/SME flags to the state flag set KVM: arm64: Move vcpu debug/SPE/TRBE flags to the input flag set KVM: arm64: Move vcpu PC/Exception flags to the input flag set KVM: arm64: Move vcpu configuration flags into their own set KVM: arm64: Add three sets of flags to the vcpu state KVM: arm64: Add helpers to manipulate vcpu flags among a set KVM: arm64: Move FP state ownership from flag to a tristate KVM: arm64: Drop FP_FOREIGN_STATE from the hypervisor code Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29KVM: arm64: Document why pause cannot be turned into a flagMarc Zyngier1-3/+9
It would be tempting to turn the 'pause' state into a flag. However, this cannot easily be done as it is updated out of context, while all the flags expect to only be updated from the vcpu thread. Turning it into a flag would require to make all flag updates atomic, which isn't necessary desireable. Document this, and take this opportunity to move the field next to the flag sets, filling a hole in the vcpu structure. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29KVM: arm64: Reduce the size of the vcpu flag membersMarc Zyngier1-3/+3
Now that we can detect flags overflowing their container, reduce the size of all flag set members in the vcpu struct, turning them into 8bit quantities. Even with the FP state enum occupying 32bit, the whole of the state that was represented by flags is smaller by one byte. Profit! Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29KVM: arm64: Add build-time sanity checks for flagsMarc Zyngier1-0/+16
Flags are great, but flags can also be dangerous: it is easy to encode a flag that is bigger than its container (unless the container is a u64), and it is easy to construct a flag value that doesn't fit in the mask that is associated with it. Add a couple of build-time sanity checks that ensure we catch these two cases. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29KVM: arm64: Convert vcpu sysregs_loaded_on_cpu to a state flagMarc Zyngier1-4/+2
The aptly named boolean 'sysregs_loaded_on_cpu' tracks whether some of the vcpu system registers are resident on the physical CPU when running in VHE mode. This is obviously a flag in hidding, so let's convert it to a state flag, since this is solely a host concern (the hypervisor itself always knows which state we're in). Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29KVM: arm64: Kill unused vcpu flags fieldMarc Zyngier1-3/+0
Horray, we have now sorted all the preexisting flags, and the 'flags' field is now unused. Get rid of it while nobody is looking. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29KVM: arm64: Move vcpu WFIT flag to the state flag setMarc Zyngier1-2/+2
The host kernel uses the WFIT flag to remember that a vcpu has used this instruction and wake it up as required. Move it to the state set, as nothing in the hypervisor uses this information. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29KVM: arm64: Move vcpu ON_UNSUPPORTED_CPU flag to the state flag setMarc Zyngier1-4/+5
The ON_UNSUPPORTED_CPU flag is only there to track the sad fact that we have ended-up on a CPU where we cannot really run. Since this is only for the host kernel's use, move it to the state set. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29KVM: arm64: Move vcpu SVE/SME flags to the state flag setMarc Zyngier1-3/+5
The two HOST_{SVE,SME}_ENABLED are only used for the host kernel to track its own state across a vcpu run so that it can be fully restored. Move these flags to the so called state set. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29KVM: arm64: Move vcpu debug/SPE/TRBE flags to the input flag setMarc Zyngier1-3/+6
The three debug flags (which deal with the debug registers, SPE and TRBE) all are input flags to the hypervisor code. Move them into the input set and convert them to the new accessors. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-10KVM: arm64: Move vcpu PC/Exception flags to the input flag setMarc Zyngier1-24/+34
The PC update flags (which also deal with exception injection) is one of the most complicated use of the flag we have. Make it more fool prof by: - moving it over to the new accessors and assign it to the input flag set - turn the combination of generic ELx flags with another flag indicating the target EL itself into an explicit set of flags for each EL and vector combination - add a new accessor to pend the exception This is otherwise a pretty straightformward conversion. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-09KVM: arm64: Move vcpu configuration flags into their own setMarc Zyngier1-7/+10
The KVM_ARM64_{GUEST_HAS_SVE,VCPU_SVE_FINALIZED,GUEST_HAS_PTRAUTH} flags are purely configuration flags. Once set, they are never cleared, but evaluated all over the code base. Move these three flags into the configuration set in one go, using the new accessors, and take this opportunity to drop the KVM_ARM64_ prefix which doesn't provide any help. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-09KVM: arm64: Drop stale commentMarc Zyngier1-5/+0
The layout of 'struct kvm_vcpu_arch' has evolved significantly since the initial port of KVM/arm64, so remove the stale comment suggesting that a prefix of the structure is used exclusively from assembly code. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-7-will@kernel.org
2022-06-09KVM: arm64: Add three sets of flags to the vcpu stateMarc Zyngier1-0/+9
It so appears that each of the vcpu flags is really belonging to one of three categories: - a configuration flag, set once and for all - an input flag generated by the kernel for the hypervisor to use - a state flag that is only for the kernel's own bookkeeping As we are going to split all the existing flags into these three sets, introduce all three in one go. No functional change other than a bit of bloat... Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-09KVM: arm64: Add helpers to manipulate vcpu flags among a setMarc Zyngier1-0/+44
Careful analysis of the vcpu flags show that this is a mix of configuration, communication between the host and the hypervisor, as well as anciliary state that has no consistency. It'd be a lot better if we could split these flags into consistent categories. However, even if we split these flags apart, we want to make sure that each flag can only be applied to its own set, and not across sets. To achieve this, use a preprocessor hack so that each flag is always associated with: - the set that contains it, - a mask that describe all the bits that contain it (for a simple flag, this is the same thing as the flag itself, but we will eventually have values that cover multiple bits at once). Each flag is thus a triplet that is not directly usable as a value, but used by three helpers that allow the flag to be set, cleared, and fetched. By mandating the use of such helper, we can easily enforce that a flag can only be used with the set it belongs to. Finally, one last helper "unpacks" the raw value from the triplet that represents a flag, which is useful for multi-bit values that need to be enumerated (in a switch statement, for example). Further patches will start making use of this infrastructure. Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-09KVM: arm64: Move FP state ownership from flag to a tristateMarc Zyngier1-2/+7
The KVM FP code uses a pair of flags to denote three states: - FP_ENABLED set: the guest owns the FP state - FP_HOST set: the host owns the FP state - FP_ENABLED and FP_HOST clear: nobody owns the FP state at all and both flags set is an illegal state, which nothing ever checks for... As it turns out, this isn't really a good match for flags, and we'd be better off if this was a simpler tristate, each state having a name that actually reflect the state: - FP_STATE_FREE - FP_STATE_HOST_OWNED - FP_STATE_GUEST_OWNED Kill the two flags, and move over to an enum encoding these three states. This results in less confusing code, and less risk of ending up in the uncharted territory of a 4th state if we forget to clear one of the two flags. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Reiji Watanabe <reijiw@google.com>
2022-06-09KVM: arm64: Drop FP_FOREIGN_STATE from the hypervisor codeMarc Zyngier1-1/+0
The vcpu KVM_ARM64_FP_FOREIGN_FPSTATE flag tracks the thread's own TIF_FOREIGN_FPSTATE so that we can evaluate just before running the vcpu whether it the FP regs contain something that is owned by the vcpu or not by updating the rest of the FP flags. We do this in the hypervisor code in order to make sure we're in a context where we are not interruptible. But we already have a hook in the run loop to generate this flag. We may as well update the FP flags directly and save the pointless flag tracking. Whilst we're at it, rename update_fp_enabled() to guest_owns_fp_regs() to indicate what the leftover of this helper actually do. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Mark Brown <broonie@kernel.org>
2022-05-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-17/+27
Pull kvm updates from Paolo Bonzini: "S390: - ultravisor communication device driver - fix TEID on terminating storage key ops RISC-V: - Added Sv57x4 support for G-stage page table - Added range based local HFENCE functions - Added remote HFENCE functions based on VCPU requests - Added ISA extension registers in ONE_REG interface - Updated KVM RISC-V maintainers entry to cover selftests support ARM: - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes x86: - New ioctls to get/set TSC frequency for a whole VM - Allow userspace to opt out of hypercall patching - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES - V_TSC_AUX support Nested virtualization improvements for AMD: - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) - Allow AVIC to co-exist with a nested guest running - Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support - PAUSE filtering for nested hypervisors Guest support: - Decoupling of vcpu_is_preempted from PV spinlocks" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits) KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest KVM: selftests: x86: Sync the new name of the test case to .gitignore Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND x86, kvm: use correct GFP flags for preemption disabled KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer x86/kvm: Alloc dummy async #PF token outside of raw spinlock KVM: x86: avoid calling x86 emulator without a decoded instruction KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) s390/uv_uapi: depend on CONFIG_S390 KVM: selftests: x86: Fix test failure on arch lbr capable platforms KVM: LAPIC: Trace LAPIC timer expiration on every vmentry KVM: s390: selftest: Test suppression indication on key prot exception KVM: s390: Don't indicate suppression on dirtying, failing memop selftests: drivers/s390x: Add uvdevice tests drivers/s390/char: Add Ultravisor io device MAINTAINERS: Update KVM RISC-V entry to cover selftests support RISC-V: KVM: Introduce ISA extension register RISC-V: KVM: Cleanup stale TLB entries when host CPU changes RISC-V: KVM: Add remote HFENCE functions based on VCPU requests ...
2022-05-25Merge tag 'kvmarm-5.19' of ↵Paolo Bonzini1-14/+31
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 5.19 - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes [Due to the conflict, KVM_SYSTEM_EVENT_SEV_TERM is relocated from 4 to 6. - Paolo]
2022-05-20Merge branch 'for-next/esr-elx-64-bit' into for-next/coreCatalin Marinas1-1/+1
* for-next/esr-elx-64-bit: : Treat ESR_ELx as a 64-bit register. KVM: arm64: uapi: Add kvm_debug_exit_arch.hsr_high KVM: arm64: Treat ESR_EL2 as a 64-bit register arm64: Treat ESR_ELx as a 64-bit register arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall arm64: Make ESR_ELx_xVC_IMM_MASK compatible with assembly
2022-05-16Merge branch kvm-arm64/misc-5.19 into kvmarm-master/nextMarc Zyngier1-1/+1
* kvm-arm64/misc-5.19: : . : Misc fixes and general improvements for KVMM/arm64: : : - Better handle out of sequence sysregs in the global tables : : - Remove a couple of unnecessary loads from constant pool : : - Drop unnecessary pKVM checks : : - Add all known M1 implementations to the SEIS workaround : : - Cleanup kerneldoc warnings : . KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround KVM: arm64: pkvm: Don't mask already zeroed FEAT_SVE KVM: arm64: pkvm: Drop unnecessary FP/SIMD trap handler KVM: arm64: nvhe: Eliminate kernel-doc warnings KVM: arm64: Avoid unnecessary absolute addressing via literals KVM: arm64: Print emulated register table name when it is unsorted KVM: arm64: Don't BUG_ON() if emulated register table is unsorted Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-16Merge branch kvm-arm64/per-vcpu-host-pmu-data into kvmarm-master/nextMarc Zyngier1-11/+0
* kvm-arm64/per-vcpu-host-pmu-data: : . : Pass the host PMU state in the vcpu to avoid the use of additional : shared memory between EL1 and EL2 (this obviously only applies : to nVHE and Protected setups). : : Patches courtesy of Fuad Tabba. : . KVM: arm64: pmu: Restore compilation when HW_PERF_EVENTS isn't selected KVM: arm64: Reenable pmu in Protected Mode KVM: arm64: Pass pmu events to hyp via vcpu KVM: arm64: Repack struct kvm_pmu to reduce size KVM: arm64: Wrapper for getting pmu_events Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-16Merge branch kvm-arm64/psci-suspend into kvmarm-master/nextMarc Zyngier1-2/+8
* kvm-arm64/psci-suspend: : . : Add support for PSCI SYSTEM_SUSPEND and allow userspace to : filter the wake-up events. : : Patches courtesy of Oliver. : . Documentation: KVM: Fix title level for PSCI_SUSPEND selftests: KVM: Test SYSTEM_SUSPEND PSCI call selftests: KVM: Refactor psci_test to make it amenable to new tests selftests: KVM: Use KVM_SET_MP_STATE to power off vCPU in psci_test selftests: KVM: Create helper for making SMCCC calls selftests: KVM: Rename psci_cpu_on_test to psci_test KVM: arm64: Implement PSCI SYSTEM_SUSPEND KVM: arm64: Add support for userspace to suspend a vCPU KVM: arm64: Return a value from check_vcpu_requests() KVM: arm64: Rename the KVM_REQ_SLEEP handler KVM: arm64: Track vCPU power state using MP state values KVM: arm64: Dedupe vCPU power off helpers KVM: arm64: Don't depend on fallthrough to hide SYSTEM_RESET2 Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-16Merge branch kvm-arm64/hcall-selection into kvmarm-master/nextMarc Zyngier1-0/+16
* kvm-arm64/hcall-selection: : . : Introduce a new set of virtual sysregs for userspace to : select the hypercalls it wants to see exposed to the guest. : : Patches courtesy of Raghavendra and Oliver. : . KVM: arm64: Fix hypercall bitmap writeback when vcpus have already run KVM: arm64: Hide KVM_REG_ARM_*_BMAP_BIT_COUNT from userspace Documentation: Fix index.rst after psci.rst renaming selftests: KVM: aarch64: Add the bitmap firmware registers to get-reg-list selftests: KVM: aarch64: Introduce hypercall ABI test selftests: KVM: Create helper for making SMCCC calls selftests: KVM: Rename psci_cpu_on_test to psci_test tools: Import ARM SMCCC definitions Docs: KVM: Add doc for the bitmap firmware registers Docs: KVM: Rename psci.rst to hypercalls.rst KVM: arm64: Add vendor hypervisor firmware register KVM: arm64: Add standard hypervisor firmware register KVM: arm64: Setup a framework for hypercall bitmap firmware registers KVM: arm64: Factor out firmware register handling from psci.c Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-16KVM: arm64: pmu: Restore compilation when HW_PERF_EVENTS isn't selectedMarc Zyngier1-6/+0
Moving kvm_pmu_events into the vcpu (and refering to it) broke the somewhat unusual case where the kernel has no support for a PMU at all. In order to solve this, move things around a bit so that we can easily avoid refering to the pmu structure outside of PMU-aware code. As a bonus, pmu.c isn't compiled in when HW_PERF_EVENTS isn't selected. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/202205161814.KQHpOzsJ-lkp@intel.com
2022-05-15KVM: arm64: Pass pmu events to hyp via vcpuFuad Tabba1-6/+1
Instead of the host accessing hyp data directly, pass the pmu events of the current cpu to hyp via the vcpu. This adds 64 bits (in two fields) to the vcpu that need to be synced before every vcpu run in nvhe and protected modes. However, it isolates the hypervisor from the host, which allows us to use pmu in protected mode in a subsequent patch. No visible side effects in behavior intended. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510095710.148178-4-tabba@google.com
2022-05-04KVM: arm64: Don't BUG_ON() if emulated register table is unsortedAlexandru Elisei1-1/+1
To emulate a register access, KVM uses a table of registers sorted by register encoding to speed up queries using binary search. When Linux boots, KVM checks that the table is sorted and uses a BUG_ON() statement to let the user know if it's not. The unfortunate side effect is that an unsorted sysreg table brings down the whole kernel, not just KVM, even though the rest of the kernel can function just fine without KVM. To make matters worse, on machines which lack a serial console, the user is left pondering why the machine is taking so long to boot. Improve this situation by returning an error from kvm_arch_init() if the sysreg tables are not in the correct order. The machine is still very much usable for the user, with the exception of virtualization, who can now easily determine what went wrong. A minor typo has also been corrected in the check_sysreg_table() function. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220428103405.70884-2-alexandru.elisei@arm.com
2022-05-04Merge branch kvm-arm64/aarch32-idreg-trap into kvmarm-master/nextMarc Zyngier1-0/+1
* kvm-arm64/aarch32-idreg-trap: : . : Add trapping/sanitising infrastructure for AArch32 systen registers, : allowing more control over what we actually expose (such as the PMU). : : Patches courtesy of Oliver and Alexandru. : . KVM: arm64: Fix new instances of 32bit ESRs KVM: arm64: Hide AArch32 PMU registers when not available KVM: arm64: Start trapping ID registers for 32 bit guests KVM: arm64: Plumb cp10 ID traps through the AArch64 sysreg handler KVM: arm64: Wire up CP15 feature registers to their AArch64 equivalents KVM: arm64: Don't write to Rt unless sys_reg emulation succeeds KVM: arm64: Return a bool from emulate_cp() Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-04Merge branch kvm-arm64/wfxt into kvmarm-master/nextMarc Zyngier1-0/+1
* kvm-arm64/wfxt: : . : Add support for the WFET/WFIT instructions that provide the same : service as WFE/WFI, only with a timeout. : . KVM: arm64: Expose the WFXT feature to guests KVM: arm64: Offer early resume for non-blocking WFxT instructions KVM: arm64: Handle blocking WFIT instruction KVM: arm64: Introduce kvm_counter_compute_delta() helper KVM: arm64: Simplify kvm_cpu_has_pending_timer() arm64: Use WFxT for __delay() when possible arm64: Add wfet()/wfit() helpers arm64: Add HWCAP advertising FEAT_WFXT arm64: Add RV and RN fields for ESR_ELx_WFx_ISS arm64: Expand ESR_ELx_WFx_ISS_TI to match its ARMv8.7 definition Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-04KVM: arm64: Implement PSCI SYSTEM_SUSPENDOliver Upton1-0/+2
ARM DEN0022D.b 5.19 "SYSTEM_SUSPEND" describes a PSCI call that allows software to request that a system be placed in the deepest possible low-power state. Effectively, software can use this to suspend itself to RAM. Unfortunately, there really is no good way to implement a system-wide PSCI call in KVM. Any precondition checks done in the kernel will need to be repeated by userspace since there is no good way to protect a critical section that spans an exit to userspace. SYSTEM_RESET and SYSTEM_OFF are equally plagued by this issue, although no users have seemingly cared for the relatively long time these calls have been supported. The solution is to just make the whole implementation userspace's problem. Introduce a new system event, KVM_SYSTEM_EVENT_SUSPEND, that indicates to userspace a calling vCPU has invoked PSCI SYSTEM_SUSPEND. Additionally, add a CAP to get buy-in from userspace for this new exit type. Only advertise the SYSTEM_SUSPEND PSCI call if userspace has opted in. If a vCPU calls SYSTEM_SUSPEND, punt straight to userspace. Provide explicit documentation of userspace's responsibilites for the exit and point to the PSCI specification to describe the actual PSCI call. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-8-oupton@google.com
2022-05-04KVM: arm64: Add support for userspace to suspend a vCPUOliver Upton1-0/+1
Introduce a new MP state, KVM_MP_STATE_SUSPENDED, which indicates a vCPU is in a suspended state. In the suspended state the vCPU will block until a wakeup event (pending interrupt) is recognized. Add a new system event type, KVM_SYSTEM_EVENT_WAKEUP, to indicate to userspace that KVM has recognized one such wakeup event. It is the responsibility of userspace to then make the vCPU runnable, or leave it suspended until the next wakeup event. Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-7-oupton@google.com
2022-05-04KVM: arm64: Track vCPU power state using MP state valuesOliver Upton1-2/+3
A subsequent change to KVM will add support for additional power states. Store the MP state by value rather than keeping track of it as a boolean. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-4-oupton@google.com
2022-05-04KVM: arm64: Dedupe vCPU power off helpersOliver Upton1-0/+2
vcpu_power_off() and kvm_psci_vcpu_off() are equivalent; rename the former and replace all callsites to the latter. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-3-oupton@google.com
2022-05-03KVM: arm64: Add vendor hypervisor firmware registerRaghavendra Rao Ananta1-0/+2
Introduce the firmware register to hold the vendor specific hypervisor service calls (owner value 6) as a bitmap. The bitmap represents the features that'll be enabled for the guest, as configured by the user-space. Currently, this includes support for KVM-vendor features along with reading the UID, represented by bit-0, and Precision Time Protocol (PTP), represented by bit-1. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> [maz: tidy-up bitmap values] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-5-rananta@google.com
2022-05-03KVM: arm64: Add standard hypervisor firmware registerRaghavendra Rao Ananta1-0/+2
Introduce the firmware register to hold the standard hypervisor service calls (owner value 5) as a bitmap. The bitmap represents the features that'll be enabled for the guest, as configured by the user-space. Currently, this includes support only for Paravirtualized time, represented by bit-0. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> [maz: tidy-up bitmap values] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-4-rananta@google.com
2022-05-03KVM: arm64: Setup a framework for hypercall bitmap firmware registersRaghavendra Rao Ananta1-0/+12
KVM regularly introduces new hypercall services to the guests without any consent from the userspace. This means, the guests can observe hypercall services in and out as they migrate across various host kernel versions. This could be a major problem if the guest discovered a hypercall, started using it, and after getting migrated to an older kernel realizes that it's no longer available. Depending on how the guest handles the change, there's a potential chance that the guest would just panic. As a result, there's a need for the userspace to elect the services that it wishes the guest to discover. It can elect these services based on the kernels spread across its (migration) fleet. To remedy this, extend the existing firmware pseudo-registers, such as KVM_REG_ARM_PSCI_VERSION, but by creating a new COPROC register space for all the hypercall services available. These firmware registers are categorized based on the service call owners, but unlike the existing firmware pseudo-registers, they hold the features supported in the form of a bitmap. During the VM initialization, the registers are set to upper-limit of the features supported by the corresponding registers. It's expected that the VMMs discover the features provided by each register via GET_ONE_REG, and write back the desired values using SET_ONE_REG. KVM allows this modification only until the VM has started. Some of the standard features are not mapped to any bits of the registers. But since they can recreate the original problem of making it available without userspace's consent, they need to be explicitly added to the case-list in kvm_hvc_call_default_allowed(). Any function-id that's not enabled via the bitmap, or not listed in kvm_hvc_call_default_allowed, will be returned as SMCCC_RET_NOT_SUPPORTED to the guest. Older userspace code can simply ignore the feature and the hypercall services will be exposed unconditionally to the guests, thus ensuring backward compatibility. In this patch, the framework adds the register only for ARM's standard secure services (owner value 4). Currently, this includes support only for ARM True Random Number Generator (TRNG) service, with bit-0 of the register representing mandatory features of v1.0. Other services are momentarily added in the upcoming patches. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> [maz: reduced the scope of some helpers, tidy-up bitmap max values, dropped error-only fast path] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-3-rananta@google.com