path: root/arch
Age | Commit message | Author | Files | Lines
2022-02-10 | KVM: x86: Shove vp_bitmap handling down into sparse_set_to_vcpu_mask() | Sean Christopherson | 1 | -27/+38
Move the vp_bitmap "allocation" that's needed to handle mismatched vp_index values down into sparse_set_to_vcpu_mask() and drop __always_inline from said helper. The need for an intermediate vp_bitmap is a detail that's specific to the sparse translation with mismatched VP<=>vCPU indexes and does not need to be exposed to the caller.

Regarding the __always_inline, prior to commit f21dd494506a ("KVM: x86: hyperv: optimize sparse VP set processing") the helper, then named hv_vcpu_in_sparse_set(), was a tiny bit of code that effectively boiled down to a handful of bit ops. The __always_inline was understandable, if not justifiable. Since the aforementioned change, sparse_set_to_vcpu_mask() is a chunky 350-450+ bytes of code without KASAN=y, and balloons to 1100+ with KASAN=y. In other words, it has no business being forcefully inlined.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211207220926.718794-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
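A minimal sketch of the resulting shape, with names taken from the commit message (kvm_hv_get_vpindex() is assumed here; the exact in-tree signature and fast paths may differ):

  static void sparse_set_to_vcpu_mask(struct kvm *kvm, u64 *sparse_banks,
  				      u64 valid_bank_mask,
  				      unsigned long *vcpu_mask)
  {
  	u64 vp_bitmap[KVM_HV_MAX_SPARSE_VCPU_SET_BITS] = {};
  	struct kvm_vcpu *vcpu;
  	unsigned long bank, i;
  	int slot = 0;

  	/* Expand each valid sparse bank into a flat VP-index bitmap;
  	 * the intermediate vp_bitmap is now purely a local detail. */
  	for_each_set_bit(bank, (unsigned long *)&valid_bank_mask,
  			 KVM_HV_MAX_SPARSE_VCPU_SET_BITS)
  		vp_bitmap[bank] = sparse_banks[slot++];

  	/* Translate VP indexes to vCPU indexes. */
  	bitmap_zero(vcpu_mask, KVM_MAX_VCPUS);
  	kvm_for_each_vcpu(i, vcpu, kvm) {
  		u32 vp_index = kvm_hv_get_vpindex(vcpu);

  		if (vp_index < KVM_HV_MAX_SPARSE_VCPU_SET_BITS * 64 &&
  		    test_bit(vp_index, (unsigned long *)vp_bitmap))
  			__set_bit(i, vcpu_mask);
  	}
  }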
2022-02-10 | KVM: x86: Don't bother reading sparse banks that end up being ignored | Sean Christopherson | 1 | -3/+16
When handling "sparse" VP_SET requests, don't read sparse banks that can't possibly contain a legal VP index, instead of ignoring such banks later on in sparse_set_to_vcpu_mask(). This allows KVM to cap the size of its sparse_banks arrays for VP_SET at KVM_HV_MAX_SPARSE_VCPU_SET_BITS. Add a compile-time assert that KVM_HV_MAX_SPARSE_VCPU_SET_BITS <= 64, i.e. that KVM_MAX_VCPUS <= 4096, as the TLFS allows for at most 64 sparse banks, and KVM will need to do _something_ to play nice with Hyper-V.

Reducing the size of sparse_banks fudges around a compilation warning (that becomes an error with KVM_WERROR=y) when CONFIG_KASAN_STACK=y, which is selected (and can't be unselected) by CONFIG_KASAN=y when using gcc (clang/LLVM is a stack hog in some cases so it's opt-in for clang). KASAN_STACK adds a redzone around every stack variable, which pushes the Hyper-V functions over the default limit of 1024.

Ideally, KVM would flat out reject such impossibilities, but the TLFS explicitly allows providing empty banks, even if a bank can't possibly contain a valid VP index due to its position exceeding KVM's max. Furthermore, for a bit that is set in ValidBankMask, it is valid for the corresponding element in BanksContents to be all 0s, meaning no processors are specified in this bank. Arguably KVM should reject and not ignore the "extra" banks, but that can be done independently and without bloating sparse_banks, e.g. by reading each "extra" 8-byte chunk individually.

Reported-by: Ajay Garg <ajaygargnsit@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211207220926.718794-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
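A sketch of the cap and the compile-time assert described above (the macro derivation mirrors the commit message; the exact spelling in KVM's Hyper-V code may differ):

  /* One u64 sparse bank covers 64 VPs. */
  #define KVM_HV_MAX_SPARSE_VCPU_SET_BITS DIV_ROUND_UP(KVM_MAX_VCPUS, 64)

  u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];

  /* The TLFS allows at most 64 sparse banks, i.e. KVM_MAX_VCPUS <= 4096. */
  BUILD_BUG_ON(KVM_HV_MAX_SPARSE_VCPU_SET_BITS > 64);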
2022-02-10 | KVM: x86: Add a helper to get the sparse VP_SET for IPIs and TLB flushes | Sean Christopherson | 1 | -16/+16
Add a helper, kvm_get_sparse_vp_set(), to handle sanity checks related to the VARHEAD field and reading the sparse banks of a VP_SET. A future commit to reduce the memory footprint of sparse_banks will introduce more common code to the sparse bank retrieval. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211207220926.718794-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
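A sketch of the helper's shape; the parameter list and status codes are assumptions based on the description, not the verbatim code:

  static u64 kvm_get_sparse_vp_set(struct kvm *kvm, struct kvm_hv_hcall *hc,
  				 u64 *sparse_banks, gpa_t offset)
  {
  	/* The TLFS caps a VP_SET at 64 sparse banks. */
  	if (hc->var_cnt > KVM_HV_MAX_SPARSE_VCPU_SET_BITS)
  		return HV_STATUS_INVALID_HYPERCALL_INPUT;

  	if (kvm_read_guest(kvm, hc->ingpa + offset, sparse_banks,
  			   hc->var_cnt * sizeof(*sparse_banks)))
  		return HV_STATUS_INVALID_HYPERCALL_INPUT;

  	return HV_STATUS_SUCCESS;
  }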
2022-02-10 | KVM: x86: Refactor kvm_hv_flush_tlb() to reduce indentation | Sean Christopherson | 1 | -19/+21
Refactor the "extended" path of kvm_hv_flush_tlb() to reduce the nesting depth for the non-fast sparse path, and to make the code more similar to the extended path in kvm_hv_send_ipi(). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211207220926.718794-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Get the number of Hyper-V sparse banks from the VARHEAD field | Sean Christopherson | 2 | -20/+29
Get the number of sparse banks from the VARHEAD field, which the guest is required to provide as "The size of a variable header, in QWORDS.", where the variable header is:

  Variable Header Bytes = {Total Header Bytes - sizeof(Fixed Header)}
                          rounded up to nearest multiple of 8
  Variable HeaderSize   = Variable Header Bytes / 8

In other words, the VARHEAD should match the number of sparse banks. Keep the manual count as a sanity check, but otherwise rely on the field so as to more closely align with the logic defined in the TLFS and to allow for future cleanups.

Tweak the tracepoint output to use "rep_cnt" instead of simply "cnt" now that there is also "var_cnt".

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211207220926.718794-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
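A hedged sketch of the extraction plus a worked example; HV_HYPERCALL_VARHEAD_OFFSET is defined by this series, and the 10-bit field at bits 26:17 follows the TLFS hypercall input layout:

  /*
   * Worked example: a sparse TLB-flush-ex request with 2 sparse banks
   * carries 2 * 8 = 16 bytes of variable header (already a multiple
   * of 8), so VARHEAD = 16 / 8 = 2 == number of sparse banks.
   */
  hc.var_cnt = (hc.param >> HV_HYPERCALL_VARHEAD_OFFSET) & 0x3ff;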
2022-02-10 | KVM: x86/mmu: Consolidate comments about {Host,MMU}-writable | David Matlack | 2 | -55/+60
Consolidate the large comment above DEFAULT_SPTE_HOST_WRITABLE with the large comment above is_writable_pte() into one comment. This comment explains the different reasons why an SPTE may be non-writable, and how KVM keeps track of that with the {Host,MMU}-writable bits. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220125230723.1701061-1-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/mmu: Rename DEFAULT_SPTE_MMU_WRITEABLE to DEFAULT_SPTE_MMU_WRITABLE | David Matlack | 3 | -7/+7
Both "writeable" and "writable" are valid, but we should be consistent about which we use. DEFAULT_SPTE_MMU_WRITEABLE was the odd one out in the SPTE code, so rename it to DEFAULT_SPTE_MMU_WRITABLE. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220125230713.1700406-1-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/mmu: Move is_writable_pte() to spte.h | David Matlack | 2 | -39/+39
Move is_writable_pte() close to the other functions that check writability information about SPTEs. While here opportunistically replace the open-coded bit arithmetic in check_spte_writable_invariants() with a call to is_writable_pte(). No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220125230518.1697048-4-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/mmu: Check SPTE writable invariants when setting leaf SPTEs | David Matlack | 4 | -9/+6
Check SPTE writable invariants when setting SPTEs rather than in spte_can_locklessly_be_made_writable(). By the time KVM checks spte_can_locklessly_be_made_writable(), the SPTE has long since been corrupted. Note that these invariants only apply to shadow-present leaf SPTEs (i.e. not to MMIO SPTEs, non-leaf SPTEs, etc.). Add a comment explaining the restriction and only instrument the code paths that set shadow-present leaf SPTEs.

To account for access tracking, also check the SPTE writable invariants when marking an SPTE as an access track SPTE. This also lets us remove a redundant WARN from mark_spte_for_access_track().

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220125230518.1697048-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
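A sketch of the instrumentation point, assuming the invariant helper from the entry below; per the commit, only shadow-present leaf SPTEs are checked:

  /* In an SPTE write path, e.g. when installing new_spte (sketch): */
  if (is_shadow_present_pte(new_spte) && is_last_spte(new_spte, level))
  	check_spte_writable_invariants(new_spte);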
2022-02-10 | KVM: x86/mmu: Move SPTE writable invariant checks to a helper function | David Matlack | 1 | -7/+13
Move the WARNs in spte_can_locklessly_be_made_writable() to a separate helper function. This is in preparation for moving these checks to the places where SPTEs are set.

Opportunistically add warning messages that include the SPTE to make future debugging of these warnings easier.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220125230518.1697048-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
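A sketch of the helper, assuming the {Host,MMU}-writable masks used elsewhere in the SPTE code:

  static inline void check_spte_writable_invariants(u64 spte)
  {
  	if (spte & shadow_mmu_writable_mask)
  		WARN_ONCE(!(spte & shadow_host_writable_mask),
  			  "kvm: MMU-writable SPTE is not Host-writable: %llx",
  			  spte);
  	else
  		WARN_ONCE(is_writable_pte(spte),
  			  "kvm: Writable SPTE is not MMU-writable: %llx", spte);
  }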
2022-02-10 | KVM: LAPIC: Enable timer posted-interrupt only when mwait/hlt is advertised | Wanpeng Li | 1 | -1/+2
As commit 0c5f81dad46 ("KVM: LAPIC: Inject timer interrupt via posted interrupt") mentioned, the host admin should carefully tune the guest setup so that vCPUs are placed on isolated pCPUs, with several pCPUs surplus for *busy* housekeeping. In this setup, it is preferable to disable mwait/hlt/pause vmexits to keep the vCPUs in non-root mode.

However, if only some guests are isolated and others are not, the latter would not see any benefit from posted timer interrupts, and at the same time would lose the VMX preemption timer fast paths because kvm_can_post_timer_interrupt() returns true and therefore forces kvm_can_use_hv_timer() to false.

By guaranteeing that the posted-interrupt timer is only used if MWAIT or HLT are done without vmexit, KVM can make a better choice and use the VMX preemption timer and the corresponding fast paths.

Reported-by: Aili Yao <yaoaili@kingsoft.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Cc: Aili Yao <yaoaili@kingsoft.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1643112538-36743-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
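A sketch of the resulting gating; the helper names are assumed from KVM's lapic/x86 code, with pi_inject_timer being the module param behind the posted timer interrupt:

  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
  {
  	/*
  	 * Posting the timer interrupt only pays off if the vCPU truly
  	 * stays in non-root mode, i.e. if MWAIT and/or HLT exiting is
  	 * disabled for this VM.
  	 */
  	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
  	       (kvm_mwait_in_guest(vcpu->kvm) || kvm_hlt_in_guest(vcpu->kvm));
  }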
2022-02-10 | KVM: VMX: Don't send posted IRQ if vCPU == this vCPU and vCPU is IN_GUEST_MODE | Wanpeng Li | 1 | -19/+21
When delivering a virtual interrupt, don't actually send a posted interrupt if the target vCPU is also the currently running vCPU and is IN_GUEST_MODE, in which case the interrupt is being sent from a VM-Exit fastpath and the core run loop in vcpu_enter_guest() will manually move the interrupt from the PIR to vmcs.GUEST_RVI. IRQs are disabled while IN_GUEST_MODE, thus there's no possibility of the virtual interrupt being sent from anything other than KVM, i.e. KVM won't suppress a wake event from an IRQ handler (see commit fdba608f15e2, "KVM: VMX: Wake vCPU when delivering posted IRQ even if vCPU == this vCPU").

Eliding the posted interrupt restores the performance provided by the combination of commits 379a3c8ee444 ("KVM: VMX: Optimize posted-interrupt delivery for timer fastpath") and 26efe2fd92e5 ("KVM: VMX: Handle preemption timer fastpath").

Thanks Sean for better comments.

Suggested-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1643111979-36447-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
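An illustrative sketch of the check; the condition follows the description above, while the exact diff in VMX's delivery path differs in detail:

  if (vcpu == kvm_get_running_vcpu() &&
      READ_ONCE(vcpu->mode) == IN_GUEST_MODE) {
  	/*
  	 * IRQs are disabled and vcpu_enter_guest() will sync the PIR
  	 * to the vIRR before VM-Enter; a self posted-interrupt IPI
  	 * would be pure overhead.
  	 */
  	return;
  }

  if (!kvm_vcpu_trigger_posted_interrupt(vcpu, false))
  	kvm_vcpu_wake_up(vcpu);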
2022-02-10 | KVM: SVM: Rename hook implementations to conform to kvm_x86_ops' names | Sean Christopherson | 3 | -25/+25
Massage SVM's implementation names that still diverge from kvm_x86_ops to allow for wiring up all SVM-defined functions via kvm-x86-ops.h. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-22-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: SVM: Rename SEV implementations to conform to kvm_x86_ops hooks | Sean Christopherson | 3 | -21/+21
Rename svm_vm_copy_asid_from() and svm_vm_migrate_from() to conform to the names used by kvm_x86_ops, and opportunistically use "sev" instead of "svm" to more precisely identify the role of the hooks. svm_vm_copy_asid_from() in particular was poorly named as the function does much more than simply copy the ASID. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-21-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Use more verbose names for mem encrypt kvm_x86_ops hooks | Sean Christopherson | 6 | -17/+23
Use slightly more verbose names for the so called "memory encrypt", a.k.a. "mem enc", kvm_x86_ops hooks to bridge the gap between the current super short kvm_x86_ops names and SVM's more verbose, but non-conforming names. This is a step toward using kvm-x86-ops.h with KVM_X86_CVM_OP() to fill svm_x86_ops.

Opportunistically rename mem_enc_op() to mem_enc_ioctl() to better reflect its true nature, as it really is a full-fledged ioctl() of its own. Ideally, the hook would be named confidential_vm_ioctl() or so, as the ioctl() is a gateway to more than just memory encryption, and because its underlying purpose is to support Confidential VMs, which can be provided without memory encryption, e.g. if the TCB of the guest includes the host kernel but not host userspace, or by isolation in hardware without encrypting memory. But, diverging from KVM_MEMORY_ENCRYPT_OP even further is undesirable, and short of creating aliases for all related ioctl()s, which introduces a different flavor of divergence, KVM is stuck with the nomenclature.

Defer renaming SVM's functions to a future commit as there are additional changes needed to make SVM fully conforming and to match reality (looking at you, svm_vm_copy_asid_from()).

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220128005208.4008533-20-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: SVM: Remove unused MAX_INST_SIZE #define | Sean Christopherson | 1 | -2/+0
Remove SVM's MAX_INST_SIZE, which has long since been obsoleted by the common MAX_INSN_SIZE. Note, the latter's "insn" is also the generally preferred abbreviation of instruction. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-18-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: SVM: Rename svm_flush_tlb() to svm_flush_tlb_current() | Sean Christopherson | 2 | -6/+7
Rename svm_flush_tlb() to svm_flush_tlb_current() so that at least one of the flushing operations in svm_x86_ops can be filled via kvm-x86-ops.h, and to document the scope of the flush (specifically that it doesn't flush "all"). Opportunistically make svm_flush_tlb_current(), formerly svm_flush_tlb(), static. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-17-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Move get_cs_db_l_bits() helper to SVM | Sean Christopherson | 3 | -12/+10
Move kvm_get_cs_db_l_bits() to SVM and rename it appropriately so that its svm_x86_ops entry can be filled via kvm-x86-ops, and to eliminate a superfluous export from KVM x86. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-16-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: VMX: Rename VMX functions to conform to kvm_x86_ops names | Sean Christopherson | 3 | -18/+18
Massage VMX's implementation names for kvm_x86_ops to maximize use of kvm-x86-ops.h. Leave cpu_has_vmx_wbinvd_exit() as-is to preserve the cpu_has_vmx_*() pattern used for querying VMCS capabilities. Keep pi_has_pending_interrupt() as vmx_dy_apicv_has_pending_interrupt() does a poor job of describing exactly what is being checked in VMX land. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-14-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Use static_call() for copy/move encryption context ioctls() | Sean Christopherson | 2 | -7/+12
Define and use static_call()s for .vm_{copy,move}_enc_context_from(), mostly so that the op is defined in kvm-x86-ops.h. This will allow using KVM_X86_OP in vendor code to wire up the implementation. Any performance gains eked out by using static_call() are a happy bonus, not the primary motivation.

Opportunistically refactor the code to reduce indentation and keep line lengths reasonable, and to be consistent when wrapping versus running a bit over the 80 char soft limit.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220128005208.4008533-12-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Unexport kvm_x86_ops | Sean Christopherson | 1 | -1/+0
Drop the export of kvm_x86_ops now that it is no longer referenced by SVM or VMX. Disallowing access to kvm_x86_ops is very desirable as it prevents vendor code from incorrectly modifying hooks after they have been set by kvm_arch_hardware_setup(), and more importantly after each function's associated static_call key has been updated. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-11-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Uninline and export hv_track_root_tdp() | Sean Christopherson | 2 | -13/+15
Uninline and export Hyper-V's hv_track_root_tdp(), which is (somewhat indirectly) the last remaining reference to kvm_x86_ops from vendor modules, i.e. will allow unexporting kvm_x86_ops. Reloading the TDP PGD isn't the fastest of paths, hv_track_root_tdp() isn't exactly tiny, and disallowing vendor code from accessing kvm_x86_ops provides nice-to-have encapsulation of common x86 code (and of Hyper-V code for that matter). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: nVMX: Refactor PMU refresh to avoid referencing kvm_x86_ops.pmu_ops | Sean Christopherson | 3 | -4/+7
Refactor the nested VMX PMU refresh helper to pass it a flag stating whether or not the vCPU has PERF_GLOBAL_CTRL instead of having the nVMX helper query the information by bouncing through kvm_x86_ops.pmu_ops. This will allow a future patch to use static_call() for the PMU ops without having to export any static call definitions from common x86, and it is also a step toward unexporting kvm_x86_ops.

Alternatively, nVMX could call kvm_pmu_is_valid_msr() to indirectly use kvm_x86_ops.pmu_ops, but that would incur an extra layer of indirection and would require exporting kvm_pmu_is_valid_msr().

Opportunistically rename the helper to keep line lengths somewhat reasonable, and to better capture its high-level role.

No functional change intended.

Cc: Like Xu <like.xu.linux@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220128005208.4008533-9-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: xen: Use static_call() for invoking kvm_x86_ops hooks | Sean Christopherson | 1 | -2/+2
Use static_call() for invoking kvm_x86_ops functions that already have a defined static call, mostly as a step toward having _all_ calls to kvm_x86_ops route through a static_call() in order to simplify auditing, e.g. via grep, that all functions have an entry in kvm-x86-ops.h, but also because there's no reason not to use a static_call(). Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Use static_call() for .vcpu_deliver_sipi_vector() | Sean Christopherson | 2 | -1/+2
Define and use a static_call() for kvm_x86_ops.vcpu_deliver_sipi_vector(), mostly so that the op is defined in kvm-x86-ops.h. This will allow using KVM_X86_OP in vendor code to wire up the implementation. Any performance gains eked out by using static_call() are a happy bonus, not the primary motivation. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
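For context, a sketch of the static_call() plumbing pattern this series relies on; the macro spelling follows kvm-x86-ops.h but should be treated as approximate:

  /* In common x86 code, one static call key per hook: */
  #define KVM_X86_OP(func) \
  	DEFINE_STATIC_CALL_NULL(kvm_x86_##func, \
  				*(((struct kvm_x86_ops *)0)->func));
  #include <asm/kvm-x86-ops.h>

  /* Call sites then use the static call instead of the pointer: */
  static_call(kvm_x86_vcpu_deliver_sipi_vector)(vcpu, vector);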
2022-02-10 | KVM: VMX: Call vmx_get_cpl() directly in handle_dr() | Sean Christopherson | 1 | -1/+1
Use vmx_get_cpl() instead of bouncing through kvm_x86_ops.get_cpl() when performing a CPL check on MOV DR accesses. This avoids a RETPOLINE (when enabled), and more importantly removes a vendor reference to kvm_x86_ops and helps pave the way for unexporting kvm_x86_ops. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Rename kvm_x86_ops pointers to align w/ preferred vendor names | Sean Christopherson | 6 | -59/+56
Rename a variety of kvm_x86_op function pointers so that the preferred name for vendor implementations follows the pattern <vendor>_<function>, e.g. rename .run() to .vcpu_run() to match {svm,vmx}_vcpu_run(). This will allow vendor implementations to be wired up via the KVM_X86_OP macro.

In many cases, VMX and SVM "disagree" on the preferred name, though in reality it's VMX and x86 that disagree as SVM blindly prepended _svm to the kvm_x86_ops name. Justification for using the VMX nomenclature:

 - set_{irq,nmi} => inject_{irq,nmi} because the helper is injecting an event that has already been "set" in e.g. the vIRR. SVM's relevant VMCB field is even named event_inj, and KVM's stat is irq_injections.

 - prepare_guest_switch => prepare_switch_to_guest because the former is ambiguous, e.g. it could mean switching between multiple guests, switching from the guest to host, etc...

 - update_pi_irte => pi_update_irte to match the rest of VMX's posted interrupt naming scheme, which is vmx_pi_<blah>().

 - start_assignment => pi_start_assignment to again follow VMX's posted interrupt naming scheme, and to provide context for what bit of code might care about an otherwise undescribed "assignment".

The "tlb_flush" => "flush_tlb" rename creates an inconsistency with respect to Hyper-V's "tlb_remote_flush" hooks, but Hyper-V really is the one that's wrong. x86, VMX, and SVM all use flush_tlb, and even common KVM is on a variant of the bandwagon with "kvm_flush_remote_tlbs", e.g. a more appropriate name for the Hyper-V hooks would be flush_remote_tlbs. Leave that change for another time as the Hyper-V hooks always start as NULL, i.e. the name doesn't matter for using kvm-x86-ops.h, and changing all the names requires an astounding amount of churn.

VMX and SVM function names are intentionally left as is to minimize the diff. Both VMX and SVM will need to rename even more functions in order to fully utilize KVM_X86_OPS, i.e. an additional patch for each is inevitable.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220128005208.4008533-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
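The payoff the renames build toward looks roughly like the sketch below; whether vmx/svm ultimately generate their ops structs exactly this way is an assumption here, as this commit only establishes the naming convention:

  /* kvm-x86-ops.h lists every hook; conforming names make this mechanical. */
  #define KVM_X86_OP(func) .func = vmx_##func,
  static struct kvm_x86_ops vmx_x86_ops __initdata = {
  #include <asm/kvm-x86-ops.h>
  };
  #undef KVM_X86_OP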
2022-02-10 | KVM: x86: Drop export for .tlb_flush_current() static_call key | Sean Christopherson | 1 | -1/+0
Remove the export of kvm_x86_tlb_flush_current() as there are no longer any users outside of common x86 code. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220128005208.4008533-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: skip host CPUID call for hypervisor leaves | Paolo Bonzini | 1 | -1/+9
Hypervisor leaves are always synthesized by __do_cpuid_func; just return zeroes and do not ask the host. Even on nested virtualization, a value from another hypervisor would be bogus, because all hypercalls and MSRs are processed by KVM. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
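A sketch of the check; its placement in the host-CPUID helper is an assumption, while the 0x40000000-0x4fffffff range is the architectural hypervisor CPUID region:

  /*
   * Hypervisor leaves are synthesized by KVM; executing CPUID on the
   * host would return values from whatever hypervisor (if any) KVM
   * itself runs under, which are bogus for the guest.
   */
  if (function >= 0x40000000 && function <= 0x4fffffff)
  	entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
  else
  	cpuid_count(function, index,
  		    &entry->eax, &entry->ebx, &entry->ecx, &entry->edx);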
2022-02-10 | KVM: x86: Remove unused "flags" of kvm_pv_kick_cpu_op() | Jinrong Liang | 1 | -2/+2
The "unsigned long flags" parameter of kvm_pv_kick_cpu_op() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-20-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Remove unused "vcpu" of kvm_scale_tsc() | Jinrong Liang | 2 | -8/+8
The "struct kvm_vcpu *vcpu" parameter of kvm_scale_tsc() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-18-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/emulate: Remove unused "tss_selector" of task_switch_{16, 32}() | Jinrong Liang | 1 | -7/+4
The "u16 tss_selector" parameter of task_switch_{16, 32}() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-16-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/emulate: Remove unused "ctxt" of setup_syscalls_segments() | Jinrong Liang | 1 | -5/+4
The "struct x86_emulate_ctxt *ctxt" parameter of setup_syscalls_segments() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-15-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/ioapic: Remove unused "addr" and "length" of ioapic_read_indirect() | Jinrong Liang | 1 | -4/+2
The "unsigned long addr" and "unsigned long length" parameter of ioapic_read_indirect() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-14-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/i8259: Remove unused "addr" of elcr_ioport_{read,write}() | Jinrong Liang | 1 | -4/+4
The "u32 addr" parameter of elcr_ioport_write() and elcr_ioport_read() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-13-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: SVM: improve split between svm_prepare_guest_switch and sev_es_prepare_guest_switch | Paolo Bonzini | 3 | -12/+10
KVM performs the VMSAVE to the host save area for both regular and SEV-ES guests, so hoist it up to svm_prepare_guest_switch. And because sev_es_prepare_guest_switch does not really need to know the details of struct svm_cpu_data *, just pass it the pointer to the host save area inside the HSAVE page. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/svm: Remove unused "vcpu" of svm_check_exit_valid() | Jinrong Liang | 1 | -2/+2
The "struct kvm_vcpu *vcpu" parameter of svm_check_exit_valid() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-7-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/mmu_audit: Remove unused "level" of audit_spte_after_sync() | Jinrong Liang | 1 | -2/+2
The "int level" parameter of audit_spte_after_sync() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-6-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/tdp_mmu: Remove unused "kvm" of kvm_tdp_mmu_get_root() | Jinrong Liang | 2 | -4/+3
The "struct kvm *kvm" parameter of kvm_tdp_mmu_get_root() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-5-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10KVM: x86/mmu: Remove unused "vcpu" of reset_{tdp,ept}_shadow_zero_bits_mask()Jinrong Liang1-6/+4
The "struct kvm_vcpu *vcpu" parameter of reset_ept_shadow_zero_bits_mask() and reset_tdp_shadow_zero_bits_mask() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-4-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10KVM: x86/mmu: Remove unused "kvm" of __rmap_write_protect()Jinrong Liang1-5/+4
The "struct kvm *kvm" parameter of __rmap_write_protect() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-3-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10KVM: x86/mmu: Remove unused "kvm" of kvm_mmu_unlink_parents()Jinrong Liang1-2/+2
The "struct kvm *kvm" parameter of kvm_mmu_unlink_parents() is not used, so remove it. No functional change intended. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220125095909.38122-2-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Skip APICv update if APICv is disabled at the module level | Sean Christopherson | 2 | -0/+6
Bail from the APICv update paths _before_ taking apicv_update_lock if APICv is disabled at the module level. kvm_request_apicv_update() in particular is invoked from multiple paths that can be reached without APICv being enabled, e.g. svm_enable_irq_window(), and taking the rw_sem for write when APICv is disabled may introduce unnecessary contention and stalls. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211208015236.1616697-25-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
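A sketch of the early bail, using the lock and helper names from the commit message:

  void kvm_request_apicv_update(struct kvm *kvm, bool activate,
  			      unsigned long bit)
  {
  	/* Nothing to update if APICv is disabled at the module level. */
  	if (!enable_apicv)
  		return;

  	down_write(&kvm->arch.apicv_update_lock);
  	__kvm_request_apicv_update(kvm, activate, bit);
  	up_write(&kvm->arch.apicv_update_lock);
  }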
2022-02-10 | KVM: x86: Drop NULL check on kvm_x86_ops.check_apicv_inhibit_reasons | Sean Christopherson | 1 | -2/+1
Drop the useless NULL check on kvm_x86_ops.check_apicv_inhibit_reasons when handling an APICv update; both VMX and SVM unconditionally implement the helper and leave it non-NULL even if APICv is disabled at the module level. The latter is a moot point now that __kvm_request_apicv_update() is called if and only if enable_apicv is true. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211208015236.1616697-26-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86: Unexport __kvm_request_apicv_update() | Sean Christopherson | 1 | -1/+0
Unexport __kvm_request_apicv_update(), it's not used by vendor code and should never be used by vendor code. The only reason it's exposed at all is because Hyper-V's SynIC needs to track how many auto-EOIs are in use, and it's convenient to use apicv_update_lock to guard that tracking. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211208015236.1616697-27-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/mmu: Zap _all_ roots when unmapping gfn range in TDP MMU | Sean Christopherson | 1 | -15/+24
Zap both valid and invalid roots when zapping/unmapping a gfn range, as KVM must ensure it holds no references to the freed page after returning from the unmap operation. Most notably, the TDP MMU doesn't zap invalid roots in mmu_notifier callbacks. This leads to use-after-free and other issues if the mmu_notifier runs to completion while an invalid root zapper yields as KVM fails to honor the requirement that there must be _no_ references to the page after the mmu_notifier returns.

The bug is most easily reproduced by hacking KVM to cause a collision between set_nx_huge_pages() and kvm_mmu_notifier_release(), but the bug exists between kvm_mmu_notifier_invalidate_range_start() and memslot updates as well. Invalidating a root ensures pages aren't accessible by the guest, and KVM won't read or write page data itself, but KVM will trigger e.g. kvm_set_pfn_dirty() when zapping SPTEs, and thus completing a zap of an invalid root _after_ the mmu_notifier returns is fatal.

  WARNING: CPU: 24 PID: 1496 at arch/x86/kvm/../../../virt/kvm/kvm_main.c:173 [kvm]
  RIP: 0010:kvm_is_zone_device_pfn+0x96/0xa0 [kvm]
  Call Trace:
   <TASK>
   kvm_set_pfn_dirty+0xa8/0xe0 [kvm]
   __handle_changed_spte+0x2ab/0x5e0 [kvm]
   __handle_changed_spte+0x2ab/0x5e0 [kvm]
   __handle_changed_spte+0x2ab/0x5e0 [kvm]
   zap_gfn_range+0x1f3/0x310 [kvm]
   kvm_tdp_mmu_zap_invalidated_roots+0x50/0x90 [kvm]
   kvm_mmu_zap_all_fast+0x177/0x1a0 [kvm]
   set_nx_huge_pages+0xb4/0x190 [kvm]
   param_attr_store+0x70/0x100
   module_attr_store+0x19/0x30
   kernfs_fop_write_iter+0x119/0x1b0
   new_sync_write+0x11c/0x1b0
   vfs_write+0x1cc/0x270
   ksys_write+0x5f/0xe0
   do_syscall_64+0x38/0xc0
   entry_SYSCALL_64_after_hwframe+0x44/0xae
   </TASK>

Fixes: b7cccd397f31 ("KVM: x86/mmu: Fast invalidation for TDP MMU")
Cc: stable@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211215011557.399940-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10KVM: x86/mmu: Move "invalid" check out of kvm_tdp_mmu_get_root()Sean Christopherson2-5/+10
Move the check for an invalid root out of kvm_tdp_mmu_get_root() and into the one place it actually matters, tdp_mmu_next_root(), as the other user already has an implicit validity check. A future bug fix will need to get references to invalid roots to honor mmu_notifier requests; there's no point in forcing what will be a common path to open code getting a reference to a root. No functional change intended. Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211215011557.399940-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/mmu: Use common TDP MMU zap helper for MMU notifier unmap hook | Sean Christopherson | 1 | -7/+2
Use the common TDP MMU zap helper when handling an MMU notifier unmap event, the two flows are semantically identical. Consolidate the code in preparation for a future bug fix, as both kvm_tdp_mmu_unmap_gfn_range() and __kvm_tdp_mmu_zap_gfn_range() are guilty of not zapping SPTEs in invalid roots. No functional change intended. Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211215011557.399940-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 | KVM: x86/xen: Fix runstate updates to be atomic when preempting vCPU | David Woodhouse | 1 | -30/+67
There are circumstances when kvm_xen_update_runstate_guest() should not sleep because it ends up being called from __schedule() when the vCPU is preempted:

  [ 222.830825] kvm_xen_update_runstate_guest+0x24/0x100
  [ 222.830878] kvm_arch_vcpu_put+0x14c/0x200
  [ 222.830920] kvm_sched_out+0x30/0x40
  [ 222.830960] __schedule+0x55c/0x9f0

To handle this, make it use the same trick as __kvm_xen_has_interrupt(), of using the hva from the gfn_to_hva_cache directly. Then it can use pagefault_disable() around the accesses and just bail out if the page is absent (which is unlikely).

I almost switched to using a gfn_to_pfn_cache here and bailing out if kvm_map_gfn() fails, like kvm_steal_time_set_preempted() does — but on closer inspection it looks like kvm_map_gfn() will *always* fail in atomic context for a page in IOMEM, which means it will silently fail to make the update every single time for such guests, AFAICT. So I didn't do it that way after all. And will probably fix that one too.

Cc: stable@vger.kernel.org
Fixes: 30b5c851af79 ("KVM: x86/xen: Add support for vCPU runstate information")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <b17a93e5ff4561e57b1238e3e7ccd0b613eb827e.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
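A sketch of the pagefault_disable() trick described above; ghc is the cached gfn_to_hva_cache, and the user_offset/update names are illustrative, not the verbatim code:

  /* Called with preemption off; must not sleep on a page fault. */
  pagefault_disable();
  ret = __copy_to_user((void __user *)ghc->hva + user_offset,
  		     &runstate_update, sizeof(runstate_update));
  pagefault_enable();

  if (ret)
  	return;	/* Page not resident (unlikely); skip this update. */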
2022-02-08 | KVM: x86: SVM: move avic definitions from AMD's spec to svm.h | Maxim Levitsky | 4 | -32/+38
asm/svm.h is the correct place for all values that are defined in the SVM spec, and that includes AVIC. Also add some values from the spec that were not defined before and will soon be useful. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220207155447.840194-10-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>