summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2015-06-19KVM: MTRR: improve kvm_mtrr_get_guest_memory_typeXiao Guangrong1-45/+49
- kvm_mtrr_get_guest_memory_type() only checks one page in MTRRs so that it's unnecessary to check to see if the range is partially covered in MTRR - optimize the check of overlap memory type and add some comments to explain the precedence Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19KVM: MTRR: do not split 64 bits MSR contentXiao Guangrong2-23/+16
Variable MTRR MSRs are 64 bits which are directly accessed with full length, no reason to split them to two 32 bits Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19KVM: MTRR: clean up mtrr default typeXiao Guangrong2-14/+29
Drop kvm_mtrr->enable, omit the decode/code workload and get rid of all the hard code Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19KVM: MTRR: exactly define the size of variable MTRRsXiao Guangrong1-1/+1
Only KVM_NR_VAR_MTRR variable MTRRs are available in KVM guest Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19KVM: MTRR: remove mtrr_state.have_fixedXiao Guangrong3-6/+11
vMTRR does not depend on any host MTRR feature and fixed MTRRs have always been implemented, so drop this field Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19KVM: MTRR: handle MSR_MTRRcap in kvm_mtrr_get_msrXiao Guangrong2-2/+12
MSR_MTRRcap is a MTRR msr so move the handler to the common place, also add some comments to make the hard code more readable Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19KVM: x86: move MTRR related code to a separate fileXiao Guangrong7-318/+342
MTRR code locates in x86.c and mmu.c so that move them to a separate file to make the organization more clearer and it will be the place where we fully implement vMTRR Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19KVM: x86: fix CR0.CD virtualizationXiao Guangrong2-10/+26
Currently, CR0.CD is not checked when we virtualize memory cache type for noncoherent_dma guests, this patch fixes it by : - setting UC for all memory if CR0.CD = 1 - zapping all the last sptes in MMU if CR0.CD is changed Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19KVM: nSVM: Check for NRIPS support before updating control fieldBandan Das1-2/+6
If hardware doesn't support DecodeAssist - a feature that provides more information about the intercept in the VMCB, KVM decodes the instruction and then updates the next_rip vmcb control field. However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS]. Since skip_emulated_instruction() doesn't verify nrip support before accepting control.next_rip as valid, avoid writing this field if support isn't present. Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19KVM: fix checkpatch.pl errors in kvm/coalesced_mmio.hKevin Mulvey1-2/+2
Tabs rather than spaces Signed-off-by: Kevin Mulvey <kmulvey@linux.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19KVM: fix checkpatch.pl errors in kvm/async_pf.hKevin Mulvey1-2/+2
fix brace spacing Signed-off-by: Kevin Mulvey <kmulvey@linux.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19kvm: irqchip: Break up high order allocations of kvm_irq_routing_tableJoerg Roedel1-8/+33
The allocation size of the kvm_irq_routing_table depends on the number of irq routing entries because they are all allocated with one kzalloc call. When the irq routing table gets bigger this requires high order allocations which fail from time to time: qemu-kvm: page allocation failure: order:4, mode:0xd0 This patch fixes this issue by breaking up the allocation of the table and its entries into individual kzalloc calls. These could all be satisfied with order-0 allocations, which are less likely to fail. The downside of this change is the lower performance, because of more calls to kzalloc. But given how often kvm_set_irq_routing is called in the lifetime of a guest, it doesn't really matter much. Signed-off-by: Joerg Roedel <jroedel@suse.de> [Avoid sparse warning through rcu_access_pointer. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19Merge tag 'kvm-arm-for-4.2' of ↵Paolo Bonzini15-55/+104
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM changes for v4.2: - Proper guest time accounting - FP access fix for 32bit - The usual pile of GIC fixes - PSCI fixes - Random cleanups
2015-06-18KVM: arm/arm64: vgic: Remove useless arm-gic.h #includeMarc Zyngier1-2/+0
Back in the days, vgic.c used to have an intimate knowledge of the actual GICv2. These days, this has been abstracted away into hardware-specific backends. Remove the now useless arm-gic.h #include directive, making it clear that GICv2 specific code doesn't belong here. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-17KVM: arm/arm64: vgic: Avoid injecting reserved IRQ numbersMarc Zyngier1-4/+1
Commit fd1d0ddf2ae9 (KVM: arm/arm64: check IRQ number on userland injection) rightly limited the range of interrupts userspace can inject in a guest, but failed to consider the (unlikely) case where a guest is configured with 1024 interrupts. In this case, interrupts ranging from 1020 to 1023 are unuseable, as they have a special meaning for the GIC CPU interface. Make sure that these number cannot be used as an IRQ. Also delete a redundant (and similarily buggy) check in kvm_set_irq. Reported-by: Peter Maydell <peter.maydell@linaro.org> Cc: Andre Przywara <andre.przywara@arm.com> Cc: <stable@vger.kernel.org> # 4.1, 4.0, 3.19, 3.18 Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-17arm/arm64: KVM: vgic: Do not save GICH_HCR / ICH_HCR_EL2Marc Zyngier3-8/+0
The GIC Hypervisor Configuration Register is used to enable the delivery of virtual interupts to a guest, as well as to define in which conditions maintenance interrupts are delivered to the host. This register doesn't contain any information that we need to read back (the EOIcount is utterly useless for us). So let's save ourselves some cycles, and not save it before writing zero to it. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-17KVM: arm: vgic: Drop useless Group0 warningMarc Zyngier1-2/+0
If a GICv3-enabled guest tries to configure Group0, we print a warning on the console (because we don't support Group0 interrupts). This is fairly pointless, and would allow a guest to spam the console. Let's just drop the warning. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-17ARM: kvm: psci: fix handling of unimplemented functionsLorenzo Pieralisi1-13/+3
According to the PSCI specification and the SMC/HVC calling convention, PSCI function_ids that are not implemented must return NOT_SUPPORTED as return value. Current KVM implementation takes an unhandled PSCI function_id as an error and injects an undefined instruction into the guest if PSCI implementation is called with a function_id that is not handled by the resident PSCI version (ie it is not implemented), which is not the behaviour expected by a guest when calling a PSCI function_id that is not implemented. This patch fixes this issue by returning NOT_SUPPORTED whenever the kvm PSCI call is executed for a function_id that is not implemented by the PSCI kvm layer. Cc: <stable@vger.kernel.org> # 3.18+ Cc: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-17KVM: arm64: fix misleading comments in save/restoreAlex Bennée1-4/+4
The elr_el2 and spsr_el2 registers in fact contain the processor state before entry into EL2. In the case of guest state it could be in either el0 or el1. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-17KVM: arm/arm64: Enable the KVM-VFIO deviceKim Phillips4-2/+4
The KVM-VFIO device is used by the QEMU VFIO device. It is used to record the list of in-use VFIO groups so that KVM can manipulate them. Signed-off-by: Kim Phillips <kim.phillips@linaro.org> Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-17arm/arm64: KVM: Properly account for guest CPU timeChristoffer Dall1-4/+17
Until now we have been calling kvm_guest_exit after re-enabling interrupts when we come back from the guest, but this has the unfortunate effect that CPU time accounting done in the context of timer interrupts occurring while the guest is running doesn't properly notice that the time since the last tick was spent in the guest. Inspired by the comment in the x86 code, move the kvm_guest_exit() call below the local_irq_enable() call and change __kvm_guest_exit() to kvm_guest_exit(), because we are now calling this function with interrupts enabled. We have to now explicitly disable preemption and not enable preemption before we've called kvm_guest_exit(), since otherwise we could be preempted and everything happening before we eventually get scheduled again would be accounted for as guest time. At the same time, move the trace_kvm_exit() call outside of the atomic section, since there is no reason for us to do that with interrupts disabled. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-17kvm: remove one useless check extensionTiejun Chen2-2/+1
We already check KVM_CAP_IRQFD in generic once enable CONFIG_HAVE_KVM_IRQFD, kvm_vm_ioctl_check_extension_generic() | + switch (arg) { + ... + #ifdef CONFIG_HAVE_KVM_IRQFD + case KVM_CAP_IRQFD: + #endif + ... + return 1; + ... + } | + kvm_vm_ioctl_check_extension() So its not necessary to check this in arch again, and also fix one typo, s/emlation/emulation. Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-06-17arm: KVM: force execution of HCPTR access on VM exitMarc Zyngier2-8/+22
On VM entry, we disable access to the VFP registers in order to perform a lazy save/restore of these registers. On VM exit, we restore access, test if we did enable them before, and save/restore the guest/host registers if necessary. In this sequence, the FPEXC register is always accessed, irrespective of the trapping configuration. If the guest didn't touch the VFP registers, then the HCPTR access has now enabled such access, but we're missing a barrier to ensure architectural execution of the new HCPTR configuration. If the HCPTR access has been delayed/reordered, the subsequent access to FPEXC will cause a trap, which we aren't prepared to handle at all. The same condition exists when trapping to enable VFP for the guest. The fix is to introduce a barrier after enabling VFP access. In the vmexit case, it can be relaxed to only takes place if the guest hasn't accessed its view of the VFP registers, making the access to FPEXC safe. The set_hcptr macro is modified to deal with both vmenter/vmexit and vmtrap operations, and now takes an optional label that is branched to when the guest hasn't touched the VFP registers. Reported-by: Vikram Sethi <vikrams@codeaurora.org> Cc: stable@kernel.org # v3.9+ Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-09KVM: arm64: add active register handling to GICv3 emulation as wellAndre Przywara1-4/+50
Commit 47a98b15ba7c ("arm/arm64: KVM: support for un-queuing active IRQs") introduced handling of the GICD_I[SC]ACTIVER registers, but only for the GICv2 emulation. For the sake of completeness and as this is a pre-requisite for save/restore of the GICv3 distributor state, we should also emulate their handling in the distributor and redistributor frames of an emulated GICv3. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-09ARM: KVM: Remove pointless void pointer castFiro Yang1-2/+2
No need to cast the void pointer returned by kmalloc() in arch/arm/kvm/mmu.c::kvm_alloc_stage2_pgd(). Signed-off-by: Firo Yang <firogm@gmail.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-06-05KVM: x86: mark legacy PCI device assignment as deprecatedPaolo Bonzini2-8/+8
Follow up to commit e194bbdf362ba7d53cfd23ba24f1a7c90ef69a74. Suggested-by: Bandan Das <bsd@redhat.com> Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05KVM: x86: advertise KVM_CAP_X86_SMMPaolo Bonzini4-0/+28
... and we're done. :) Because SMBASE is usually relocated above 1M on modern chipsets, and SMM handlers might indeed rely on 4G segment limits, we only expose it if KVM is able to run the guest in big real mode. This includes any of VMX+emulate_invalid_guest_state, VMX+unrestricted_guest, or SVM. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05KVM: x86: add SMM to the MMU role, support SMRAM address spacePaolo Bonzini5-17/+42
This is now very simple to do. The only interesting part is a simple trick to find the right memslot in gfn_to_rmap, retrieving the address space from the spte role word. The same trick is used in the auditing code. The comment on top of union kvm_mmu_page_role has been stale forever, so remove it. Speaking of stale code, remove pad_for_nice_hex_output too: it was splitting the "access" bitfield across two bytes and thus had effectively turned into pad_for_ugly_hex_output. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05KVM: x86: work on all available address spacesPaolo Bonzini4-44/+91
This patch has no semantic change, but it prepares for the introduction of a second address space for system management mode. A new function x86_set_memory_region (and the "slots_lock taken" counterpart __x86_set_memory_region) is introduced in order to operate on all address spaces when adding or deleting private memory slots. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05KVM: x86: use vcpu-specific functions to read/write/translate GFNsPaolo Bonzini7-80/+80
We need to hide SMRAM from guests not running in SMM. Therefore, all uses of kvm_read_guest* and kvm_write_guest* must be changed to check whether the VCPU is in system management mode and use a different set of memslots. Switch from kvm_* to the newly-introduced kvm_vcpu_*, which call into kvm_arch_vcpu_memslots_id. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05KVM: x86: pass struct kvm_mmu_page to gfn_to_rmapPaolo Bonzini2-7/+11
This is always available (with one exception in the auditing code), and with the same auditing exception the level was coming from sp->role.level. Later, the spte's role will also be used to look up the right memslots array. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05KVM: implement multiple address spacesPaolo Bonzini5-32/+79
Only two ioctls have to be modified; the address space id is placed in the higher 16 bits of their slot id argument. As of this patch, no architecture defines more than one address space; x86 will be the first. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05KVM: add vcpu-specific functions to read/write/translate GFNsPaolo Bonzini2-13/+178
We need to hide SMRAM from guests not running in SMM. Therefore, all uses of kvm_read_guest* and kvm_write_guest* must be changed to use different address spaces, depending on whether the VCPU is in system management mode. We need to introduce a new family of functions for this purpose. For now, the VCPU-based functions have the same behavior as the existing per-VM ones, they just accept a different type for the first argument. Later however they will be changed to use one of many "struct kvm_memslots" stored in struct kvm, through an architecture hook. VM-based functions will unconditionally use the first memslots pointer. Whenever possible, this patch introduces slot-based functions with an __ prefix, with two wrappers for generic and vcpu-based actions. The exceptions are kvm_read_guest and kvm_write_guest, which are copied into the new functions kvm_vcpu_read_guest and kvm_vcpu_write_guest. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04KVM: x86: save/load state on SMM switchPaolo Bonzini4-2/+498
The big ugly one. This patch adds support for switching in and out of system management mode, respectively upon receiving KVM_REQ_SMI and upon executing a RSM instruction. Both 32- and 64-bit formats are supported for the SMM state save area. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04KVM: x86: latch INITs while in system management modePaolo Bonzini2-1/+15
Do not process INITs immediately while in system management mode, keep it instead in apic->pending_events. Tell userspace if an INIT is pending when they issue GET_VCPU_EVENTS, and similarly handle the new field in SET_VCPU_EVENTS. Note that the same treatment should be done while in VMX non-root mode. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04KVM: x86: stubs for SMM supportPaolo Bonzini7-4/+81
This patch adds the interface between x86.c and the emulator: the SMBASE register, a new emulator flag, the RSM instruction. It also adds a new request bit that will be used by the KVM_SMI ioctl. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04KVM: x86: API changes for SMM supportPaolo Bonzini7-10/+99
This patch includes changes to the external API for SMM support. Userspace can predicate the availability of the new fields and ioctls on a new capability, KVM_CAP_X86_SMM, which is added at the end of the patch series. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04KVM: x86: pass the whole hflags field to emulator and backPaolo Bonzini3-5/+16
The hflags field will contain information about system management mode and will be useful for the emulator. Pass the entire field rather than just the guest-mode information. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04KVM: x86: pass host_initiated to functions that read MSRsPaolo Bonzini4-101/+127
SMBASE is only readable from SMM for the VCPU, but it must be always accessible if userspace is accessing it. Thus, all functions that read MSRs are changed to accept a struct msr_data; the host_initiated and index fields are pre-initialized, while the data field is filled on return. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04KVM: x86: introduce num_emulated_msrsPaolo Bonzini1-13/+27
We will want to filter away MSR_IA32_SMBASE from the emulated_msrs if the host CPU does not support SMM virtualization. Introduce the logic to do that, and also move paravirt MSRs to emulated_msrs for simplicity and to get rid of KVM_SAVE_MSRS_BEGIN. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04KVM: x86: clear hidden CPU state at reset timePaolo Bonzini1-0/+2
This was noticed by Radim while reviewing the implementation of system management mode. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04kvm: x86: fix kvm_apic_has_events to check for NULL pointerPaolo Bonzini1-1/+1
Malicious (or egregiously buggy) userspace can trigger it, but it should never happen in normal operation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-04kvm: x86: default legacy PCI device assignment support to "n"Paolo Bonzini1-3/+4
VFIO has proved itself a much better option than KVM's built-in device assignment. It is mature, provides better isolation because it enforces ACS, and even the userspace code is being tested on a wider variety of hardware these days than the legacy support. Disable legacy device assignment by default. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-03Merge tag 'kvm-s390-next-20150602' of ↵Paolo Bonzini1-3/+8
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next KVM: s390: Fix and cleanup for 4.2 (kvm/next) One small fix for a commit targetted for 4.2 and one cleanup regarding our printks.
2015-06-02KVM: s390: introduce KMSG_COMPONENT for kvm-s390David Hildenbrand1-2/+6
Let's remove "kvm-s390" from our printk messages and make use of pr_fmt instead. Also replace one printk() occurrence by a equivalent pr_warn on the way. Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-06-02KVM: s390: call exit_sie() directly on vcpu block/requestDavid Hildenbrand1-1/+2
Thinking about it, I can't find a real use case where we want to block a VCPU and not kick it out of SIE. (except if we want to do the same in batch for multiple VCPUs - but that's a micro optimization) So let's simply perform the exit_sie() calls directly when setting the other magic block bits in the SIE. Otherwise e.g. kvm_s390_set_tod_low() still has other VCPUs running after that call, working with a wrong epoch. Fixes: 27406cd50c ("KVM: s390: provide functions for blocking all CPUs") Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-05-29KVM: x86: zero kvmclock_offset when vcpu0 initializes kvmclock system MSRMarcelo Tosatti1-0/+4
Initialize kvmclock base, on kvmclock system MSR write time, so that the guest sees kvmclock counting from zero. This matches baremetal behaviour when kvmclock in guest sets sched clock stable. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> [Remove unnecessary comment. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-05-29x86: kvmclock: set scheduler clock stableLuiz Capitulino1-2/+12
If you try to enable NOHZ_FULL on a guest today, you'll get the following error when the guest tries to deactivate the scheduler tick: WARNING: CPU: 3 PID: 2182 at kernel/time/tick-sched.c:192 can_stop_full_tick+0xb9/0x290() NO_HZ FULL will not work with unstable sched clock CPU: 3 PID: 2182 Comm: kworker/3:1 Not tainted 4.0.0-10545-gb9bb6fb #204 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: events flush_to_ldisc ffffffff8162a0c7 ffff88011f583e88 ffffffff814e6ba0 0000000000000002 ffff88011f583ed8 ffff88011f583ec8 ffffffff8104d095 ffff88011f583eb8 0000000000000000 0000000000000003 0000000000000001 0000000000000001 Call Trace: <IRQ> [<ffffffff814e6ba0>] dump_stack+0x4f/0x7b [<ffffffff8104d095>] warn_slowpath_common+0x85/0xc0 [<ffffffff8104d146>] warn_slowpath_fmt+0x46/0x50 [<ffffffff810bd2a9>] can_stop_full_tick+0xb9/0x290 [<ffffffff810bd9ed>] tick_nohz_irq_exit+0x8d/0xb0 [<ffffffff810511c5>] irq_exit+0xc5/0x130 [<ffffffff814f180a>] smp_apic_timer_interrupt+0x4a/0x60 [<ffffffff814eff5e>] apic_timer_interrupt+0x6e/0x80 <EOI> [<ffffffff814ee5d1>] ? _raw_spin_unlock_irqrestore+0x31/0x60 [<ffffffff8108bbc8>] __wake_up+0x48/0x60 [<ffffffff8134836c>] n_tty_receive_buf_common+0x49c/0xba0 [<ffffffff8134a6bf>] ? tty_ldisc_ref+0x1f/0x70 [<ffffffff81348a84>] n_tty_receive_buf2+0x14/0x20 [<ffffffff8134b390>] flush_to_ldisc+0xe0/0x120 [<ffffffff81064d05>] process_one_work+0x1d5/0x540 [<ffffffff81064c81>] ? process_one_work+0x151/0x540 [<ffffffff81065191>] worker_thread+0x121/0x470 [<ffffffff81065070>] ? process_one_work+0x540/0x540 [<ffffffff8106b4df>] kthread+0xef/0x110 [<ffffffff8106b3f0>] ? __kthread_parkme+0xa0/0xa0 [<ffffffff814ef4f2>] ret_from_fork+0x42/0x70 [<ffffffff8106b3f0>] ? __kthread_parkme+0xa0/0xa0 ---[ end trace 06e3507544a38866 ]--- However, it turns out that kvmclock does provide a stable sched_clock callback. So, let the scheduler know this which in turn makes NOHZ_FULL work in the guest. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-05-29x86: kvmclock: add flag to indicate pvclock counts from zeroMarcelo Tosatti1-0/+1
Setting sched clock stable for kvmclock causes the printk timestamps to not start from zero, which is different from baremetal and can possibly break userspace. Add a flag to indicate that hypervisor sets clock base at zero when kvmclock is initialized. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-05-28arch/x86/kvm/mmu.c: work around gcc-4.4.4 bugAndrew Morton1-7/+7
arch/x86/kvm/mmu.c: In function 'kvm_mmu_pte_write': arch/x86/kvm/mmu.c:4256: error: unknown field 'cr0_wp' specified in initializer arch/x86/kvm/mmu.c:4257: error: unknown field 'cr4_pae' specified in initializer arch/x86/kvm/mmu.c:4257: warning: excess elements in union initializer ... gcc-4.4.4 (at least) has issues when using anonymous unions in initializers. Fixes: edc90b7dc4ceef6 ("KVM: MMU: fix SMAP virtualization") Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>