linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2012-10-05	PPC: select EPAPR_PARAVIRT for all users of epapr hcalls	Stuart Yoder	1	-0/+1
	Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05	KVM: PPC: ev_idle hcall support for e500 guests	Liu Yu-B13201	3	-6/+44
	Signed-off-by: Liu Yu <yu.liu@freescale.com> [varun: 64-bit changes] Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05	KVM: PPC: Add support for ePAPR idle hcall in host kernel	Liu Yu-B13201	2	-2/+9
	And add a new flag definition in kvm_ppc_pvinfo to indicate whether the host supports the EV_IDLE hcall. Signed-off-by: Liu Yu <yu.liu@freescale.com> [stuart.yoder@freescale.com: cleanup,fixes for conditions allowing idle] Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> [agraf: fix typo] Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05	KVM: PPC: add pvinfo for hcall opcodes on e500mc/e5500	Stuart Yoder	1	-1/+9
	Signed-off-by: Liu Yu <yu.liu@freescale.com> [stuart: factored this out from idle hcall support in host patch] Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05	KVM: PPC: use definitions in epapr header for hcalls	Stuart Yoder	3	-16/+17
	Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05	PPC: epapr: create define for return code value of success	Stuart Yoder	1	-1/+2
	Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-09-27	KVM: s390: Fix vcpu_load handling in interrupt code	Christian Borntraeger	1	-2/+0
	Recent changes (KVM: make processes waiting on vcpu mutex killable) now requires to check the return value of vcpu_load. This triggered a warning in s390 specific kvm code. Turns out that we can actually remove the put/load, since schedule will do the right thing via the preempt notifiers. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-23	KVM: x86: Fix guest debug across vcpu INIT reset	Jan Kiszka	4	-43/+24
	If we reset a vcpu on INIT, we so far overwrote dr7 as provided by KVM_SET_GUEST_DEBUG, and we also cleared switch_db_regs unconditionally. Fix this by saving the dr7 used for guest debugging and calculating the effective register value as well as switch_db_regs on any potential change. This will change to focus of the set_guest_debug vendor op to update_dp_bp_intercept. Found while trying to stop on start_secondary. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-23	KVM: Add resampling irqfds for level triggered interrupts	Alex Williamson	1	-0/+4
	To emulate level triggered interrupts, add a resample option to KVM_IRQFD. When specified, a new resamplefd is provided that notifies the user when the irqchip has been resampled by the VM. This may, for instance, indicate an EOI. Also in this mode, posting of an interrupt through an irqfd only asserts the interrupt. On resampling, the interrupt is automatically de-asserted prior to user notification. This enables level triggered interrupts to be posted and re-enabled from vfio with no userspace intervention. All resampling irqfds can make use of a single irq source ID, so we reserve a new one for this interface. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: optimize apic interrupt delivery	Gleb Natapov	4	-12/+193
	Most interrupt are delivered to only one vcpu. Use pre-build tables to find interrupt destination instead of looping through all vcpus. In case of logical mode loop only through vcpus in a logical cluster irq is sent to. Signed-off-by: Gleb Natapov <gleb@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: MMU: Eliminate pointless temporary 'ac'	Avi Kivity	1	-4/+1
	'ac' essentially reconstructs the 'access' variable we already have, except for the PFERR_PRESENT_MASK and PFERR_RSVD_MASK. As these are not used by callees, just use 'access' directly. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: MMU: Avoid access/dirty update loop if all is well	Avi Kivity	1	-6/+20
	Keep track of accessed/dirty bits; if they are all set, do not enter the accessed/dirty update loop. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: MMU: Eliminate eperm temporary	Avi Kivity	1	-4/+1
	'eperm' is no longer used in the walker loop, so we can eliminate it. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: MMU: Optimize is_last_gpte()	Avi Kivity	4	-20/+41
	Instead of branchy code depending on level, gpte.ps, and mmu configuration, prepare everything in a bitmap during mode changes and look it up during runtime. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: MMU: Simplify walk_addr_generic() loop	Avi Kivity	1	-35/+25
	The page table walk is coded as an infinite loop, with a special case on the last pte. Code it as an ordinary loop with a termination condition on the last pte (large page or walk length exhausted), and put the last pte handling code after the loop where it belongs. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: MMU: Optimize pte permission checks	Avi Kivity	5	-36/+61
	walk_addr_generic() permission checks are a maze of branchy code, which is performed four times per lookup. It depends on the type of access, efer.nxe, cr0.wp, cr4.smep, and in the near future, cr4.smap. Optimize this away by precalculating all variants and storing them in a bitmap. The bitmap is recalculated when rarely-changing variables change (cr0, cr4) and is indexed by the often-changing variables (page fault error code, pte access permissions). The permission check is moved to the end of the loop, otherwise an SMEP fault could be reported as a false positive, when PDE.U=1 but PTE.U=0. Noted by Xiao Guangrong. The result is short, branch-free code. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: MMU: Update accessed and dirty bits after guest pagetable walk	Avi Kivity	1	-29/+47
	While unspecified, the behaviour of Intel processors is to first perform the page table walk, then, if the walk was successful, to atomically update the accessed and dirty bits of walked paging elements. While we are not required to follow this exactly, doing so will allow us to perform the access permissions check after the walk is complete, rather than after each walk step. (the tricky case is SMEP: a zero in any pte's U bit makes the referenced page a supervisor page, so we can't fault on a one bit during the walk itself). Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: MMU: Move gpte_access() out of paging_tmpl.h	Avi Kivity	2	-16/+15
	We no longer rely on paging_tmpl.h defines; so we can move the function to mmu.c. Rely on zero extension to 64 bits to get the correct nx behaviour. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: MMU: Optimize gpte_access() slightly	Avi Kivity	1	-3/+1
	If nx is disabled, then is gpte[63] is set we will hit a reserved bit set fault before checking permissions; so we can ignore the setting of efer.nxe. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20	KVM: MMU: Push clean gpte write protection out of gpte_access()	Avi Kivity	3	-13/+26
	gpte_access() computes the access permissions of a guest pte and also write-protects clean gptes. This is wrong when we are servicing a write fault (since we'll be setting the dirty bit momentarily) but correct when instantiating a speculative spte, or when servicing a read fault (since we'll want to trap a following write in order to set the dirty bit). It doesn't seem to hurt in practice, but in order to make the code readable, push the write protection out of gpte_access() and into a new protect_clean_gpte() which is called explicitly when needed. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-17	KVM: make processes waiting on vcpu mutex killable	Michael S. Tsirkin	1	-3/+9
	vcpu mutex can be held for unlimited time so taking it with mutex_lock on an ioctl is wrong: one process could be passed a vcpu fd and call this ioctl on the vcpu used by another process, it will then be unkillable until the owner exits. Call mutex_lock_killable instead and return status. Note: mutex_lock_interruptible would be even nicer, but I am not sure all users are prepared to handle EINTR from these ioctls. They might misinterpret it as an error. Cleanup paths expect a vcpu that can't be used by any userspace so this will always succeed - catch bugs by calling BUG_ON. Catch callers that don't check return state by adding __must_check. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-09-17	KVM: SVM: Make use of asm.h	Avi Kivity	1	-26/+20
	Use macros for bitness-insensitive register names, instead of rolling our own. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-09-17	KVM: VMX: Make use of asm.h	Avi Kivity	1	-39/+30
	Use macros for bitness-insensitive register names, instead of rolling our own. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-09-17	KVM: VMX: Make lto-friendly	Avi Kivity	1	-6/+11
	LTO (link-time optimization) doesn't like local labels to be referred to from a different function, since the two functions may be built in separate compilation units. Use an external variable instead. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-09-12	KVM: x86: lapic: Clean up find_highest_vector() and count_vectors()	Takuya Yoshikawa	1	-12/+18
	find_highest_vector() and count_vectors(): - Instead of using magic values, define and use proper macros. find_highest_vector(): - Remove likely() which is there only for historical reasons and not doing correct branch predictions anymore. Using such heuristics to optimize this function is not worth it now. Let CPUs predict things instead. - Stop checking word[0] separately. This was only needed for doing likely() optimization. - Use for loop, not while, to iterate over the register array to make the code clearer. Note that we actually confirmed that the likely() did wrong predictions by inserting debug code. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-09-10	KVM: MMU: remove unnecessary check	Xiao Guangrong	1	-5/+0
	Checking the return of kvm_mmu_get_page is unnecessary since it is guaranteed by memory cache Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-10	KVM: Depend on HIGH_RES_TIMERS	Liu, Jinsong	1	-0/+1
	KVM lapic timer and tsc deadline timer based on hrtimer, setting a leftmost node to rb tree and then do hrtimer reprogram. If hrtimer not configured as high resolution, hrtimer_enqueue_reprogram do nothing and then make kvm lapic timer and tsc deadline timer fail. Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-06	KVM: use symbolic constant for nr interrupts	Michael S. Tsirkin	1	-2/+2
	interrupt_bitmap is KVM_NR_INTERRUPTS bits in size, so just use that instead of hard-coded constants and math. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-06	KVM: emulator: optimize "rep ins" handling	Gleb Natapov	2	-6/+31
	Optimize "rep ins" by allowing emulator to write back more than one datum at a time. Introduce new operand type OP_MEM_STR which tells writeback() that dst contains pointer to an array that should be written back as opposite to just one data element. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-06	KVM: emulator: string_addr_inc() cleanup	Gleb Natapov	1	-7/+4
	Remove unneeded segment argument. Address structure already has correct segment which was put there during decode. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-06	KVM: emulator: make x86 emulation modes enum instead of defines	Gleb Natapov	2	-13/+13
	Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-06	KVM: Provide userspace IO exit completion callback	Gleb Natapov	2	-37/+57
	Current code assumes that IO exit was due to instruction emulation and handles execution back to emulator directly. This patch adds new userspace IO exit completion callback that can be set by any other code that caused IO exit to userspace. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-06	KVM: move postcommit flush to x86, as mmio sptes are x86 specific	Marcelo Tosatti	1	-0/+8
	Other arches do not need this. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> v2: fix incorrect deletion of mmio sptes on gpa move (noticed by Takuya) Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-06	KVM: split kvm_arch_flush_shadow	Marcelo Tosatti	4	-4/+25
	Introducing kvm_arch_flush_shadow_memslot, to invalidate the translations of a single memory slot. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-05	KVM: SVM: constify lookup tables	Mathias Krause	1	-4/+4
	We never modify direct_access_msrs[], msrpm_ranges[], svm_exit_handlers[] or x86_intercept_map[] at runtime. Mark them r/o. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-05	KVM: VMX: constify lookup tables	Mathias Krause	1	-7/+7
	We use vmcs_field_to_offset_table[], kvm_vmx_segment_fields[] and kvm_vmx_exit_handlers[] as lookup tables only -- make them r/o. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-05	KVM: x86: more constification	Mathias Krause	2	-2/+2
	Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-05	KVM: x86: constify read_write_emulator_ops	Mathias Krause	1	-4/+4
	We never change those, make them r/o. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-05	KVM: x86 emulator: constify emulate_ops	Mathias Krause	3	-13/+13
	We never change emulate_ops[] at runtime so it should be r/o. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-05	KVM: x86 emulator: mark opcode tables const	Mathias Krause	1	-20/+20
	The opcode tables never change at runtime, therefor mark them const. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-05	KVM: x86 emulator: use aligned variants of SSE register ops	Mathias Krause	1	-32/+32
	As the the compiler ensures that the memory operand is always aligned to a 16 byte memory location, use the aligned variant of MOVDQ for read_sse_reg() and write_sse_reg(). Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-05	KVM: x86: minor size optimization	Mathias Krause	1	-6/+6
	Some fields can be constified and/or made static to reduce code and data size. Numbers for a 32 bit build: text data bss dec hex filename before: 3351 80 0 3431 d67 cpuid.o after: 3391 0 0 3391 d3f cpuid.o Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-04	KVM: cleanup pic reset	Gleb Natapov	1	-41/+11
	kvm_pic_reset() is not used anywhere. Move reset logic from pic_ioport_write() there. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-30	KVM: x86: remove unused variable from kvm_task_switch()	Marcelo Tosatti	1	-1/+0
	Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-08-27	KVM: VMX: Ignore segment G and D bits when considering whether we can virtualize	Avi Kivity	1	-1/+1
	We will enter the guest with G and D cleared; as real hardware ignores D in real mode, and G is taken care of by the limit test, we allow more code to run in vm86 mode. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-08-27	KVM: VMX: Save all segment data in real mode	Avi Kivity	1	-0/+1
	Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-08-27	KVM: VMX: Preserve segment limit and access rights in real mode	Avi Kivity	1	-0/+3
	While this is undocumented, real processors do not reload the segment limit and access rights when loading a segment register in real mode. Real programs rely on it so we need to comply with this behaviour. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-08-27	KVM: VMX: Return real real-mode segment data even if ↵	Avi Kivity	1	-2/+1
	emulate_invalid_guest_state=1 emulate_invalid_guest_state=1 doesn't mean we don't munge the segments in the vmcs; we do. So we need to return the real ones (maintained by vmx_set_segment). Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-08-27	KVM: x86 emulator: Fix #GP error code during linearization	Avi Kivity	1	-2/+2
	We want the segment selector, nor segment number. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-08-27	KVM: x86 emulator: Check segment limits in real mode too	Avi Kivity	1	-3/+4
	Segment limits are verified in real mode, not just protected mode. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>