KVM: x86/xen: Compatibility fixes for shared runstate area

The guest runstate area can be arbitrarily byte-aligned. In fact, even when a sane 32-bit guest aligns the overall structure nicely, the 64-bit fields in the structure end up being unaligned due to the fact that the 32-bit ABI only aligns them to 32 bits. So setting the ->state_entry_time field to something|XEN_RUNSTATE_UPDATE is buggy, because if it's unaligned then we can't update the whole field atomically; the low bytes might be observable before the _UPDATE bit is. Xen actually updates the *byte* containing that top bit, on its own. KVM should do the same. In addition, we cannot assume that the runstate area fits within a single page. One option might be to make the gfn_to_pfn cache cope with regions that cross a page — but getting a contiguous virtual kernel mapping of a discontiguous set of IOMEM pages is a distinctly non-trivial exercise, and it seems this is the *only* current use case for the GPC which would benefit from it. An earlier version of the runstate code did use a gfn_to_hva cache for this purpose, but it still had the single-page restriction because it used the uhva directly — because it needs to be able to do so atomically when the vCPU is being scheduled out, so it used pagefault_disable() around the accesses and didn't just use kvm_write_guest_cached() which has a fallback path. So... use a pair of GPCs for the first and potential second page covering the runstate area. We can get away with locking both at once because nothing else takes more than one GPC lock at a time so we can invent a trivial ordering rule. The common case where it's all in the same page is kept as a fast path, but in both cases, the actual guest structure (compat or not) is built up from the fields in @vx, following preset pointers to the state and times fields. The only difference is whether those pointers point to the kernel stack (in the split case) or to guest memory directly via the GPC. The fast path is also fixed to use a byte access for the XEN_RUNSTATE_UPDATE bit, then the only real difference is the dual memcpy. Finally, Xen also does write the runstate area immediately when it's configured. Flip the kvm_xen_update_runstate() and …_guest() functions and call the latter directly when the runstate area is set. This means that other ioctls which modify the runstate also write it immediately to the guest when they do so, which is also intended. Update the xen_shinfo_test to exercise the pathological case where the XEN_RUNSTATE_UPDATE flag in the top byte of the state_entry_time is actually in a different page to the rest of the 64-bit word. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
author: David Woodhouse <dwmw@amazon.co.uk> 2022-11-18 14:32:38 +0000
committer: Paolo Bonzini <pbonzini@redhat.com> 2022-11-30 10:56:08 -0500
commit: 5ec3289b31ab9bb209be59cee360aac4b03f320a (patch)
tree: 6d2c28dbd35197645748218a0336d0d3cd7bcbb2 /arch/x86/include/asm/kvm_host.h
parent: 1e79a9e3ab96ecf8dbb8b6d237b3ae824bd79074 (diff)
download: linux-5ec3289b31ab9bb209be59cee360aac4b03f320a.tar.bz2
1 files changed, 1 insertions, 0 deletions
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d1013c4f673c..70af7240a1d5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -686,6 +686,7 @@ struct kvm_vcpu_xen {
 	struct gfn_to_pfn_cache vcpu_info_cache;
 	struct gfn_to_pfn_cache vcpu_time_info_cache;
 	struct gfn_to_pfn_cache runstate_cache;
+	struct gfn_to_pfn_cache runstate2_cache;
 	u64 last_steal;
 	u64 runstate_entry_time;
 	u64 runstate_times[4];
author	David Woodhouse <dwmw@amazon.co.uk>	2022-11-18 14:32:38 +0000
committer	Paolo Bonzini <pbonzini@redhat.com>	2022-11-30 10:56:08 -0500
commit	5ec3289b31ab9bb209be59cee360aac4b03f320a (patch)
tree	6d2c28dbd35197645748218a0336d0d3cd7bcbb2 /arch/x86/include/asm/kvm_host.h
parent	1e79a9e3ab96ecf8dbb8b6d237b3ae824bd79074 (diff)
download	linux-5ec3289b31ab9bb209be59cee360aac4b03f320a.tar.bz2