diff options
Diffstat (limited to 'Documentation/virt')
-rw-r--r-- | Documentation/virt/kvm/api.rst | 65 | ||||
-rw-r--r-- | Documentation/virt/kvm/devices/s390_flic.rst | 11 | ||||
-rw-r--r-- | Documentation/virt/kvm/index.rst | 2 | ||||
-rw-r--r-- | Documentation/virt/kvm/s390-pv-boot.rst | 84 | ||||
-rw-r--r-- | Documentation/virt/kvm/s390-pv.rst | 116 |
5 files changed, 267 insertions, 11 deletions
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index b7d4180605ab..158d1186d103 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -2110,7 +2110,8 @@ Errors: ====== ============================================================ ENOENT no such register - EINVAL invalid register ID, or no such register + EINVAL invalid register ID, or no such register or used with VMs in + protected virtualization mode on s390 EPERM (arm64) register access not allowed before vcpu finalization ====== ============================================================ @@ -2545,7 +2546,8 @@ Errors include: ======== ============================================================ ENOENT no such register - EINVAL invalid register ID, or no such register + EINVAL invalid register ID, or no such register or used with VMs in + protected virtualization mode on s390 EPERM (arm64) register access not allowed before vcpu finalization ======== ============================================================ @@ -4635,6 +4637,54 @@ the clear cpu reset definition in the POP. However, the cpu is not put into ESA mode. This reset is a superset of the initial reset. +4.125 KVM_S390_PV_COMMAND +------------------------- + +:Capability: KVM_CAP_S390_PROTECTED +:Architectures: s390 +:Type: vm ioctl +:Parameters: struct kvm_pv_cmd +:Returns: 0 on success, < 0 on error + +:: + + struct kvm_pv_cmd { + __u32 cmd; /* Command to be executed */ + __u16 rc; /* Ultravisor return code */ + __u16 rrc; /* Ultravisor return reason code */ + __u64 data; /* Data or address */ + __u32 flags; /* flags for future extensions. Must be 0 for now */ + __u32 reserved[3]; + }; + +cmd values: + +KVM_PV_ENABLE + Allocate memory and register the VM with the Ultravisor, thereby + donating memory to the Ultravisor that will become inaccessible to + KVM. All existing CPUs are converted to protected ones. After this + command has succeeded, any CPU added via hotplug will become + protected during its creation as well. + +KVM_PV_DISABLE + + Deregister the VM from the Ultravisor and reclaim the memory that + had been donated to the Ultravisor, making it usable by the kernel + again. All registered VCPUs are converted back to non-protected + ones. + +KVM_PV_VM_SET_SEC_PARMS + Pass the image header from VM memory to the Ultravisor in + preparation of image unpacking and verification. + +KVM_PV_VM_UNPACK + Unpack (protect and decrypt) a page of the encrypted boot image. + +KVM_PV_VM_VERIFY + Verify the integrity of the unpacked image. Only if this succeeds, + KVM is allowed to start protected VCPUs. + + 5. The kvm_run structure ======================== @@ -6025,3 +6075,14 @@ Architectures: s390 This capability indicates that the KVM_S390_NORMAL_RESET and KVM_S390_CLEAR_RESET ioctls are available. + +8.23 KVM_CAP_S390_PROTECTED + +Architecture: s390 + + +This capability indicates that the Ultravisor has been initialized and +KVM can therefore start protected VMs. +This capability governs the KVM_S390_PV_COMMAND ioctl and the +KVM_MP_STATE_LOAD MP_STATE. KVM_SET_MP_STATE can fail for protected +guests when the state change is invalid. diff --git a/Documentation/virt/kvm/devices/s390_flic.rst b/Documentation/virt/kvm/devices/s390_flic.rst index 954190da7d04..ea96559ba501 100644 --- a/Documentation/virt/kvm/devices/s390_flic.rst +++ b/Documentation/virt/kvm/devices/s390_flic.rst @@ -108,16 +108,9 @@ Groups: mask or unmask the adapter, as specified in mask KVM_S390_IO_ADAPTER_MAP - perform a gmap translation for the guest address provided in addr, - pin a userspace page for the translated address and add it to the - list of mappings - - .. note:: A new mapping will be created unconditionally; therefore, - the calling code should avoid making duplicate mappings. - + This is now a no-op. The mapping is purely done by the irq route. KVM_S390_IO_ADAPTER_UNMAP - release a userspace page for the translated address specified in addr - from the list of mappings + This is now a no-op. The mapping is purely done by the irq route. KVM_DEV_FLIC_AISM modify the adapter-interruption-suppression mode for a given isc if the diff --git a/Documentation/virt/kvm/index.rst b/Documentation/virt/kvm/index.rst index 774deaebf7fa..dcc252634cf9 100644 --- a/Documentation/virt/kvm/index.rst +++ b/Documentation/virt/kvm/index.rst @@ -18,6 +18,8 @@ KVM nested-vmx ppc-pv s390-diag + s390-pv + s390-pv-boot timekeeping vcpu-requests diff --git a/Documentation/virt/kvm/s390-pv-boot.rst b/Documentation/virt/kvm/s390-pv-boot.rst new file mode 100644 index 000000000000..8b8fa0390409 --- /dev/null +++ b/Documentation/virt/kvm/s390-pv-boot.rst @@ -0,0 +1,84 @@ +.. SPDX-License-Identifier: GPL-2.0 + +====================================== +s390 (IBM Z) Boot/IPL of Protected VMs +====================================== + +Summary +------- +The memory of Protected Virtual Machines (PVMs) is not accessible to +I/O or the hypervisor. In those cases where the hypervisor needs to +access the memory of a PVM, that memory must be made accessible. +Memory made accessible to the hypervisor will be encrypted. See +:doc:`s390-pv` for details." + +On IPL (boot) a small plaintext bootloader is started, which provides +information about the encrypted components and necessary metadata to +KVM to decrypt the protected virtual machine. + +Based on this data, KVM will make the protected virtual machine known +to the Ultravisor (UV) and instruct it to secure the memory of the +PVM, decrypt the components and verify the data and address list +hashes, to ensure integrity. Afterwards KVM can run the PVM via the +SIE instruction which the UV will intercept and execute on KVM's +behalf. + +As the guest image is just like an opaque kernel image that does the +switch into PV mode itself, the user can load encrypted guest +executables and data via every available method (network, dasd, scsi, +direct kernel, ...) without the need to change the boot process. + + +Diag308 +------- +This diagnose instruction is the basic mechanism to handle IPL and +related operations for virtual machines. The VM can set and retrieve +IPL information blocks, that specify the IPL method/devices and +request VM memory and subsystem resets, as well as IPLs. + +For PVMs this concept has been extended with new subcodes: + +Subcode 8: Set an IPL Information Block of type 5 (information block +for PVMs) +Subcode 9: Store the saved block in guest memory +Subcode 10: Move into Protected Virtualization mode + +The new PV load-device-specific-parameters field specifies all data +that is necessary to move into PV mode. + +* PV Header origin +* PV Header length +* List of Components composed of + * AES-XTS Tweak prefix + * Origin + * Size + +The PV header contains the keys and hashes, which the UV will use to +decrypt and verify the PV, as well as control flags and a start PSW. + +The components are for instance an encrypted kernel, kernel parameters +and initrd. The components are decrypted by the UV. + +After the initial import of the encrypted data, all defined pages will +contain the guest content. All non-specified pages will start out as +zero pages on first access. + + +When running in protected virtualization mode, some subcodes will result in +exceptions or return error codes. + +Subcodes 4 and 7, which specify operations that do not clear the guest +memory, will result in specification exceptions. This is because the +UV will clear all memory when a secure VM is removed, and therefore +non-clearing IPL subcodes are not allowed. + +Subcodes 8, 9, 10 will result in specification exceptions. +Re-IPL into a protected mode is only possible via a detour into non +protected mode. + +Keys +---- +Every CEC will have a unique public key to enable tooling to build +encrypted images. +See `s390-tools <https://github.com/ibm-s390-tools/s390-tools/>`_ +for the tooling. diff --git a/Documentation/virt/kvm/s390-pv.rst b/Documentation/virt/kvm/s390-pv.rst new file mode 100644 index 000000000000..774a8c606091 --- /dev/null +++ b/Documentation/virt/kvm/s390-pv.rst @@ -0,0 +1,116 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================================= +s390 (IBM Z) Ultravisor and Protected VMs +========================================= + +Summary +------- +Protected virtual machines (PVM) are KVM VMs that do not allow KVM to +access VM state like guest memory or guest registers. Instead, the +PVMs are mostly managed by a new entity called Ultravisor (UV). The UV +provides an API that can be used by PVMs and KVM to request management +actions. + +Each guest starts in non-protected mode and then may make a request to +transition into protected mode. On transition, KVM registers the guest +and its VCPUs with the Ultravisor and prepares everything for running +it. + +The Ultravisor will secure and decrypt the guest's boot memory +(i.e. kernel/initrd). It will safeguard state changes like VCPU +starts/stops and injected interrupts while the guest is running. + +As access to the guest's state, such as the SIE state description, is +normally needed to be able to run a VM, some changes have been made in +the behavior of the SIE instruction. A new format 4 state description +has been introduced, where some fields have different meanings for a +PVM. SIE exits are minimized as much as possible to improve speed and +reduce exposed guest state. + + +Interrupt injection +------------------- +Interrupt injection is safeguarded by the Ultravisor. As KVM doesn't +have access to the VCPUs' lowcores, injection is handled via the +format 4 state description. + +Machine check, external, IO and restart interruptions each can be +injected on SIE entry via a bit in the interrupt injection control +field (offset 0x54). If the guest cpu is not enabled for the interrupt +at the time of injection, a validity interception is recognized. The +format 4 state description contains fields in the interception data +block where data associated with the interrupt can be transported. + +Program and Service Call exceptions have another layer of +safeguarding; they can only be injected for instructions that have +been intercepted into KVM. The exceptions need to be a valid outcome +of an instruction emulation by KVM, e.g. we can never inject a +addressing exception as they are reported by SIE since KVM has no +access to the guest memory. + + +Mask notification interceptions +------------------------------- +KVM cannot intercept lctl(g) and lpsw(e) anymore in order to be +notified when a PVM enables a certain class of interrupt. As a +replacement, two new interception codes have been introduced: One +indicating that the contents of CRs 0, 6, or 14 have been changed, +indicating different interruption subclasses; and one indicating that +PSW bit 13 has been changed, indicating that a machine check +intervention was requested and those are now enabled. + +Instruction emulation +--------------------- +With the format 4 state description for PVMs, the SIE instruction already +interprets more instructions than it does with format 2. It is not able +to interpret every instruction, but needs to hand some tasks to KVM; +therefore, the SIE and the ultravisor safeguard emulation inputs and outputs. + +The control structures associated with SIE provide the Secure +Instruction Data Area (SIDA), the Interception Parameters (IP) and the +Secure Interception General Register Save Area. Guest GRs and most of +the instruction data, such as I/O data structures, are filtered. +Instruction data is copied to and from the SIDA when needed. Guest +GRs are put into / retrieved from the Secure Interception General +Register Save Area. + +Only GR values needed to emulate an instruction will be copied into this +save area and the real register numbers will be hidden. + +The Interception Parameters state description field still contains the +the bytes of the instruction text, but with pre-set register values +instead of the actual ones. I.e. each instruction always uses the same +instruction text, in order not to leak guest instruction text. +This also implies that the register content that a guest had in r<n> +may be in r<m> from the hypervisor's point of view. + +The Secure Instruction Data Area contains instruction storage +data. Instruction data, i.e. data being referenced by an instruction +like the SCCB for sclp, is moved via the SIDA. When an instruction is +intercepted, the SIE will only allow data and program interrupts for +this instruction to be moved to the guest via the two data areas +discussed before. Other data is either ignored or results in validity +interceptions. + + +Instruction emulation interceptions +----------------------------------- +There are two types of SIE secure instruction intercepts: the normal +and the notification type. Normal secure instruction intercepts will +make the guest pending for instruction completion of the intercepted +instruction type, i.e. on SIE entry it is attempted to complete +emulation of the instruction with the data provided by KVM. That might +be a program exception or instruction completion. + +The notification type intercepts inform KVM about guest environment +changes due to guest instruction interpretation. Such an interception +is recognized, for example, for the store prefix instruction to provide +the new lowcore location. On SIE reentry, any KVM data in the data areas +is ignored and execution continues as if the guest instruction had +completed. For that reason KVM is not allowed to inject a program +interrupt. + +Links +----- +`KVM Forum 2019 presentation <https://static.sched.com/hosted_files/kvmforum2019/3b/ibm_protected_vms_s390x.pdf>`_ |