From b8b43a4c2e9f66481cc667007aac6d0372e9c4cb Mon Sep 17 00:00:00 2001
From: David Matlack
Date: Thu, 1 Dec 2022 11:52:48 -0800
Subject: KVM: Move halt-polling documentation into common directory

Move halt-polling.rst into the common KVM documentation directory and
out of the x86-specific directory. Halt-polling is a common feature and
the existing documentation is already written as such.

Signed-off-by: David Matlack
Message-Id: <20221201195249.3369720-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini
---
 Documentation/virt/kvm/halt-polling.rst     | 140 ++++++++++++++++++++++++++++
 Documentation/virt/kvm/index.rst            |   1 +
 Documentation/virt/kvm/x86/halt-polling.rst | 140 ----------------------------
 Documentation/virt/kvm/x86/index.rst        |   1 -
 4 files changed, 141 insertions(+), 141 deletions(-)
 create mode 100644 Documentation/virt/kvm/halt-polling.rst
 delete mode 100644 Documentation/virt/kvm/x86/halt-polling.rst

(limited to 'Documentation')

diff --git a/Documentation/virt/kvm/halt-polling.rst b/Documentation/virt/kvm/halt-polling.rst
new file mode 100644
index 000000000000..4922e4a15f18
--- /dev/null
+++ b/Documentation/virt/kvm/halt-polling.rst
@@ -0,0 +1,140 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================
+The KVM halt polling system
+===========================
+
+The KVM halt polling system provides a feature within KVM whereby the latency
+of a guest can, under some circumstances, be reduced by polling in the host
+for some time period after the guest has elected to no longer run by cedeing.
+That is, when a guest vcpu has ceded, or in the case of powerpc when all of the
+vcpus of a single vcore have ceded, the host kernel polls for wakeup conditions
+before giving up the cpu to the scheduler in order to let something else run.
+
+Polling provides a latency advantage in cases where the guest can be run again
+very quickly by at least saving us a trip through the scheduler, normally on
+the order of a few micro-seconds, although performance benefits are workload
+dependant. In the event that no wakeup source arrives during the polling
+interval or some other task on the runqueue is runnable the scheduler is
+invoked. Thus halt polling is especially useful on workloads with very short
+wakeup periods where the time spent halt polling is minimised and the time
+savings of not invoking the scheduler are distinguishable.
+
+The generic halt polling code is implemented in:
+
+        virt/kvm/kvm_main.c: kvm_vcpu_block()
+
+The powerpc kvm-hv specific case is implemented in:
+
+        arch/powerpc/kvm/book3s_hv.c: kvmppc_vcore_blocked()
+
+Halt Polling Interval
+=====================
+
+The maximum time for which to poll before invoking the scheduler, referred to
+as the halt polling interval, is increased and decreased based on the perceived
+effectiveness of the polling in an attempt to limit pointless polling.
+This value is stored in either the vcpu struct:
+
+        kvm_vcpu->halt_poll_ns
+
+or in the case of powerpc kvm-hv, in the vcore struct:
+
+        kvmppc_vcore->halt_poll_ns
+
+Thus this is a per vcpu (or vcore) value.
+
+During polling if a wakeup source is received within the halt polling interval,
+the interval is left unchanged. In the event that a wakeup source isn't
+received during the polling interval (and thus schedule is invoked) there are
+two options, either the polling interval and total block time[0] were less than
+the global max polling interval (see module params below), or the total block
+time was greater than the global max polling interval.
+
+In the event that both the polling interval and total block time were less than
+the global max polling interval then the polling interval can be increased in
+the hope that next time during the longer polling interval the wake up source
+will be received while the host is polling and the latency benefits will be
+received. The polling interval is grown in the function grow_halt_poll_ns() and
+is multiplied by the module parameters halt_poll_ns_grow and
+halt_poll_ns_grow_start.
+
+In the event that the total block time was greater than the global max polling
+interval then the host will never poll for long enough (limited by the global
+max) to wakeup during the polling interval so it may as well be shrunk in order
+to avoid pointless polling. The polling interval is shrunk in the function
+shrink_halt_poll_ns() and is divided by the module parameter
+halt_poll_ns_shrink, or set to 0 iff halt_poll_ns_shrink == 0.
+
+It is worth noting that this adjustment process attempts to hone in on some
+steady state polling interval but will only really do a good job for wakeups
+which come at an approximately constant rate, otherwise there will be constant
+adjustment of the polling interval.
+
+[0] total block time:
+                      the time between when the halt polling function is
+                      invoked and a wakeup source received (irrespective of
+                      whether the scheduler is invoked within that function).
+
+Module Parameters
+=================
+
+The kvm module has 3 tuneable module parameters to adjust the global max
+polling interval as well as the rate at which the polling interval is grown and
+shrunk. These variables are defined in include/linux/kvm_host.h and as module
+parameters in virt/kvm/kvm_main.c, or arch/powerpc/kvm/book3s_hv.c in the
+powerpc kvm-hv case.
+
++-----------------------+---------------------------+-------------------------+
+|Module Parameter       | Description               | Default Value           |
++-----------------------+---------------------------+-------------------------+
+|halt_poll_ns           | The global max polling    | KVM_HALT_POLL_NS_DEFAULT|
+|                       | interval which defines    |                         |
+|                       | the ceiling value of the  |                         |
+|                       | polling interval for      | (per arch value)        |
+|                       | each vcpu.                |                         |
++-----------------------+---------------------------+-------------------------+
+|halt_poll_ns_grow      | The value by which the    | 2                       |
+|                       | halt polling interval is  |                         |
+|                       | multiplied in the         |                         |
+|                       | grow_halt_poll_ns()       |                         |
+|                       | function.                 |                         |
++-----------------------+---------------------------+-------------------------+
+|halt_poll_ns_grow_start| The initial value to grow | 10000                   |
+|                       | to from zero in the       |                         |
+|                       | grow_halt_poll_ns()       |                         |
+|                       | function.                 |                         |
++-----------------------+---------------------------+-------------------------+
+|halt_poll_ns_shrink    | The value by which the    | 0                       |
+|                       | halt polling interval is  |                         |
+|                       | divided in the            |                         |
+|                       | shrink_halt_poll_ns()     |                         |
+|                       | function.                 |                         |
++-----------------------+---------------------------+-------------------------+
+
+These module parameters can be set from the debugfs files in:
+
+        /sys/module/kvm/parameters/
+
+Note: that these module parameters are system wide values and are not able to
+      be tuned on a per vm basis.
+
+Further Notes
+=============
+
+- Care should be taken when setting the halt_poll_ns module parameter as a large value
+  has the potential to drive the cpu usage to 100% on a machine which would be almost
+  entirely idle otherwise. This is because even if a guest has wakeups during which very
+  little work is done and which are quite far apart, if the period is shorter than the
+  global max polling interval (halt_poll_ns) then the host will always poll for the
+  entire block time and thus cpu utilisation will go to 100%.
+
+- Halt polling essentially presents a trade off between power usage and latency and
+  the module parameters should be used to tune the affinity for this. Idle cpu time is
+  essentially converted to host kernel time with the aim of decreasing latency when
+  entering the guest.
+
+- Halt polling will only be conducted by the host when no other tasks are runnable on
+  that cpu, otherwise the polling will cease immediately and schedule will be invoked to
+  allow that other task to run. Thus this doesn't allow a guest to denial of service the
+  cpu.
diff --git a/Documentation/virt/kvm/index.rst b/Documentation/virt/kvm/index.rst
index e0a2c74e1043..ad13ec55ddfe 100644
--- a/Documentation/virt/kvm/index.rst
+++ b/Documentation/virt/kvm/index.rst
@@ -17,4 +17,5 @@ KVM
 
    locking
    vcpu-requests
+   halt-polling
    review-checklist
diff --git a/Documentation/virt/kvm/x86/halt-polling.rst b/Documentation/virt/kvm/x86/halt-polling.rst
deleted file mode 100644
index 4922e4a15f18..000000000000
--- a/Documentation/virt/kvm/x86/halt-polling.rst
+++ /dev/null
@@ -1,140 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===========================
-The KVM halt polling system
-===========================
-
-The KVM halt polling system provides a feature within KVM whereby the latency
-of a guest can, under some circumstances, be reduced by polling in the host
-for some time period after the guest has elected to no longer run by cedeing.
-That is, when a guest vcpu has ceded, or in the case of powerpc when all of the
-vcpus of a single vcore have ceded, the host kernel polls for wakeup conditions
-before giving up the cpu to the scheduler in order to let something else run.
-
-Polling provides a latency advantage in cases where the guest can be run again
-very quickly by at least saving us a trip through the scheduler, normally on
-the order of a few micro-seconds, although performance benefits are workload
-dependant. In the event that no wakeup source arrives during the polling
-interval or some other task on the runqueue is runnable the scheduler is
-invoked. Thus halt polling is especially useful on workloads with very short
-wakeup periods where the time spent halt polling is minimised and the time
-savings of not invoking the scheduler are distinguishable.
-
-The generic halt polling code is implemented in:
-
-        virt/kvm/kvm_main.c: kvm_vcpu_block()
-
-The powerpc kvm-hv specific case is implemented in:
-
-        arch/powerpc/kvm/book3s_hv.c: kvmppc_vcore_blocked()
-
-Halt Polling Interval
-=====================
-
-The maximum time for which to poll before invoking the scheduler, referred to
-as the halt polling interval, is increased and decreased based on the perceived
-effectiveness of the polling in an attempt to limit pointless polling.
-This value is stored in either the vcpu struct:
-
-        kvm_vcpu->halt_poll_ns
-
-or in the case of powerpc kvm-hv, in the vcore struct:
-
-        kvmppc_vcore->halt_poll_ns
-
-Thus this is a per vcpu (or vcore) value.
-
-During polling if a wakeup source is received within the halt polling interval,
-the interval is left unchanged. In the event that a wakeup source isn't
-received during the polling interval (and thus schedule is invoked) there are
-two options, either the polling interval and total block time[0] were less than
-the global max polling interval (see module params below), or the total block
-time was greater than the global max polling interval.
-
-In the event that both the polling interval and total block time were less than
-the global max polling interval then the polling interval can be increased in
-the hope that next time during the longer polling interval the wake up source
-will be received while the host is polling and the latency benefits will be
-received. The polling interval is grown in the function grow_halt_poll_ns() and
-is multiplied by the module parameters halt_poll_ns_grow and
-halt_poll_ns_grow_start.
-
-In the event that the total block time was greater than the global max polling
-interval then the host will never poll for long enough (limited by the global
-max) to wakeup during the polling interval so it may as well be shrunk in order
-to avoid pointless polling. The polling interval is shrunk in the function
-shrink_halt_poll_ns() and is divided by the module parameter
-halt_poll_ns_shrink, or set to 0 iff halt_poll_ns_shrink == 0.
-
-It is worth noting that this adjustment process attempts to hone in on some
-steady state polling interval but will only really do a good job for wakeups
-which come at an approximately constant rate, otherwise there will be constant
-adjustment of the polling interval.
-
-[0] total block time:
-                      the time between when the halt polling function is
-                      invoked and a wakeup source received (irrespective of
-                      whether the scheduler is invoked within that function).
-
-Module Parameters
-=================
-
-The kvm module has 3 tuneable module parameters to adjust the global max
-polling interval as well as the rate at which the polling interval is grown and
-shrunk. These variables are defined in include/linux/kvm_host.h and as module
-parameters in virt/kvm/kvm_main.c, or arch/powerpc/kvm/book3s_hv.c in the
-powerpc kvm-hv case.
-
-+-----------------------+---------------------------+-------------------------+
-|Module Parameter       | Description               | Default Value           |
-+-----------------------+---------------------------+-------------------------+
-|halt_poll_ns           | The global max polling    | KVM_HALT_POLL_NS_DEFAULT|
-|                       | interval which defines    |                         |
-|                       | the ceiling value of the  |                         |
-|                       | polling interval for      | (per arch value)        |
-|                       | each vcpu.                |                         |
-+-----------------------+---------------------------+-------------------------+
-|halt_poll_ns_grow      | The value by which the    | 2                       |
-|                       | halt polling interval is  |                         |
-|                       | multiplied in the         |                         |
-|                       | grow_halt_poll_ns()       |                         |
-|                       | function.                 |                         |
-+-----------------------+---------------------------+-------------------------+
-|halt_poll_ns_grow_start| The initial value to grow | 10000                   |
-|                       | to from zero in the       |                         |
-|                       | grow_halt_poll_ns()       |                         |
-|                       | function.                 |                         |
-+-----------------------+---------------------------+-------------------------+
-|halt_poll_ns_shrink    | The value by which the    | 0                       |
-|                       | halt polling interval is  |                         |
-|                       | divided in the            |                         |
-|                       | shrink_halt_poll_ns()     |                         |
-|                       | function.                 |                         |
-+-----------------------+---------------------------+-------------------------+
-
-These module parameters can be set from the debugfs files in:
-
-        /sys/module/kvm/parameters/
-
-Note: that these module parameters are system wide values and are not able to
-      be tuned on a per vm basis.
-
-Further Notes
-=============
-
-- Care should be taken when setting the halt_poll_ns module parameter as a large value
-  has the potential to drive the cpu usage to 100% on a machine which would be almost
-  entirely idle otherwise. This is because even if a guest has wakeups during which very
-  little work is done and which are quite far apart, if the period is shorter than the
-  global max polling interval (halt_poll_ns) then the host will always poll for the
-  entire block time and thus cpu utilisation will go to 100%.
-
-- Halt polling essentially presents a trade off between power usage and latency and
-  the module parameters should be used to tune the affinity for this. Idle cpu time is
-  essentially converted to host kernel time with the aim of decreasing latency when
-  entering the guest.
-
-- Halt polling will only be conducted by the host when no other tasks are runnable on
-  that cpu, otherwise the polling will cease immediately and schedule will be invoked to
-  allow that other task to run. Thus this doesn't allow a guest to denial of service the
-  cpu.
diff --git a/Documentation/virt/kvm/x86/index.rst b/Documentation/virt/kvm/x86/index.rst
index 7ff588826b9f..9ece6b8dc817 100644
--- a/Documentation/virt/kvm/x86/index.rst
+++ b/Documentation/virt/kvm/x86/index.rst
@@ -10,7 +10,6 @@ KVM for x86 systems
    amd-memory-encryption
    cpuid
    errata
-   halt-polling
    hypercalls
    mmu
    msr
--
cgit v1.2.3

From 34e30ebbe48cc43c14276f863f0d2995c8f13186 Mon Sep 17 00:00:00 2001
From: David Matlack
Date: Thu, 1 Dec 2022 11:52:49 -0800
Subject: KVM: Document the interaction between KVM_CAP_HALT_POLL and halt_poll_ns

Clarify the existing documentation about how KVM_CAP_HALT_POLL and
halt_poll_ns interact to make it clear that VMs using KVM_CAP_HALT_POLL
ignore halt_poll_ns.

Signed-off-by: David Matlack
Message-Id: <20221201195249.3369720-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini
---
 Documentation/virt/kvm/api.rst          | 15 +++++++--------
 Documentation/virt/kvm/halt-polling.rst | 13 +++++++++++++
 2 files changed, 20 insertions(+), 8 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index eee9f857a986..896914e3a847 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7213,14 +7213,13 @@ veto the transition.
 :Parameters: args[0] is the maximum poll time in nanoseconds
 :Returns: 0 on success; -1 on error
 
-This capability overrides the kvm module parameter halt_poll_ns for the
-target VM.
-
-VCPU polling allows a VCPU to poll for wakeup events instead of immediately
-scheduling during guest halts. The maximum time a VCPU can spend polling is
-controlled by the kvm module parameter halt_poll_ns. This capability allows
-the maximum halt time to specified on a per-VM basis, effectively overriding
-the module parameter for the target VM.
+KVM_CAP_HALT_POLL overrides the kvm.halt_poll_ns module parameter to set the
+maximum halt-polling time for all vCPUs in the target VM. This capability can
+be invoked at any time and any number of times to dynamically change the
+maximum halt-polling time.
+
+See Documentation/virt/kvm/halt-polling.rst for more information on halt
+polling.
 
 7.21 KVM_CAP_X86_USER_SPACE_MSR
 -------------------------------
diff --git a/Documentation/virt/kvm/halt-polling.rst b/Documentation/virt/kvm/halt-polling.rst
index 4922e4a15f18..3fae39b1a5ba 100644
--- a/Documentation/virt/kvm/halt-polling.rst
+++ b/Documentation/virt/kvm/halt-polling.rst
@@ -119,6 +119,19 @@ These module parameters can be set from the debugfs files in:
 Note: that these module parameters are system wide values and are not able to
       be tuned on a per vm basis.
 
+Any changes to these parameters will be picked up by new and existing vCPUs the
+next time they halt, with the notable exception of VMs using KVM_CAP_HALT_POLL
+(see next section).
+
+KVM_CAP_HALT_POLL
+=================
+
+KVM_CAP_HALT_POLL is a VM capability that allows userspace to override halt_poll_ns
+on a per-VM basis. VMs using KVM_CAP_HALT_POLL ignore halt_poll_ns completely (but
+still obey halt_poll_ns_grow, halt_poll_ns_grow_start, and halt_poll_ns_shrink).
+
+See Documentation/virt/kvm/api.rst for more information on this capability.
+
 Further Notes
 =============
 
--
cgit v1.2.3
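
The grow/shrink rules described in the "Halt Polling Interval" section of
halt-polling.rst can be modelled in a few lines of userspace C. The sketch
below is not the kernel's code: the struct and the model_* function names are
hypothetical, it follows only the prose in the document, and two details are
assumptions (a zero halt_poll_ns_grow leaves the interval unchanged, and the
grown value is clamped to halt_poll_ns because the table calls that parameter
the ceiling value).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the four tunables named in the document. */
struct halt_poll_params {
    uint64_t halt_poll_ns;            /* global max polling interval (ceiling) */
    uint64_t halt_poll_ns_grow;       /* multiplier used when growing */
    uint64_t halt_poll_ns_grow_start; /* value to grow to from zero */
    uint64_t halt_poll_ns_shrink;     /* divisor used when shrinking; 0 resets to 0 */
};

/* Grow step: jump to grow_start from zero, otherwise multiply by grow. */
static uint64_t model_grow(uint64_t cur, const struct halt_poll_params *p)
{
    uint64_t next;

    if (p->halt_poll_ns_grow == 0)
        return cur; /* assumption: a zero multiplier disables growth */

    next = (cur == 0) ? p->halt_poll_ns_grow_start
                      : cur * p->halt_poll_ns_grow;

    /* Assumption: the per-vcpu value never exceeds the global ceiling. */
    return next > p->halt_poll_ns ? p->halt_poll_ns : next;
}

/* Shrink step: divide by shrink, or reset to 0 when shrink == 0. */
static uint64_t model_shrink(uint64_t cur, const struct halt_poll_params *p)
{
    if (p->halt_poll_ns_shrink == 0)
        return 0;
    return cur / p->halt_poll_ns_shrink;
}

/*
 * Return the polling interval to use for the next halt, given the current
 * interval, the total block time, and whether the wakeup arrived while the
 * host was still polling.
 */
static uint64_t model_next_interval(uint64_t cur, uint64_t block_ns,
                                    bool woke_while_polling,
                                    const struct halt_poll_params *p)
{
    assert(p != NULL);

    if (woke_while_polling)
        return cur;                      /* polling worked: leave unchanged */
    if (block_ns > p->halt_poll_ns)
        return model_shrink(cur, p);     /* could never have polled long enough */
    if (cur < p->halt_poll_ns && block_ns < p->halt_poll_ns)
        return model_grow(cur, p);       /* a longer poll might have caught it */
    return cur;
}
```

With the documented defaults (halt_poll_ns_grow = 2, halt_poll_ns_grow_start =
10000, halt_poll_ns_shrink = 0) and an example ceiling of 200000 ns chosen
purely for illustration, this model steps a vcpu from 0 to 10000 to 20000 on
consecutive missed wakeups with short block times, and resets it to 0 as soon
as a block time exceeds the ceiling.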