summaryrefslogtreecommitdiffstats
path: root/include/trace/events/iocost.h
AgeCommit message (Collapse)AuthorFilesLines
2022-12-01blk-iocost: Trace vtime_base_rate instead of vtime_rateKemeng Shi1-2/+2
Since commit ac33e91e2daca ("blk-iocost: implement vtime loss compensation") rename original vtime_rate to vtime_base_rate and current vtime_rate is original vtime_rate with compensation. The current rate showed in tracepoint is mixed with vtime_rate and vtime_base_rate: 1) In function ioc_adjust_base_vrate, the first trace_iocost_ioc_vrate_adj shows vtime_rate, the second trace_iocost_ioc_vrate_adj shows vtime_base_rate. 2) In function iocg_activate shows vtime_rate by calling TRACE_IOCG_PATH(iocg_activate... 3) In function ioc_check_iocgs shows vtime_rate by calling TRACE_IOCG_PATH(iocg_idle... Trace vtime_base_rate instead of vtime_rate as: 1) Before commit ac33e91e2daca ("blk-iocost: implement vtime loss compensation"), the traced rate is without compensation, so still show rate without compensation. 2) The vtime_base_rate is more stable while vtime_rate heavily depends on excess budeget on current period which may change abruptly in next period. Signed-off-by: Kemeng Shi <shikemeng@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20221018121932.10792-4-shikemeng@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-12blk-iocost: tracing: atomic64_read(&ioc->vtime_rate) is assigned an extra ↵Li kunyu1-1/+1
semicolon Remove extra semicolon. Link: https://lkml.kernel.org/r/20220629030013.10362-1-kunyu@nfschina.com Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Li kunyu <kunyu@nfschina.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2020-12-17blk-iocost: Add iocg idle state tracepointBaolin Wang1-1/+15
It will be helpful to trace the iocg's whole state, including active and idle state. And we can easily expand the original iocost_iocg_activate trace event to support a state trace class, including active and idle state tracing. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25iocost: add iocg_forgive_debt tracepointTejun Heo1-0/+41
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01blk-iocost: restore inuse update tracepointsTejun Heo1-3/+3
Update and restore the inuse update tracepoints. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01blk-iocost: decouple vrate adjustment from surplus transfersTejun Heo1-9/+4
Budget donations are inaccurate and could take multiple periods to converge. To prevent triggering vrate adjustments while surplus transfers were catching up, vrate adjustment was suppressed if donations were increasing, which was indicated by non-zero nr_surpluses. This entangling won't be necessary with the scheduled rewrite of donation mechanism which will make it precise and immediate. Let's decouple the two in preparation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01blk-iocost: calculate iocg->usages[] from iocg->local_stat.usage_usTejun Heo1-5/+2
Currently, iocg->usages[] which are used to guide inuse adjustments are calculated from vtime deltas. This, however, assumes that the hierarchical inuse weight at the time of calculation held for the entire period, which often isn't true and can lead to significant errors. Now that we have absolute usage information collected, we can derive iocg->usages[] from iocg->local_stat.usage_us so that inuse adjustment decisions are made based on actual absolute usage. The calculated usage is clamped between 1 and WEIGHT_ONE and WEIGHT_ONE is also used to signal saturation regardless of the current hierarchical inuse weight. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-21blk-iocost: Fix error on iocost_ioc_vrate_adjWaiman Long1-3/+3
Systemtap 4.2 is unable to correctly interpret the "u32 (*missed_ppm)[2]" argument of the iocost_ioc_vrate_adj trace entry defined in include/trace/events/iocost.h leading to the following error: /tmp/stapAcz0G0/stap_c89c58b83cea1724e26395efa9ed4939_6321_aux_6.c:78:8: error: expected ‘;’, ‘,’ or ‘)’ before ‘*’ token , u32[]* __tracepoint_arg_missed_ppm That argument type is indeed rather complex and hard to read. Looking at block/blk-iocost.c. It is just a 2-entry u32 array. By simplifying the argument to a simple "u32 *missed_ppm" and adjusting the trace entry accordingly, the compilation error was gone. Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost") Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-29blkcg: blk-iocost: predeclare used structsStephen Rothwell1-0/+4
Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost") Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-28blkcg: implement blk-iocostTejun Heo1-0/+174
This patchset implements IO cost model based work-conserving proportional controller. While io.latency provides the capability to comprehensively prioritize and protect IOs depending on the cgroups, its protection is binary - the lowest latency target cgroup which is suffering is protected at the cost of all others. In many use cases including stacking multiple workload containers in a single system, it's necessary to distribute IO capacity with better granularity. One challenge of controlling IO resources is the lack of trivially observable cost metric. The most common metrics - bandwidth and iops - can be off by orders of magnitude depending on the device type and IO pattern. However, the cost isn't a complete mystery. Given several key attributes, we can make fairly reliable predictions on how expensive a given stream of IOs would be, at least compared to other IO patterns. The function which determines the cost of a given IO is the IO cost model for the device. This controller distributes IO capacity based on the costs estimated by such model. The more accurate the cost model the better but the controller adapts based on IO completion latency and as long as the relative costs across differents IO patterns are consistent and sensible, it'll adapt to the actual performance of the device. Currently, the only implemented cost model is a simple linear one with a few sets of default parameters for different classes of device. This covers most common devices reasonably well. All the infrastructure to tune and add different cost models is already in place and a later patch will also allow using bpf progs for cost models. Please see the top comment in blk-iocost.c and documentation for more details. v2: Rebased on top of RQ_ALLOC_TIME changes and folded in Rik's fix for a divide-by-zero bug in current_hweight() triggered by zero inuse_sum. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Andy Newell <newella@fb.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>