io_uring: don't use iov_iter_advance() for fixed buffers - linux - Linux Kernel (branches are rebased on master from time to time)


author	Jens Axboe <axboe@kernel.dk>	2019-07-20 08:37:31 -0600
committer	Jens Axboe <axboe@kernel.dk>	2019-07-21 21:46:36 -0600
commit	bd11b3a391e3df6fa958facbe4b3f9f4cca9bd49 (patch)
tree	b4369a1b7fa26b0509767a39812ca03b0b1cbd0e /fs/ocfs2/export.h
parent	6a43074e2f461c2c49a607f9f6f5218d53f97d1e (diff)

io_uring: don't use iov_iter_advance() for fixed buffers

Hrvoje reports that when a large fixed buffer is registered and IO is
being done to the latter pages of said buffer, the IO submission time
is much worse:

reading to the start of the buffer: 11238 ns
reading to the end of the buffer:   1039879 ns

In fact, it's worse by two orders of magnitude. The reason for that is
how io_uring figures out how to set up the iov_iter. We point the iter
at the first bvec, and then use iov_iter_advance() to fast-forward to
the offset within that buffer we need.

However, that is abysmally slow, as it entails iterating the bvecs
that we set up as part of buffer registration. There's really no need
to use this generic helper, as we know it's a BVEC type iterator, and
we also know that each bvec is PAGE_SIZE in size, apart from possibly
the first and last. Hence we can just use a shift on the offset to
find the right index, and then adjust the iov_iter appropriately.
After this fix, the timings are:

reading to the start of the buffer: 10135 ns
reading to the end of the buffer:   1377 ns

Or about a 755x improvement for the tail page.

Reported-by: Hrvoje Zeba <zeba.hrvoje@gmail.com>
Tested-by: Hrvoje Zeba <zeba.hrvoje@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
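
The index calculation described in the message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel patch: the names sketch_seg, sketch_pos, sketch_seek_linear, sketch_seek_shift and the SKETCH_PAGE_* constants are hypothetical, a 4 KiB page size is assumed, and the real code operates on struct bio_vec and struct iov_iter. The linear walk stands in for what a generic advance has to do, while the shift variant relies on every segment except possibly the first and last being exactly one page.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for a bvec: only the segment length matters here. */
struct sketch_seg {
	size_t len;
};

/* Position inside the segment array: which segment, and offset within it. */
struct sketch_pos {
	size_t index;
	size_t offset;
};

/* Generic approach: walk segment by segment, O(number of segments). */
static struct sketch_pos sketch_seek_linear(const struct sketch_seg *segs,
					    size_t nr, size_t offset)
{
	struct sketch_pos p = { 0, 0 };

	while (p.index < nr && offset >= segs[p.index].len) {
		offset -= segs[p.index].len;
		p.index++;
	}
	p.offset = offset;
	return p;
}

/*
 * Shift-based approach, O(1). Precondition: every segment other than the
 * first and the last is exactly SKETCH_PAGE_SIZE bytes, and offset is
 * smaller than the total length of all segments.
 */
static struct sketch_pos sketch_seek_shift(const struct sketch_seg *segs,
					   size_t nr, size_t offset)
{
	struct sketch_pos p;

	assert(nr > 0);
	if (offset < segs[0].len) {
		/* Still inside the (possibly short) first segment. */
		p.index = 0;
		p.offset = offset;
	} else {
		/* Skip the first segment, then index by whole pages. */
		offset -= segs[0].len;
		p.index = 1 + (offset >> SKETCH_PAGE_SHIFT);
		p.offset = offset & (SKETCH_PAGE_SIZE - 1);
	}
	return p;
}

/* Count offsets for which the two approaches disagree. */
static size_t sketch_count_mismatches(const struct sketch_seg *segs, size_t nr)
{
	size_t total = 0, bad = 0, i, off;

	for (i = 0; i < nr; i++)
		total += segs[i].len;
	for (off = 0; off < total; off++) {
		struct sketch_pos a = sketch_seek_linear(segs, nr, off);
		struct sketch_pos b = sketch_seek_shift(segs, nr, off);

		if (a.index != b.index || a.offset != b.offset)
			bad++;
	}
	return bad;
}
```

With a short first segment, two full pages and a short last segment, both functions should return the same position for every offset inside the buffer. Only the cost differs: the linear walk grows with the number of segments skipped, which matches the slowdown reported for IO to the tail of a large buffer.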


