summaryrefslogtreecommitdiffstats
path: root/io_uring/net.c
AgeCommit message (Collapse)AuthorFilesLines
2022-10-22io_uring/net: fail zc sendmsg when unsupported by socketPavel Begunkov1-0/+2
The previous patch fails zerocopy send requests for protocols that don't support it, do the same for zerocopy sendmsg. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0854e7bb4c3d810a48ec8b5853e2f61af36a0467.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22io_uring/net: fail zc send when unsupported by socketPavel Begunkov1-0/+2
If a protocol doesn't support zerocopy it will silently fall back to copying. This type of behaviour has always been a source of troubles so it's better to fail such requests instead. Cc: <stable@vger.kernel.org> # 6.0 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2db3c7f16bb6efab4b04569cd16e6242b40c5cb3.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-12io_uring/net: handle -EINPROGRESS correct for IORING_OP_CONNECTJens Axboe1-6/+22
We treat EINPROGRESS like EAGAIN, but if we're retrying post getting EINPROGRESS, then we just need to check the socket for errors and terminate the request. This was exposed on a bluetooth connection request which ends up taking a while and hitting EINPROGRESS, and yields a CQE result of -EBADFD because we're retrying a connect on a socket that is now connected. Cc: stable@vger.kernel.org Fixes: 87f80d623c6c ("io_uring: handle connect -EINPROGRESS like -EAGAIN") Link: https://github.com/axboe/liburing/issues/671 Reported-by: Aidan Sun <aidansun05@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-29io_uring/net: fix notif cqe reorderingPavel Begunkov1-5/+19
send zc is not restricted to !IO_URING_F_UNLOCKED anymore and so we can't use task-tw ordering trick to order notification cqes with requests completions. In this case leave it alone and let io_send_zc_cleanup() flush it. Cc: stable@vger.kernel.org Fixes: 53bdc88aac9a2 ("io_uring/notif: order notif vs send CQEs") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0031f3a00d492e814a4a0935a2029a46d9c9ba06.1664486545.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-29io_uring/net: don't update msg_name if not providedPavel Begunkov1-1/+2
io_sendmsg_copy_hdr() may clear msg->msg_name if the userspace didn't provide it, we should retain NULL in this case. Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/97d49f61b5ec76d0900df658cfde3aa59ff22121.1664486545.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-29io_uring/net: fix fast_iov assignment in io_setup_async_msg()Stefan Metzmacher1-2/+4
I hit a very bad problem during my tests of SENDMSG_ZC. BUG(); in first_iovec_segment() triggered very easily. The problem was io_setup_async_msg() in the partial retry case, which seems to happen more often with _ZC. iov_iter_iovec_advance() may change i->iov in order to have i->iov_offset being only relative to the first element. Which means kmsg->msg.msg_iter.iov is no longer the same as kmsg->fast_iov. But this would rewind the copy to be the start of async_msg->fast_iov, which means the internal state of sync_msg->msg.msg_iter is inconsitent. I tested with 5 vectors with length like this 4, 0, 64, 20, 8388608 and got a short writes with: - ret=2675244 min_ret=8388692 => remaining 5713448 sr->done_io=2675244 - ret=-EAGAIN => io_uring_poll_arm - ret=4911225 min_ret=5713448 => remaining 802223 sr->done_io=7586469 - ret=-EAGAIN => io_uring_poll_arm - ret=802223 min_ret=802223 => res=8388692 While this was easily triggered with SENDMSG_ZC (queued for 6.1), it was a potential problem starting with 7ba89d2af17aa879dda30f5d5d3f152e587fc551 in 5.18 for IORING_OP_RECVMSG. And also with 4c3c09439c08b03d9503df0ca4c7619c5842892e in 5.19 for IORING_OP_SENDMSG. However 257e84a5377fbbc336ff563833a8712619acce56 introduced the critical code into io_setup_async_msg() in 5.11. Fixes: 7ba89d2af17aa ("io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly") Fixes: 257e84a5377fb ("io_uring: refactor sendmsg/recvmsg iov managing") Cc: stable@vger.kernel.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b2e7be246e2fb173520862b0c7098e55767567a2.1664436949.git.metze@samba.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-28io_uring/net: fix non-zc send with addressPavel Begunkov1-6/+6
We're currently ignoring the dest address with non-zerocopy send because even though we copy it from the userspace shortly after ->msg_name gets zeroed. Move msghdr init earlier. Fixes: 516e82f0e043a ("io_uring/net: support non-zerocopy sendto") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/176ced5e8568aa5d300ca899b7f05b303ebc49fd.1664409532.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-28io_uring/net: don't skip notifs for failed requestsPavel Begunkov1-21/+8
We currently only add a notification CQE when the send succeded, i.e. cqe.res >= 0. However, it'd be more robust to do buffer notifications for failed requests as well in case drivers decide do something fanky. Always return a buffer notification after initial prep, don't hide it. This behaviour is better aligned with documentation and the patch also helps the userspace to respect it. Cc: stable@vger.kernel.org # 6.0 Suggested-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9c8bead87b2b980fcec441b8faef52188b4a6588.1664292100.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-26io_uring/net: fix cleanup double free free_iov initPavel Begunkov1-10/+10
Having ->async_data doesn't mean it's initialised and previously we vere relying on setting F_CLEANUP at the right moment. With zc sendmsg though, we set F_CLEANUP early in prep when we alloc a notif and so we may allocate async_data, fail in copy_msg_hdr() leaving struct io_async_msghdr not initialised correctly but with F_CLEANUP set, which causes a ->free_iov double free and probably other nastiness. Always initialise ->free_iov. Also, now it might point to fast_iov when fails, so avoid freeing it during cleanups. Reported-by: syzbot+edfd15cd4246a3fc615a@syzkaller.appspotmail.com Fixes: 493108d95f146 ("io_uring/net: zerocopy sendmsg") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-23io_uring/net: fix UAF in io_sendrecv_fail()Pavel Begunkov1-10/+5
We should not assume anything about ->free_iov just from REQ_F_ASYNC_DATA but rather rely on REQ_F_NEED_CLEANUP, as we may allocate ->async_data but failed init would leave the field in not consistent state. The easiest solution is to remove removing REQ_F_NEED_CLEANUP and so ->async_data dealloc from io_sendrecv_fail() and let io_send_zc_cleanup() do the job. The catch here is that we also need to prevent double notif flushing, just test it for NULL and zero where it's needed. BUG: KASAN: use-after-free in io_sendrecv_fail+0x3b0/0x3e0 io_uring/net.c:1221 Write of size 8 at addr ffff8880771b4080 by task syz-executor.3/30199 CPU: 1 PID: 30199 Comm: syz-executor.3 Not tainted 6.0.0-rc6-next-20220923-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:284 [inline] print_report+0x15e/0x45d mm/kasan/report.c:395 kasan_report+0xbb/0x1f0 mm/kasan/report.c:495 io_sendrecv_fail+0x3b0/0x3e0 io_uring/net.c:1221 io_req_complete_failed+0x155/0x1b0 io_uring/io_uring.c:873 io_drain_req io_uring/io_uring.c:1648 [inline] io_queue_sqe_fallback.cold+0x29f/0x788 io_uring/io_uring.c:1931 io_submit_sqe io_uring/io_uring.c:2160 [inline] io_submit_sqes+0x1180/0x1df0 io_uring/io_uring.c:2276 __do_sys_io_uring_enter+0xac6/0x2410 io_uring/io_uring.c:3216 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: c4c0009e0b56e ("io_uring/net: combine fail handlers") Reported-by: syzbot+4c597a574a3f5a251bda@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/23ab8346e407ea50b1198a172c8a97e1cf22915b.1663945875.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: zerocopy sendmsgPavel Begunkov1-5/+86
Add a zerocopy version of sendmsg. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6aabc4bdfc0ec78df6ec9328137e394af9d4e7ef.1663668091.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: combine fail handlersPavel Begunkov1-15/+16
Merge io_send_zc_fail() into io_sendrecv_fail(), saves a few lines of code and some headache for following patch. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e0eba1d577413aef5602cd45f588b9230207082d.1663668091.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: rename io_sendzc()Pavel Begunkov1-3/+3
Simple renaming of io_sendzc*() functions in preparatio to adding a zerocopy sendmsg variant. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/265af46829e6076dd220011b1858dc3151969226.1663668091.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: support non-zerocopy sendtoPavel Begunkov1-6/+29
We have normal sends, but what is missing is sendto-like requests. Add sendto() capabilities to IORING_OP_SEND by passing in addr just as we do for IORING_OP_SEND_ZC. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/69fbd8b2cb830e57d1bf9ec351e9bf95c5b77e3f.1663668091.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: refactor io_setup_async_addrPavel Begunkov1-9/+9
Instead of passing the right address into io_setup_async_addr() only specify local on-stack storage and let the function infer where to grab it from. It optimises out one local variable we have to deal with. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6bfa9ab810d776853eb26ed59301e2536c3a5471.1663668091.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: don't lose partial send_zc on failPavel Begunkov1-0/+16
Partial zc send may end up in io_req_complete_failed(), which not only would return invalid result but also mask out the notification leading to lifetime issues. Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5673285b5e83e6ceca323727b4ddaa584b5cc91e.1663668091.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: don't lose partial send/recv on failPavel Begunkov1-0/+10
Just as with rw, partial send/recv may end up in io_req_complete_failed() and loose the result, make sure we return the number of bytes processed. Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a4ff95897b5419356fca9ea55db91ac15b2975f9.1663668091.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: use io_sr_msg for sendzcPavel Begunkov1-13/+5
Reuse struct io_sr_msg for zerocopy sends, which is handy. There is only one zerocopy specific field, namely .notif, and we have enough space for it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/408c5b1b2d8869e1a12da5f5a78ed72cac112149.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: refactor io_sr_msg typesPavel Begunkov1-7/+7
In preparation for using struct io_sr_msg for zerocopy sends, clean up types. First, flags can be u16 as it's provided by the userspace in u16 ioprio, as well as addr_len. This saves us 4 bytes. Also use unsigned for size and done_io, both are as well limited to u32. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/42c2639d6385b8b2181342d2af3a42d3b1c5bcd2.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: add non-bvec sg chunking callbackPavel Begunkov1-6/+11
Add a sg_from_iter() for when we initiate non-bvec zerocopy sends, which helps us to remove some extra steps from io_sg_from_iter(). The only thing the new function has to do before giving control away to __zerocopy_sg_from_iter() is to check if the skb has managed frags and downgrade them if so. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/cda3dea0d36f7931f63a70f350130f085ac3f3dd.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: io_async_msghdr caches for sendzcPavel Begunkov1-5/+4
We already keep io_async_msghdr caches for normal send/recv requests, use them also for zerocopy send. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/42fa615b6e0be25f47a685c35d7b5e4f1b03d348.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: use async caches for async prepPavel Begunkov1-3/+13
send/recv have async_data caches but there are only used from within issue handlers. Extend their use also to ->prep_async, should be handy with links and IOSQE_ASYNC. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b9a2264b807582a97ed606c5bfcdc2399384e8a5.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21io_uring/net: reshuffle error handlingPavel Begunkov1-8/+8
We should prioritise send/recv retry cases over failures, they're more important. Shuffle -ERESTARTSYS after we handled retries. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d9059691b30d0963b7269fa4a0c81ee7720555e6.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-18io_uring/net: fix zc fixed buf lifetimePavel Begunkov1-8/+8
Notifications usually outlive requests, so we need to pin buffers with it by assigning a rsrc to it instead of the request. Fixed: b48c312be05e8 ("io_uring/net: simplify zerocopy send user API") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/dd6406ff8a90887f2b36ed6205dac9fda17c1f35.1663366886.git.asml.silence@gmail.com Reviewed-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-08io_uring/net: copy addr for zc on POLL_FIRSTPavel Begunkov1-3/+4
Every time we return from an issue handler and expect the request to be retried we should also setup it for async exec ourselves. Do that when we return on IORING_RECVSEND_POLL_FIRST in io_sendzc(), otherwise it'll re-read the address, which might be a surprise for the userspace. Fixes: 092aeedb750a9 ("io_uring: allow to pass addr into sendzc") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ab1d0657890d6721339c56d2e161a4bba06f85d0.1662642013.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-01io_uring/net: simplify zerocopy send user APIPavel Begunkov1-20/+33
Following user feedback, this patch simplifies zerocopy send API. One of the main complaints is that the current API is difficult with the userspace managing notification slots, and then send retries with error handling make it even worse. Instead of keeping notification slots change it to the per-request notifications model, which posts both completion and notification CQEs for each request when any data has been sent, and only one CQE if it fails. All notification CQEs will have IORING_CQE_F_NOTIF set and IORING_CQE_F_MORE in completion CQEs indicates whether to wait a notification or not. IOSQE_CQE_SKIP_SUCCESS is disallowed with zerocopy sends for now. This is less flexible, but greatly simplifies the user API and also the kernel implementation. We reuse notif helpers in this patch, but in the future there won't be need for keeping two requests. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/95287640ab98fc9417370afb16e310677c63e6ce.1662027856.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-01io_uring/notif: remove notif registrationPavel Begunkov1-3/+1
We're going to remove the userspace exposed zerocopy notification API, remove notification registration. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6ff00b97be99869c386958a990593c9c31cf105b.1662027856.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-26io_uring/net: fix overexcessive retriesPavel Begunkov1-1/+1
Length parameter of io_sg_from_iter() can be smaller than the iterator's size, as it's with TCP, so when we set from->count at the end of the function we truncate the iterator forcing TCP to return preliminary with a short send. It affects zerocopy sends with large payload sizes and leads to retries and possible request failures. Fixes: 3ff1a0d395c00 ("io_uring: enable managed frags with register buffers") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0bc0d5179c665b4ef5c328377c84c7a1f298467e.1661530037.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-25io_uring/net: save address for sendzc async executionPavel Begunkov1-7/+46
We usually copy all bits that a request needs from the userspace for async execution, so the userspace can keep them on the stack. However, send zerocopy violates this pattern for addresses and may reloads it e.g. from io-wq. Save the address if any in ->async_data as usual. Reported-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d7512d7aa9abcd36e9afe1a4d292a24cb2d157e5.1661342812.git.asml.silence@gmail.com [axboe: fold in incremental fix] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-24io_uring/net: fix indentationPavel Begunkov1-1/+1
Fix up indentation before we get complaints from tooling. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/bd5754e3764215ccd7fb04cd636ea9167aaa275d.1661342812.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-24io_uring/net: fix zc send link failingPavel Begunkov1-0/+1
Failed requests should be marked with req_set_fail(), so links and cqe skipping work correctly, which is missing in io_sendzc(). Note, io_sendzc() return IOU_OK on failure, so the core code won't do the cleanup for us. Fixes: 06a5464be84e4 ("io_uring: wire send zc request type") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e47d46fda9db30154ce66a549bb0d3380b780520.1661342812.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-18io_uring/net: use right helpers for async_dataPavel Begunkov1-2/+2
There is another spot where we check ->async_data directly instead of using req_has_async_data(), which is the way to do it, fix it up. Fixes: 43e0bbbd0b0e3 ("io_uring: add netmsg cache") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/42f33b9a81dd6ae65dda92f0372b0ff82d548517.1660822636.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-15io_uring/net: improve zc addr import error handlingPavel Begunkov1-8/+8
We may account memory to a memcg of a request that didn't even got to the network layer. It's not a bug as it'll be routinely cleaned up on flush, but it might be confusing for the userspace. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b8aae61f4c3ddc4da97c1da876bb73871f352d50.1660566179.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-15io_uring/net: use right helpers for async recyclePavel Begunkov1-1/+1
We have a helper that checks for whether a request contains anything in ->async_data or not, namely req_has_async_data(). It's better to use it as it might have some extra considerations. Fixes: 43e0bbbd0b0e3 ("io_uring: add netmsg cache") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b7414da4e7c3c32c31fc02dfd1355af4ccf4ca5f.1660566179.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-12io_uring: make io_kiocb_to_cmd() typesafeStefan Metzmacher1-21/+21
We need to make sure (at build time) that struct io_cmd_data is not casted to a structure that's larger. Signed-off-by: Stefan Metzmacher <metze@samba.org> Link: https://lore.kernel.org/r/c024cdf25ae19fc0319d4180e2298bade8ed17b8.1660201408.git.metze@samba.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-05io_uring: fix io_recvmsg_prep_multishot sparse warningsDylan Yudaken1-2/+2
Fix casts missing the __user parts. This seemed to only cause errors on the alpha build, or if checked with sparse, but it was definitely an oversight. Reported-by: kernel test robot <lkp@intel.com> Fixes: 9bb66906f23e ("io_uring: support multishot in recvmsg") Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220805115450.3921352-1-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-04io_uring/net: send retry for zerocopyPavel Begunkov1-3/+17
io_uring handles short sends/recvs for stream sockets when MSG_WAITALL is set, however new zerocopy send is inconsistent in this regard, which might be confusing. Handle short sends. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b876a4838597d9bba4f3215db60d72c33c448ad0.1659622472.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-27io_uring: notification completion optimisationPavel Begunkov1-2/+2
We want to use all optimisations that we have for io_uring requests like completion batching, memory caching and more but for zc notifications. Fortunately, notification perfectly fit the request model so we can overlay them onto struct io_kiocb and use all the infratructure. Most of the fields of struct io_notif natively fits into io_kiocb, so we replace struct io_notif with struct io_kiocb carrying struct io_notif_data in the cmd cache line. Then we adapt io_alloc_notif() to use io_alloc_req()/io_alloc_req_refill(), and kill leftovers of hand coded caching. __io_notif_complete_tw() is converted to use io_uring's tw infra. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9e010125175e80baf51f0ca63bdc7cc6a4a9fa56.1658913593.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-25io_uring/net: use unsigned for flagsPavel Begunkov1-2/+2
Use unsigned int type for msg flags. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5cfaed13d3191337b14b8664ca68b515d9e2d1b4.1658742118.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-25io_uring/net: make page accounting more consistentPavel Begunkov1-1/+1
Make network page accounting more consistent with how buffer registration is working, i.e. account all memory to ctx->user. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4aacfe64bbb81b27f9ecf5d5c219c69a07e5aa56.1658742118.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-25io_uring/net: checks errors of zc mem accountingPavel Begunkov1-1/+3
mm_account_pinned_pages() may fail, don't ignore the return value. Fixes: e29e3bd4b968 ("io_uring: account locked pages for non-fixed zc") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/dae0542ed8e6706071bb83ad3e7ad6a70d207fd9.1658742118.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: enable managed frags with register buffersPavel Begunkov1-1/+55
io_uring's registered buffers infra has a good performant way of pinning pages, so let's use SKBFL_MANAGED_FRAG_REFS when our requests are purely register buffer backed. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/278731d3f20caf346cfc025fbee0b4c9ee4ed751.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: flush notifiers after sendzcPavel Begunkov1-1/+4
Allow to flush notifiers as a part of sendzc request by setting IORING_SENDZC_FLUSH flag. When the sendzc request succeedes it will flush the used [active] notifier. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e0b4d9a6797e2fd6092824fe42953db7a519bbc8.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: sendzc with fixed buffersPavel Begunkov1-5/+24
Allow zerocopy sends to use fixed buffers. There is an optimisation for this case, the network layer don't need to reference the pages, see SKBFL_MANAGED_FRAG_REFS, so io_uring have to ensure validity of fixed buffers until the notifier is released. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e1d8bd1b5934e541d90c1824eb4020ae3f5f43f3.1657643355.git.asml.silence@gmail.com [axboe: fold in 32-bit pointer cast warning fix] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: allow to pass addr into sendzcPavel Begunkov1-2/+16
Allow to specify an address to zerocopy sends making it more like sendto(2). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/70417a8f7c5b51ab454690bae08adc0c187f89e8.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: account locked pages for non-fixed zcPavel Begunkov1-0/+1
Fixed buffers are RLIMIT_MEMLOCK accounted, however it doesn't cover iovec based zerocopy sends. Do the accounting on the io_uring side. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/19b6e3975440f59f1f6199c7ee7acf977b4eecdc.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: wire send zc request typePavel Begunkov1-0/+94
Add a new io_uring opcode IORING_OP_SENDZC. The main distinction from IORING_OP_SEND is that the user should specify a notification slot index in sqe::notification_idx and the buffers are safe to reuse only when the used notification is flushed and completes. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a80387c6a68ce9cf99b3b6ef6f71068468761fb7.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: initialise msghdr::msg_ubufPavel Begunkov1-0/+2
Initialise newly added ->msg_ubuf in io_recv() and io_send(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b8f9f263875a4a36e7f26cc5d55ebe315308f57d.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24net: fix compat pointer in get_compat_msghdr()Jens Axboe1-1/+1
A previous change enabled external users to copy the data before calling __get_compat_msghdr(), but didn't modify get_compat_msghdr() or __io_compat_recvmsg_copy_hdr() to take that into account. They are both stil passing in the __user pointer rather than the copied version. Ensure we pass in the kernel struct, not the pointer to the user data. Link: https://lore.kernel.org/all/46439555-644d-08a1-7d66-16f8f9a320f0@samsung.com/ Fixes: 1a3e4e94a1b9 ("net: copy from user before calling __get_compat_msghdr") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: fix types in io_recvmsg_multishot_overflowDylan Yudaken1-5/+5
io_recvmsg_multishot_overflow had incorrect types on non x64 system. But also it had an unnecessary INT_MAX check, which could just be done by changing the type of the accumulator to int (also simplifying the casts). Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: a8b38c4ce724 ("io_uring: support multishot in recvmsg") Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220715130252.610639-1-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>