From eb2667b343361863da7b79be26de641e22844ba0 Mon Sep 17 00:00:00 2001
From: Joseph Qi <joseph.qi@linux.alibaba.com>
Date: Tue, 24 Nov 2020 15:03:03 +0800
Subject: io_uring: fix shift-out-of-bounds when round up cq size

Abaci Fuzz reported a shift-out-of-bounds BUG in io_uring_create():

[ 59.598207] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
[ 59.599665] shift exponent 64 is too large for 64-bit type 'long unsigned int'
[ 59.601230] CPU: 0 PID: 963 Comm: a.out Not tainted 5.10.0-rc4+ #3
[ 59.602502] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 59.603673] Call Trace:
[ 59.604286] dump_stack+0x107/0x163
[ 59.605237] ubsan_epilogue+0xb/0x5a
[ 59.606094] __ubsan_handle_shift_out_of_bounds.cold+0xb2/0x20e
[ 59.607335] ? lock_downgrade+0x6c0/0x6c0
[ 59.608182] ? rcu_read_lock_sched_held+0xaf/0xe0
[ 59.609166] io_uring_create.cold+0x99/0x149
[ 59.610114] io_uring_setup+0xd6/0x140
[ 59.610975] ? io_uring_create+0x2510/0x2510
[ 59.611945] ? lockdep_hardirqs_on_prepare+0x286/0x400
[ 59.613007] ? syscall_enter_from_user_mode+0x27/0x80
[ 59.614038] ? trace_hardirqs_on+0x5b/0x180
[ 59.615056] do_syscall_64+0x2d/0x40
[ 59.615940] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 59.617007] RIP: 0033:0x7f2bb8a0b239

This is caused by roundup_pow_of_two() if the input entries larger
enough, e.g. 2^32-1. For sq_entries, it will check first and we allow
at most IORING_MAX_ENTRIES, so it is okay. But for cq_entries, we do
round up first, that may overflow and truncate it to 0, which is not
the expected behavior. So check the cq size first and then do round up.

Fixes: 88ec3211e463 ("io_uring: round-up cq size before comparing with rounded sq size")
Reported-by: Abaci Fuzz <abaci@linux.alibaba.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index a8c136a1cf4e..f971589878bc 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9252,14 +9252,16 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p,
 		 * to a power-of-two, if it isn't already. We do NOT impose
 		 * any cq vs sq ring sizing.
 		 */
-		p->cq_entries = roundup_pow_of_two(p->cq_entries);
-		if (p->cq_entries < p->sq_entries)
+		if (!p->cq_entries)
 			return -EINVAL;
 		if (p->cq_entries > IORING_MAX_CQ_ENTRIES) {
 			if (!(p->flags & IORING_SETUP_CLAMP))
 				return -EINVAL;
 			p->cq_entries = IORING_MAX_CQ_ENTRIES;
 		}
+		p->cq_entries = roundup_pow_of_two(p->cq_entries);
+		if (p->cq_entries < p->sq_entries)
+			return -EINVAL;
 	} else {
 		p->cq_entries = 2 * p->sq_entries;
 	}
-- 
cgit v1.2.3


From 9c3a205c5ffa36e96903c2e37eb5f41c0f03c43e Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence@gmail.com>
Date: Mon, 23 Nov 2020 23:20:27 +0000
Subject: io_uring: fix ITER_BVEC check

iov_iter::type is a bitmask that also keeps direction etc., so it
shouldn't be directly compared against ITER_*. Use proper helper.

Fixes: ff6165b2d7f6 ("io_uring: retain iov_iter state over io_read/io_write calls")
Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Cc: <stable@vger.kernel.org> # 5.9
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f971589878bc..ff6deffe5aa9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3193,7 +3193,7 @@ static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec,
 	rw->free_iovec = iovec;
 	rw->bytes_done = 0;
 	/* can only be fixed buffers, no need to do anything */
-	if (iter->type == ITER_BVEC)
+	if (iov_iter_is_bvec(iter))
 		return;
 	if (!iovec) {
 		unsigned iov_off = 0;
-- 
cgit v1.2.3


From af60470347de6ac2b9f0cc3703975a543a3de075 Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence@gmail.com>
Date: Wed, 25 Nov 2020 18:41:28 +0000
Subject: io_uring: fix files grab/cancel race

When one task is in io_uring_cancel_files() and another is doing
io_prep_async_work() a race may happen. That's because after accounting
a request inflight in first call to io_grab_identity() it still may fail
and go to io_identity_cow(), which migh briefly keep dangling
work.identity and not only.

Grab files last, so io_prep_async_work() won't fail if it did get into
->inflight_list.

note: the bug shouldn't exist after making io_uring_cancel_files() not
poking into other tasks' requests.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index ff6deffe5aa9..1023f7b44cea 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1313,22 +1313,6 @@ static bool io_grab_identity(struct io_kiocb *req)
 			return false;
 		req->work.flags |= IO_WQ_WORK_FSIZE;
 	}
-
-	if (!(req->work.flags & IO_WQ_WORK_FILES) &&
-	    (def->work_flags & IO_WQ_WORK_FILES) &&
-	    !(req->flags & REQ_F_NO_FILE_TABLE)) {
-		if (id->files != current->files ||
-		    id->nsproxy != current->nsproxy)
-			return false;
-		atomic_inc(&id->files->count);
-		get_nsproxy(id->nsproxy);
-		req->flags |= REQ_F_INFLIGHT;
-
-		spin_lock_irq(&ctx->inflight_lock);
-		list_add(&req->inflight_entry, &ctx->inflight_list);
-		spin_unlock_irq(&ctx->inflight_lock);
-		req->work.flags |= IO_WQ_WORK_FILES;
-	}
 #ifdef CONFIG_BLK_CGROUP
 	if (!(req->work.flags & IO_WQ_WORK_BLKCG) &&
 	    (def->work_flags & IO_WQ_WORK_BLKCG)) {
@@ -1370,6 +1354,21 @@ static bool io_grab_identity(struct io_kiocb *req)
 		}
 		spin_unlock(&current->fs->lock);
 	}
+	if (!(req->work.flags & IO_WQ_WORK_FILES) &&
+	    (def->work_flags & IO_WQ_WORK_FILES) &&
+	    !(req->flags & REQ_F_NO_FILE_TABLE)) {
+		if (id->files != current->files ||
+		    id->nsproxy != current->nsproxy)
+			return false;
+		atomic_inc(&id->files->count);
+		get_nsproxy(id->nsproxy);
+		req->flags |= REQ_F_INFLIGHT;
+
+		spin_lock_irq(&ctx->inflight_lock);
+		list_add(&req->inflight_entry, &ctx->inflight_list);
+		spin_unlock_irq(&ctx->inflight_lock);
+		req->work.flags |= IO_WQ_WORK_FILES;
+	}
 
 	return true;
 }
-- 
cgit v1.2.3