diff options
author | Jens Axboe <axboe@kernel.dk> | 2021-10-18 08:45:39 -0600 |
---|---|---|
committer | Jens Axboe <axboe@kernel.dk> | 2021-10-18 14:40:47 -0600 |
commit | 4f5022453acd0f7b28012e20b7d048470f129894 (patch) | |
tree | 549b44244da2107d85fe36ad4ee630f121e38425 | |
parent | b688f11e86c9a22169a0e522530982735d2db19b (diff) | |
download | linux-4f5022453acd0f7b28012e20b7d048470f129894.tar.bz2 |
nvme: wire up completion batching for the IRQ path
Trivial to do now, just need our own io_comp_batch on the stack and pass
that in to the usual command completion handling.
I pondered making this dependent on how many entries we had to process,
but even for a single entry there's no discernable difference in
performance or latency. Running a sync workload over io_uring:
t/io_uring -b512 -d1 -s1 -c1 -p0 -F1 -B1 -n2 /dev/nvme1n1 /dev/nvme2n1
yields the below performance before the patch:
IOPS=254820, BW=124MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251174, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=250806, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)
and the following after:
IOPS=255972, BW=124MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251920, BW=123MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251794, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)
which definitely isn't slower, about the same if you factor in a bit of
variance. For peak performance workloads, benchmarking shows a 2%
improvement.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-rw-r--r-- | drivers/nvme/host/pci.c | 6 |
1 files changed, 5 insertions, 1 deletions
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 83d3503d5b88..ed684874842f 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1076,9 +1076,13 @@ static inline int nvme_poll_cq(struct nvme_queue *nvmeq, static irqreturn_t nvme_irq(int irq, void *data) { struct nvme_queue *nvmeq = data; + DEFINE_IO_COMP_BATCH(iob); - if (nvme_poll_cq(nvmeq, NULL)) + if (nvme_poll_cq(nvmeq, &iob)) { + if (!rq_list_empty(iob.req_list)) + nvme_pci_complete_batch(&iob); return IRQ_HANDLED; + } return IRQ_NONE; } |