summaryrefslogtreecommitdiffstats
path: root/drivers/block/virtio_blk.c
AgeCommit message (Collapse)AuthorFilesLines
2022-03-06virtio-blk: Remove BUG_ON() in virtio_queue_rq()Xie Yongji1-10/+2
Currently we have a BUG_ON() to make sure the number of sg list does not exceed queue_max_segments() in virtio_queue_rq(). However, the block layer uses queue_max_discard_segments() instead of queue_max_segments() to limit the sg list for discard requests. So the BUG_ON() might be triggered if virtio-blk device reports a larger value for max discard segment than queue_max_segments(). To fix it, let's simply remove the BUG_ON() which has become unnecessary after commit 02746e26c39e("virtio-blk: avoid preallocating big SGL for data"). And the unused vblk->sg_elems can also be removed together. Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20220304100058.116-2-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06virtio-blk: Don't use MAX_DISCARD_SEGMENTS if max_discard_seg is zeroXie Yongji1-2/+8
Currently the value of max_discard_segment will be set to MAX_DISCARD_SEGMENTS (256) with no basis in hardware if device set 0 to max_discard_seg in configuration space. It's incorrect since the device might not be able to handle such large descriptors. To fix it, let's follow max_segments restrictions in this case. Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20220304100058.116-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-18Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-2/+2
Pull virtio updates from Michael Tsirkin: "virtio,vdpa,qemu_fw_cfg: features, cleanups, and fixes. - partial support for < MAX_ORDER - 1 granularity for virtio-mem - driver_override for vdpa - sysfs ABI documentation for vdpa - multiqueue config support for mlx5 vdpa - and misc fixes, cleanups" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (42 commits) vdpa/mlx5: Fix tracking of current number of VQs vdpa/mlx5: Fix is_index_valid() to refer to features vdpa: Protect vdpa reset with cf_mutex vdpa: Avoid taking cf_mutex lock on get status vdpa/vdpa_sim_net: Report max device capabilities vdpa: Use BIT_ULL for bit operations vdpa/vdpa_sim: Configure max supported virtqueues vdpa/mlx5: Report max device capabilities vdpa: Support reporting max device capabilities vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps() vdpa: Add support for returning device configuration information vdpa/mlx5: Support configuring max data virtqueue vdpa/mlx5: Fix config_attr_mask assignment vdpa: Allow to configure max data virtqueues vdpa: Read device configuration only if FEATURES_OK vdpa: Sync calls set/get config/status with cf_mutex vdpa/mlx5: Distribute RX virtqueues in RQT object vdpa: Provide interface to read driver features vdpa: clean up get_config_size ret value handling virtio_ring: mark ring unused on error ...
2022-01-14virtio: wrap config->reset callsMichael S. Tsirkin1-2/+2
This will enable cleanups down the road. The idea is to disable cbs, then add "flush_queued_cbs" callback as a parameter, this way drivers can flush any work queued after callbacks have been disabled. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20211013105226.20225-1-mst@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-29block: remove the gendisk argument to blk_execute_rqChristoph Hellwig1-1/+1
Remove the gendisk aregument to blk_execute_rq and blk_execute_rq_nowait given that it is unused now. Also convert the boolean at_head parameter to actually use the bool type while touching the prototype. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20211126121802.2090656-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-29block: remove GENHD_FL_EXT_DEVTChristoph Hellwig1-1/+0
All modern drivers can support extra partitions using the extended dev_t. In fact except for the ioctl method drivers never even see partitions in normal operation. So remove the GENHD_FL_EXT_DEVT and allow extra partitions for all block devices that do support partitions, and require those that do not support partitions to explicit disallow them using GENHD_FL_NO_PART. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211122130625.1136848-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-24virtio-blk: modify the value type of num in virtio_queue_rq()Ye Guojin1-1/+1
This was found by coccicheck: ./drivers/block/virtio_blk.c, 334, 14-17, WARNING Unsigned expression compared with zero num < 0 Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn> Link: https://lore.kernel.org/r/20211117063955.160777-1-ye.guojin@zte.com.cn Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: 02746e26c39e ("virtio-blk: avoid preallocating big SGL for data") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-11-24Revert "virtio-blk: don't let virtio core to validate used length"Michael S. Tsirkin1-1/+0
This reverts commit a40392edf1b2c7822bc0ce68413106661a9d4232. Attempts to validate length in the core did not work out. We'll drop them, so revert the dependent changes in drivers. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-03Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-59/+119
Pull virtio updates from Michael Tsirkin: "vhost and virtio fixes and features: - Hardening work by Jason - vdpa driver for Alibaba ENI - Performance tweaks for virtio blk - virtio rng rework using an internal buffer - mac/mtu programming for mlx5 vdpa - Misc fixes, cleanups" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (45 commits) vdpa/mlx5: Forward only packets with allowed MAC address vdpa/mlx5: Support configuration of MAC vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit vdpa_sim_net: Enable user to set mac address and mtu vdpa: Enable user to set mac and mtu of vdpa device vdpa: Use kernel coding style for structure comments vdpa: Introduce query of device config layout vdpa: Introduce and use vdpa device get, set config helpers virtio-scsi: don't let virtio core to validate used buffer length virtio-blk: don't let virtio core to validate used length virtio-net: don't let virtio core to validate used length virtio_ring: validate used buffer length virtio_blk: correct types for status handling virtio_blk: allow 0 as num_request_queues i2c: virtio: Add support for zero-length requests virtio-blk: fixup coccinelle warnings virtio_ring: fix typos in vring_desc_extra virtio-pci: harden INTX interrupts virtio_pci: harden MSI-X interrupts virtio_config: introduce a new .enable_cbs method ...
2021-11-01Merge tag 'for-5.16/passthrough-flag-2021-10-29' of ↵Linus Torvalds1-2/+2
git://git.kernel.dk/linux-block Pull QUEUE_FLAG_SCSI_PASSTHROUGH removal from Jens Axboe: "This contains a series leading to the removal of the QUEUE_FLAG_SCSI_PASSTHROUGH queue flag" * tag 'for-5.16/passthrough-flag-2021-10-29' of git://git.kernel.dk/linux-block: block: remove blk_{get,put}_request block: remove QUEUE_FLAG_SCSI_PASSTHROUGH block: remove the initialize_rq_fn blk_mq_ops method scsi: add a scsi_alloc_request helper bsg-lib: initialize the bsg_job in bsg_transport_sg_io_fn nfsd/blocklayout: use ->get_unique_id instead of sending SCSI commands sd: implement ->get_unique_id block: add a ->get_unique_id method
2021-11-01virtio-blk: don't let virtio core to validate used lengthJason Wang1-0/+1
We never tries to use used length, so the patch prevents the virtio core from validating used length. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211027022107.14357-4-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01virtio_blk: correct types for status handlingMichael S. Tsirkin1-6/+8
virtblk_setup_cmd returns blk_status_t in an int, callers then assign it back to a blk_status_t variable. blk_status_t is either u32 or (more typically) u8 so it works, but is inelegant and causes sparse warnings. Pass the status in blk_status_t in a consistent way. Reported-by: kernel test robot <lkp@intel.com> Fixes: b2c5221fd074 ("virtio-blk: avoid preallocating big SGL for data") Cc: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2021-11-01virtio_blk: allow 0 as num_request_queuesMichael S. Tsirkin1-13/+1
The default value is 0 meaning "no limit". However if 0 is specified on the command line it is instead silently converted to 1. Further, the value is already validated at point of use, there's no point in duplicating code validating the value when it is set. Simplify the code while making the behaviour more consistent by using plain module_param. Fixes: 1a662cf6cb9a ("virtio-blk: add num_request_queues module parameter") Cc: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01virtio-blk: fixup coccinelle warningsYe Guojin1-1/+1
coccicheck complains about the use of snprintf() in sysfs show functions: WARNING use scnprintf or sprintf Use sysfs_emit instead of scnprintf or sprintf makes more sense. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn> Link: https://lore.kernel.org/r/20211021065111.1047824-1-ye.guojin@zte.com.cn Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-11-01virtio_blk: Fix spelling mistake: "advertisted" -> "advertised"Colin Ian King1-1/+1
There is a spelling mistake in a dev_err error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20211025102240.22801-1-colin.i.king@gmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com>
2021-11-01virtio-blk: validate num_queues during probeJason Wang1-0/+4
If an untrusted device neogitates BLK_F_MQ but advertises a zero num_queues, the driver may end up trying to allocating zero size buffers where ZERO_SIZE_PTR is returned which may pass the checking against the NULL. This will lead unexpected results. Fixing this by failing the probe in this case. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211019070152.8236-2-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-11-01virtio-blk: add num_request_queues module parameterMax Gurtovoy1-1/+22
Sometimes a user would like to control the amount of request queues to be created for a block device. For example, for limiting the memory footprint of virtio-blk devices. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20210902204622.54354-1-mgurtovoy@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01virtio-blk: avoid preallocating big SGL for dataMax Gurtovoy1-56/+100
No need to pre-allocate a big buffer for the IO SGL anymore. If a device has lots of deep queues, preallocation for the sg list can consume substantial amounts of memory. For HW virtio-blk device, nr_hw_queues can be 64 or 128 and each queue's depth might be 128. This means the resulting preallocation for the data SGLs is big. Switch to runtime allocation for SGL for lists longer than 2 entries. This is the approach used by NVMe drivers so it should be reasonable for virtio block as well. Runtime SGL allocation has always been the case for the legacy I/O path so this is nothing new. The preallocated small SGL depends on SG_CHAIN so if the ARCH doesn't support SG_CHAIN, use only runtime allocation for the SGL. Re-organize the setup of the IO request to fit the new sg chain mechanism. No performance degradation was seen (fio libaio engine with 16 jobs and 128 iodepth): IO size IOPs Rand Read (before/after) IOPs Rand Write (before/after) -------- --------------------------------- ---------------------------------- 512B 318K/316K 329K/325K 4KB 323K/321K 353K/349K 16KB 199K/208K 250K/275K 128KB 36K/36.1K 39.2K/41.7K Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Israel Rukshin <israelr@nvidia.com> Link: https://lore.kernel.org/r/20210901131434.31158-1-mgurtovoy@nvidia.com Reviewed-by: Feng Li <lifeng1519@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> # kconfig fixups Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-29block: remove blk_{get,put}_requestChristoph Hellwig1-2/+2
These are now pointless wrappers around blk_mq_{alloc,free}_request, so remove them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20211025070517.1548584-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-27virtio-blk: Use blk_validate_block_size() to validate block sizeXie Yongji1-2/+10
The block layer can't support a block size larger than page size yet. And a block size that's too small or not a power of two won't work either. If a misconfigured device presents an invalid block size in configuration space, it will result in the kernel crash something like below: [ 506.154324] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 506.160416] RIP: 0010:create_empty_buffers+0x24/0x100 [ 506.174302] Call Trace: [ 506.174651] create_page_buffers+0x4d/0x60 [ 506.175207] block_read_full_page+0x50/0x380 [ 506.175798] ? __mod_lruvec_page_state+0x60/0xa0 [ 506.176412] ? __add_to_page_cache_locked+0x1b2/0x390 [ 506.177085] ? blkdev_direct_IO+0x4a0/0x4a0 [ 506.177644] ? scan_shadow_nodes+0x30/0x30 [ 506.178206] ? lru_cache_add+0x42/0x60 [ 506.178716] do_read_cache_page+0x695/0x740 [ 506.179278] ? read_part_sector+0xe0/0xe0 [ 506.179821] read_part_sector+0x36/0xe0 [ 506.180337] adfspart_check_ICS+0x32/0x320 [ 506.180890] ? snprintf+0x45/0x70 [ 506.181350] ? read_part_sector+0xe0/0xe0 [ 506.181906] bdev_disk_changed+0x229/0x5c0 [ 506.182483] blkdev_get_whole+0x6d/0x90 [ 506.183013] blkdev_get_by_dev+0x122/0x2d0 [ 506.183562] device_add_disk+0x39e/0x3c0 [ 506.184472] virtblk_probe+0x3f8/0x79b [virtio_blk] [ 506.185461] virtio_dev_probe+0x15e/0x1d0 [virtio] So let's use a block layer helper to validate the block size. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20211026144015.188-5-xieyongji@bytedance.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-13Revert "virtio-blk: Add validation for block size in config space"Michael S. Tsirkin1-31/+6
It turns out that access to config space before completing the feature negotiation is broken for big endian guests at least with QEMU hosts up to 6.1 inclusive. This affects any device that accesses config space in the validate callback: at the moment that is virtio-net with VIRTIO_NET_F_MTU but since 82e89ea077b9 ("virtio-blk: Add validation for block size in config space") that also started affecting virtio-blk with VIRTIO_BLK_F_BLK_SIZE. Further, unlike VIRTIO_NET_F_MTU which is off by default on QEMU, VIRTIO_BLK_F_BLK_SIZE is on by default, which resulted in lots of people not being able to boot VMs on BE. The spec is very clear that what we are doing is legal so QEMU needs to be fixed, but given it's been broken for so many years and no one noticed, we need to give QEMU a bit more time before applying this. Further, this patch is incomplete (does not check blk size is a power of two) and it duplicates the logic from nbd. Revert for now, and we'll reapply a cleaner logic in the next release. Cc: stable@vger.kernel.org Fixes: 82e89ea077b9 ("virtio-blk: Add validation for block size in config space") Cc: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-11Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-2/+2
Pull virtio updates from Michael Tsirkin: - vduse driver ("vDPA Device in Userspace") supporting emulated virtio block devices - virtio-vsock support for end of record with SEQPACKET - vdpa: mac and mq support for ifcvf and mlx5 - vdpa: management netlink for ifcvf - virtio-i2c, gpio dt bindings - misc fixes and cleanups * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (39 commits) Documentation: Add documentation for VDUSE vduse: Introduce VDUSE - vDPA Device in Userspace vduse: Implement an MMU-based software IOTLB vdpa: Support transferring virtual addressing during DMA mapping vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap() vdpa: Add an opaque pointer for vdpa_config_ops.dma_map() vhost-iotlb: Add an opaque pointer for vhost IOTLB vhost-vdpa: Handle the failure of vdpa_reset() vdpa: Add reset callback in vdpa_config_ops vdpa: Fix some coding style issues file: Export receive_fd() to modules eventfd: Export eventfd_wake_count to modules iova: Export alloc_iova_fast() and free_iova_fast() virtio-blk: remove unneeded "likely" statements virtio-balloon: Use virtio_find_vqs() helper vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro vsock_test: update message bounds test for MSG_EOR af_vsock: rename variables in receive loop virtio/vsock: support MSG_EOR bit processing vhost/vsock: support MSG_EOR bit processing ...
2021-09-06virtio-blk: remove unneeded "likely" statementsMax Gurtovoy1-2/+2
Usually we use "likely/unlikely" to optimize the fast path. Remove redundant "likely/unlikely" statements in the control path to simplify the code and make it easier to read. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20210905085717.7427-1-mgurtovoy@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-08-30Merge tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-blockLinus Torvalds1-8/+8
Pull block updates from Jens Axboe: "Nothing major in here - lots of good cleanups and tech debt handling, which is also evident in the diffstats. In particular: - Add disk sequence numbers (Matteo) - Discard merge fix (Ming) - Relax disk zoned reporting restrictions (Niklas) - Bio error handling zoned leak fix (Pavel) - Start of proper add_disk() error handling (Luis, Christoph) - blk crypto fix (Eric) - Non-standard GPT location support (Dmitry) - IO priority improvements and cleanups (Damien)o - blk-throtl improvements (Chunguang) - diskstats_show() stack reduction (Abd-Alrhman) - Loop scheduler selection (Bart) - Switch block layer to use kmap_local_page() (Christoph) - Remove obsolete disk_name helper (Christoph) - block_device refcounting improvements (Christoph) - Ensure gendisk always has a request queue reference (Christoph) - Misc fixes/cleanups (Shaokun, Oliver, Guoqing)" * tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-block: (129 commits) sg: pass the device name to blk_trace_setup block, bfq: cleanup the repeated declaration blk-crypto: fix check for too-large dun_bytes blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN blk-zoned: allow zone management send operations without CAP_SYS_ADMIN block: mark blkdev_fsync static block: refine the disk_live check in del_gendisk mmc: sdhci-tegra: Enable MMC_CAP2_ALT_GPT_TEGRA mmc: block: Support alternative_gpt_sector() operation partitions/efi: Support non-standard GPT location block: Add alternative_gpt_sector() operation bio: fix page leak bio_add_hw_page failure block: remove CONFIG_DEBUG_BLOCK_EXT_DEVT block: remove a pointless call to MINOR() in device_add_disk null_blk: add error handling support for add_disk() virtio_blk: add error handling support for add_disk() block: add error handling for device_add_disk / add_disk block: return errors from disk_alloc_events block: return errors from blk_integrity_add block: call blk_register_queue earlier in device_add_disk ...
2021-08-23virtio_blk: add error handling support for add_disk()Luis Chamberlain1-1/+6
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20210818144542.19305-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-16virtio_blk: use bvec_virtChristoph Hellwig1-5/+2
Use bvec_virt instead of open coding it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/20210804095634.460779-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-11virtio-blk: Add validation for block size in config spaceXie Yongji1-6/+33
An untrusted device might presents an invalid block size in configuration space. This tries to add validation for it in the validate callback and clear the VIRTIO_BLK_F_BLK_SIZE feature bit if the value is out of the supported range. And we also double check the value in virtblk_probe() in case that it's changed after the validation. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210809101609.148-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2021-07-03virtio-blk: limit seg_max to a safe valueStefan Hajnoczi1-1/+7
The struct virtio_blk_config seg_max value is read from the device and incremented by 2 to account for the request header and status byte descriptors added by the driver. In preparation for supporting untrusted virtio-blk devices, protect against integer overflow and limit the value to a safe maximum. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/20210524154020.98195-1-stefanha@redhat.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-03virtio-blk: Fix memory leak among suspend/resume procedureXie Yongji1-0/+2
The vblk->vqs should be freed before we call init_vqs() in virtblk_restore(). Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210517084332.280-1-xieyongji@bytedance.com Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-03virtio_blk: cleanups: remove check obsoleted by CONFIG_LBDAF removalSohaib1-7/+0
Prior to 72deb455b5ec ("block: remove CONFIG_LBDAF"), it was optional if the 32-bit kernel support block device and/or file sizes larger than 2 TiB (considering the sector size is 512 bytes) But now sector_t and blkcnt_t are always 64-bit in size. Suggested-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: Sohaib Mohammed <sohaib.amhmd@gmail.com> Link: https://lore.kernel.org/r/20210430103611.77345-1-sohaib.amhmd@gmail.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-06-11virtio-blk: use blk_mq_alloc_diskChristoph Hellwig1-19/+7
Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210602065345.355274-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-25Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-4/+7
Pull virtio updates from Michael Tsirkin: - new vdpa features to allow creation and deletion of new devices - virtio-blk support per-device queue depth - fixes, cleanups all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (31 commits) virtio-input: add multi-touch support virtio_mmio: fix one typo vdpa/mlx5: fix param validation in mlx5_vdpa_get_config() virtio_net: Fix fall-through warnings for Clang virtio_input: Prevent EV_MSC/MSC_TIMESTAMP loop storm for MT. virtio-blk: support per-device queue depth virtio_vdpa: don't warn when fail to disable vq virtio-pci: introduce modern device module virito-pci-modern: rename map_capability() to vp_modern_map_capability() virtio-pci-modern: introduce helper to get notification offset virtio-pci-modern: introduce helper for getting queue nums virtio-pci-modern: introduce helper for setting/geting queue size virtio-pci-modern: introduce helper to set/get queue_enable virtio-pci-modern: introduce vp_modern_queue_address() virtio-pci-modern: introduce vp_modern_set_queue_vector() virtio-pci-modern: introduce vp_modern_generation() virtio-pci-modern: introduce helpers for setting and getting features virtio-pci-modern: introduce helpers for setting and getting status virtio-pci-modern: introduce helper to set config vector virtio-pci-modern: introduce vp_modern_remove() ...
2021-02-23virtio-blk: support per-device queue depthJoseph Qi1-4/+7
module parameter 'virtblk_queue_depth' was firstly introduced for testing/benchmarking purposes described in commit fc4324b4597c ("virtio-blk: base queue-depth on virtqueue ringsize or module param"). And currently 'virtblk_queue_depth' is used as a saved value for the first probed device. Since we have different virtio-blk devices which have different capabilities, it requires that we support per-device queue depth instead of per-module. So defaultly use vq free elements if module parameter 'virtblk_queue_depth' is not set. Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/1611307306-71067-1-git-send-email-joseph.qi@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-01-24block: remove unnecessary argument from blk_execute_rqGuoqing Jiang1-1/+1
We can remove 'q' from blk_execute_rq as well after the previous change in blk_execute_rq_nowait. And more importantly it never really was needed to start with given that we can trivial derive it from struct request. Cc: linux-scsi@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: linux-ide@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: linux-nfs@vger.kernel.org Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-11-16virtio-blk: remove a spurious call to revalidate_disk_sizeChristoph Hellwig1-1/+0
revalidate_disk_size just updates the block device size from the disk size. Thus calling it from virtblk_update_cache_mode doesn't actually do anything. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-11-16block: remove the update_bdev parameter to set_capacity_revalidate_and_notifyChristoph Hellwig1-1/+1
The update_bdev argument is always set to true, so remove it. Also rename the function to the slighly less verbose set_capacity_and_notify, as propagating the disk size to the block device isn't really revalidation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02block: add a new revalidate_disk_size helperChristoph Hellwig1-1/+1
revalidate_disk is a relative awkward helper for driver use, as it first calls an optional driver method and then updates the block device size, while most callers either don't need the method call at all, or want to keep state between the caller and the called method. Add a revalidate_disk_size helper that just performs the update of the block device size from the gendisk one, and switch all drivers that do not implement ->revalidate_disk to use the new helper instead of revalidate_disk() Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01virtio-blk: Use kobj_to_dev() instead of container_of()Tian Tao1-1/+1
Use kobj_to_dev() instead of container_of() Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-17block: virtio_blk: fix handling single range discard requestMing Lei1-8/+23
1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts to support multi-range discard for virtio-blk. However, the virtio-blk disk may report max discard segment as 1, at least that is exactly what qemu is doing. So far, block layer switches to normal request merge if max discard segment limit is 1, and multiple bios can be merged to single segment. This way may cause memory corruption in virtblk_setup_discard_write_zeroes(). Fix the issue by handling single max discard segment in straightforward way. Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Changpeng Liu <changpeng.liu@intel.com> Cc: Daniel Verkamp <dverkamp@chromium.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-03Merge tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+2
Pull core block updates from Jens Axboe: "Good amount of cleanups and tech debt removals in here, and as a result, the diffstat shows a nice net reduction in code. - Softirq completion cleanups (Christoph) - Stop using ->queuedata (Christoph) - Cleanup bd claiming (Christoph) - Use check_events, moving away from the legacy media change (Christoph) - Use inode i_blkbits consistently (Christoph) - Remove old unused writeback congestion bits (Christoph) - Cleanup/unify submission path (Christoph) - Use bio_uninit consistently, instead of bio_disassociate_blkg (Christoph) - sbitmap cleared bits handling (John) - Request merging blktrace event addition (Jan) - sysfs add/remove race fixes (Luis) - blk-mq tag fixes/optimizations (Ming) - Duplicate words in comments (Randy) - Flush deferral cleanup (Yufen) - IO context locking/retry fixes (John) - struct_size() usage (Gustavo) - blk-iocost fixes (Chengming) - blk-cgroup IO stats fixes (Boris) - Various little fixes" * tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits) block: blk-timeout: delete duplicated word block: blk-mq-sched: delete duplicated word block: blk-mq: delete duplicated word block: genhd: delete duplicated words block: elevator: delete duplicated word and fix typos block: bio: delete duplicated words block: bfq-iosched: fix duplicated word iocost_monitor: start from the oldest usage index iocost: Fix check condition of iocg abs_vdebt block: Remove callback typedefs for blk_mq_ops block: Use non _rcu version of list functions for tag_set_list blk-cgroup: show global disk stats in root cgroup io.stat blk-cgroup: make iostat functions visible to stat printing block: improve discard bio alignment in __blkdev_issue_discard() block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers block: defer flush request no matter whether we have elevator block: make blk_timeout_init() static block: remove retry loop in ioc_release_fn() block: remove unnecessary ioc nested locking block: integrate bd_start_claiming into __blkdev_get ...
2020-06-30virtio-blk: free vblk-vqs in error path of virtblk_probe()Hou Tao1-0/+1
Else there will be memory leak if alloc_disk() fails. Fixes: 6a27b656fc02 ("block: virtio-blk: support multi virt queues per virtio-blk device") Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-24blk-mq: move failure injection out of blk_mq_complete_requestChristoph Hellwig1-1/+2
Move the call to blk_should_fake_timeout out of blk_mq_complete_request and into the drivers, skipping call sites that are obvious error handlers, and remove the now superflous blk_mq_force_complete_rq helper. This ensures we don't keep injecting errors into completions that just terminate the Linux request after the hardware has been reset or the command has been aborted. Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-02virtio-blk: handle block_device_operations callbacks after hot unplugStefan Hajnoczi1-8/+78
A userspace process holding a file descriptor to a virtio_blk device can still invoke block_device_operations after hot unplug. This leads to a use-after-free accessing vblk->vdev in virtblk_getgeo() when ioctl(HDIO_GETGEO) is invoked: BUG: unable to handle kernel NULL pointer dereference at 0000000000000090 IP: [<ffffffffc00e5450>] virtio_check_driver_offered_feature+0x10/0x90 [virtio] PGD 800000003a92f067 PUD 3a930067 PMD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 1310 Comm: hdio-getgeo Tainted: G OE ------------ 3.10.0-1062.el7.x86_64 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 task: ffff9be5fbfb8000 ti: ffff9be5fa890000 task.ti: ffff9be5fa890000 RIP: 0010:[<ffffffffc00e5450>] [<ffffffffc00e5450>] virtio_check_driver_offered_feature+0x10/0x90 [virtio] RSP: 0018:ffff9be5fa893dc8 EFLAGS: 00010246 RAX: ffff9be5fc3f3400 RBX: ffff9be5fa893e30 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff9be5fbc10b40 RBP: ffff9be5fa893dc8 R08: 0000000000000301 R09: 0000000000000301 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9be5fdc24680 R13: ffff9be5fbc10b40 R14: ffff9be5fbc10480 R15: 0000000000000000 FS: 00007f1bfb968740(0000) GS:ffff9be5ffc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000090 CR3: 000000003a894000 CR4: 0000000000360ff0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: [<ffffffffc016ac37>] virtblk_getgeo+0x47/0x110 [virtio_blk] [<ffffffff8d3f200d>] ? handle_mm_fault+0x39d/0x9b0 [<ffffffff8d561265>] blkdev_ioctl+0x1f5/0xa20 [<ffffffff8d488771>] block_ioctl+0x41/0x50 [<ffffffff8d45d9e0>] do_vfs_ioctl+0x3a0/0x5a0 [<ffffffff8d45dc81>] SyS_ioctl+0xa1/0xc0 A related problem is that virtblk_remove() leaks the vd_index_ida index when something still holds a reference to vblk->disk during hot unplug. This causes virtio-blk device names to be lost (vda, vdb, etc). Fix these issues by protecting vblk->vdev with a mutex and reference counting vblk so the vd_index_ida index can be removed in all cases. Fixes: 48e4043d4529 ("virtio: add virtio disk geometry feature") Reported-by: Lance Digby <ldigby@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/20200430140442.171016-1-stefanha@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2020-04-17virtio_blk: add a missing includeMichael S. Tsirkin1-0/+1
virtio_blk uses VIRTIO_RING_F_INDIRECT_DESC, pull in the header defining that value. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-03-30Merge tag 'for-5.7/block-2020-03-29' of git://git.kernel.dk/linux-blockLinus Torvalds1-4/+1
Pull block updates from Jens Axboe: - Online capacity resizing (Balbir) - Number of hardware queue change fixes (Bart) - null_blk fault injection addition (Bart) - Cleanup of queue allocation, unifying the node/no-node API (Christoph) - Cleanup of genhd, moving code to where it makes sense (Christoph) - Cleanup of the partition handling code (Christoph) - disk stat fixes/improvements (Konstantin) - BFQ improvements (Paolo) - Various fixes and improvements * tag 'for-5.7/block-2020-03-29' of git://git.kernel.dk/linux-block: (72 commits) block: return NULL in blk_alloc_queue() on error block: move bio_map_* to blk-map.c Revert "blkdev: check for valid request queue before issuing flush" block: simplify queue allocation bcache: pass the make_request methods to blk_queue_make_request null_blk: use blk_mq_init_queue_data block: add a blk_mq_init_queue_data helper block: move the ->devnode callback to struct block_device_operations block: move the part_stat* helpers from genhd.h to a new header block: move block layer internals out of include/linux/genhd.h block: move guard_bio_eod to bio.c block: unexport get_gendisk block: unexport disk_map_sector_rcu block: unexport disk_get_part block: mark part_in_flight and part_in_flight_rw static block: mark block_depr static block: factor out requeue handling from dispatch code block/diskstats: replace time_in_queue with sum of request times block/diskstats: accumulate all per-cpu counters in one pass block/diskstats: more accurate approximation of io_ticks for slow disks ...
2020-03-18virtio_blk.c: Convert to use set_capacity_revalidate_and_notifyBalbir Singh1-4/+1
block/genhd provides set_capacity_revalidate_and_notify() for sending RESIZE notifications via uevents. Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-08virtio-blk: improve virtqueue error to BLK_STSHalil Pasic1-2/+7
Let's change the mapping between virtqueue_add errors to BLK_STS statuses, so that -ENOSPC, which indicates virtqueue full is still mapped to BLK_STS_DEV_RESOURCE, but -ENOMEM which indicates non-device specific resource outage is mapped to BLK_STS_RESOURCE. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Link: https://lore.kernel.org/r/20200213123728.61216-3-pasic@linux.ibm.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-03-08virtio-blk: fix hw_queue stopped on arbitrary errorHalil Pasic1-3/+5
Since nobody else is going to restart our hw_queue for us, the blk_mq_start_stopped_hw_queues() is in virtblk_done() is not sufficient necessarily sufficient to ensure that the queue will get started again. In case of global resource outage (-ENOMEM because mapping failure, because of swiotlb full) our virtqueue may be empty and we can get stuck with a stopped hw_queue. Let us not stop the queue on arbitrary errors, but only on -EONSPC which indicates a full virtqueue, where the hw_queue is guaranteed to get started by virtblk_done() before when it makes sense to carry on submitting requests. Let us also remove a stale comment. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Cc: Jens Axboe <axboe@kernel.dk> Fixes: f7728002c1c7 ("virtio_ring: fix return code on DMA mapping fails") Link: https://lore.kernel.org/r/20200213123728.61216-2-pasic@linux.ibm.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-06virtio-blk: remove VIRTIO_BLK_F_SCSI supportChristoph Hellwig1-114/+1
Since the need for a special flag to support SCSI passthrough on a block device was added in May 2017 the SCSI passthrough support in virtio-blk has been disabled. It has always been a bad idea (just ask the original author..) and we have virtio-scsi for proper passthrough. The feature also never made it into the virtio 1.0 or later specifications. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-05-21treewide: Add SPDX license identifier for more missed filesThomas Gleixner1-0/+1
Add SPDX license identifiers to all files which: - Have no license information of any form - Have MODULE_LICENCE("GPL*") inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>