summaryrefslogtreecommitdiffstats
path: root/block/partitions/core.c
AgeCommit message (Collapse)AuthorFilesLines
2022-04-18block: turn bdev->bd_openers into an atomic_tChristoph Hellwig1-1/+1
All manipulation of bd_openers is under disk->open_mutex and will remain so for the foreseeable future. But at least one place reads it without the lock (blkdev_get) and there are more to be added. So make sure the compiler does not do turn the increments and decrements into non-atomic sequences by using an atomic_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220330052917.2566582-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17block: use bdev_discard_alignment in part_discard_alignment_showChristoph Hellwig1-5/+1
Use the bdev based alignment helper instead of open coding it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220415045258.199825-21-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-17block: use bdev_alignment_offset in part_alignment_offset_showChristoph Hellwig1-5/+1
Replace the open coded offset calculation with the proper helper. This is an ABI change in that the -1 for a misaligned partition is properly propagated, which can be considered a bug fix and matches what is done on the whole device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-17-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-02block: remove genhd.hChristoph Hellwig1-1/+0
There is no good reason to keep genhd.h separate from the main blkdev.h header that includes it. So fold the contents of genhd.h into blkdev.h and remove genhd.h entirely. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220124093913.742411-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-29block: remove GENHD_FL_EXT_DEVTChristoph Hellwig1-5/+4
All modern drivers can support extra partitions using the extended dev_t. In fact except for the ioctl method drivers never even see partitions in normal operation. So remove the GENHD_FL_EXT_DEVT and allow extra partitions for all block devices that do support partitions, and require those that do not support partitions to explicit disallow them using GENHD_FL_NO_PART. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211122130625.1136848-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-29block: move GENHD_FL_NATIVE_CAPACITY to disk->stateChristoph Hellwig1-9/+6
The flag to indicate an unlocked native capacity is dynamic state, not a driver capability flag, so move it to disk->state. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211122130625.1136848-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-01Merge tag 'for-5.16/bdev-size-2021-10-29' of git://git.kernel.dk/linux-blockLinus Torvalds1-0/+1
Pull bdev size cleanups from Jens Axboe: "Clean up the bdev size handling with new bdev_nr_bytes() helper" * tag 'for-5.16/bdev-size-2021-10-29' of git://git.kernel.dk/linux-block: (34 commits) partitions/ibm: use bdev_nr_sectors instead of open coding it partitions/efi: use bdev_nr_bytes instead of open coding it block/ioctl: use bdev_nr_sectors and bdev_nr_bytes block: cache inode size in bdev udf: use sb_bdev_nr_blocks reiserfs: use sb_bdev_nr_blocks ntfs: use sb_bdev_nr_blocks jfs: use sb_bdev_nr_blocks ext4: use sb_bdev_nr_blocks block: add a sb_bdev_nr_blocks helper block: use bdev_nr_bytes instead of open coding it in blkdev_fallocate squashfs: use bdev_nr_bytes instead of open coding it reiserfs: use bdev_nr_bytes instead of open coding it pstore/blk: use bdev_nr_bytes instead of open coding it ntfs3: use bdev_nr_bytes instead of open coding it nilfs2: use bdev_nr_bytes instead of open coding it nfs/blocklayout: use bdev_nr_bytes instead of open coding it jfs: use bdev_nr_bytes instead of open coding it hfsplus: use bdev_nr_sectors instead of open coding it hfs: use bdev_nr_sectors instead of open coding it ...
2021-11-01Merge tag 'for-5.16/block-2021-10-29' of git://git.kernel.dk/linux-blockLinus Torvalds1-2/+3
Pull block updates from Jens Axboe: - mq-deadline accounting improvements (Bart) - blk-wbt timer fix (Andrea) - Untangle the block layer includes (Christoph) - Rework the poll support to be bio based, which will enable adding support for polling for bio based drivers (Christoph) - Block layer core support for multi-actuator drives (Damien) - blk-crypto improvements (Eric) - Batched tag allocation support (me) - Request completion batching support (me) - Plugging improvements (me) - Shared tag set improvements (John) - Concurrent queue quiesce support (Ming) - Cache bdev in ->private_data for block devices (Pavel) - bdev dio improvements (Pavel) - Block device invalidation and block size improvements (Xie) - Various cleanups, fixes, and improvements (Christoph, Jackie, Masahira, Tejun, Yu, Pavel, Zheng, me) * tag 'for-5.16/block-2021-10-29' of git://git.kernel.dk/linux-block: (174 commits) blk-mq-debugfs: Show active requests per queue for shared tags block: improve readability of blk_mq_end_request_batch() virtio-blk: Use blk_validate_block_size() to validate block size loop: Use blk_validate_block_size() to validate block size nbd: Use blk_validate_block_size() to validate block size block: Add a helper to validate the block size block: re-flow blk_mq_rq_ctx_init() block: prefetch request to be initialized block: pass in blk_mq_tags to blk_mq_rq_ctx_init() block: add rq_flags to struct blk_mq_alloc_data block: add async version of bio_set_polled block: kill DIO_MULTI_BIO block: kill unused polling bits in __blkdev_direct_IO() block: avoid extra iter advance with async iocb block: Add independent access ranges support blk-mq: don't issue request directly in case that current is to be blocked sbitmap: silence data race warning blk-cgroup: synchronize blkg creation against policy deactivation block: refactor bio_iov_bvec_set() block: add single bio async direct IO helper ...
2021-10-18block: cache inode size in bdevJens Axboe1-0/+1
Reading the inode size brings in a new cacheline for IO submit, and it's in the hot path being checked for every single IO. When doing millions of IOs per core per second, this is noticeable overhead. Cache the nr_sectors in the bdev itself. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: fix incorrect references to disk objectsZqiang1-0/+1
When adding partitions to the disk, the reference count of the disk object is increased. then alloc partition device and called device_add(), if the device_add() return error, the reference count of the disk object will be reduced twice, at put_device(pdev) and put_disk(disk). this leads to the end of the object's life cycle prematurely, and trigger following calltrace. __init_work+0x2d/0x50 kernel/workqueue.c:519 synchronize_rcu_expedited+0x3af/0x650 kernel/rcu/tree_exp.h:847 bdi_remove_from_list mm/backing-dev.c:938 [inline] bdi_unregister+0x17f/0x5c0 mm/backing-dev.c:946 release_bdi+0xa1/0xc0 mm/backing-dev.c:968 kref_put include/linux/kref.h:65 [inline] bdi_put+0x72/0xa0 mm/backing-dev.c:976 bdev_free_inode+0x11e/0x220 block/bdev.c:408 i_callback+0x3f/0x70 fs/inode.c:226 rcu_do_batch kernel/rcu/tree.c:2508 [inline] rcu_core+0x76d/0x16c0 kernel/rcu/tree.c:2743 __do_softirq+0x1d7/0x93b kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0xf2/0x130 kernel/softirq.c:648 sysvec_apic_timer_interrupt+0x93/0xc0 making disk is NULL when calling put_disk(). Reported-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Zqiang <qiang.zhang1211@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211018103422.2043-1-qiang.zhang1211@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: convert the rest of block to bdev_get_queuePavel Begunkov1-2/+2
Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/addf6ea988c04213697ba3684c853e4ed7642a39.1634219547.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18block: drop unused includes in <linux/genhd.h>Christoph Hellwig1-0/+1
Drop various include not actually used in genhd.h itself, and move the remaning includes closer together. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-16block: free the extended dev_t minor laterChristoph Hellwig1-2/+0
The dev_t is used as the inode hash, so we should only released it once then block device inode is gone from the inode cache. Move it to bdev_free_inode to ensure that. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210816122614.601358-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12block: pass a gendisk to bdev_resize_partitionChristoph Hellwig1-6/+6
bdev_resize_partition can only operate on the whole device. Make that clear by passing a gendisk instead of a block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210810154512.1809898-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12block: pass a gendisk to bdev_del_partitionChristoph Hellwig1-4/+4
bdev_del_partition can only operate on the whole device. Make that clear by passing a gendisk instead of a block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210810154512.1809898-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12block: pass a gendisk to bdev_add_partitionChristoph Hellwig1-3/+2
bdev_add_partition can only operate on the whole device. Make that clear by passing a gendisk instead of a block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210810154512.1809898-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12block: store a gendisk in struct parsed_partitionsChristoph Hellwig1-3/+3
Partition scanning only happens on the whole device, so pass a struct gendisk instead of the whole device block_device to the scanners. This allows to simplify printing the device name in various places as the disk name is available in disk->name. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20210810154512.1809898-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12block: remove GENHD_FL_UPChristoph Hellwig1-2/+2
Just check inode_unhashed on the whole device bdev inode instead, and provide a helper to check for that information. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210809064028.1198327-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02block: simplify disk name formatting in check_partitionChristoph Hellwig1-1/+1
disk_name for partition 0 just copies out the disk_name field. Replace the call to disk_name with a %s format specifier. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20210727062518.122108-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02block: remove bdputChristoph Hellwig1-1/+1
Now that we've stopped using inode references for anything meaninful in the block layer get rid of the helper to put it and just open code the call to iput on the block_device inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com> Link: https://lore.kernel.org/r/20210722075402.983367-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02block: change the refcounting for partitionsChristoph Hellwig1-1/+8
Instead of acquiring an inode reference on open make sure partitions always hold device model references to the disk while alive, and switch open to grab only a device model reference to the opened block device. If that is a partition the disk reference is transitively held by the partition already. Link: https://lore.kernel.org/r/20210722075402.983367-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02block: allocate bd_meta_info later in add_partitionsChristoph Hellwig1-10/+7
Move the allocation of bd_meta_info after initializing the struct device to avoid the special bdput error handling path. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210722075402.983367-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02block: assert the locking state in delete_partitionChristoph Hellwig1-4/+2
Add a lockdep assert instead of the outdated locking comment. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com> Link: https://lore.kernel.org/r/20210722075402.983367-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-01block: remove the bdgrab in blk_drop_partitionsChristoph Hellwig1-5/+1
There is no need to hold a bdev reference when removing the partition. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210701081638.246552-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-30block: check disk exist before trying to add partitionYufen Yu1-7/+16
If disk have been deleted, we should return fail for ioctl BLKPG_DEL_PARTITION. Otherwise, the directory /sys/class/block may remain invalid symlinks file. The race as following: blkdev_open del_gendisk disk->flags &= ~GENHD_FL_UP; blk_drop_partitions blkpg_ioctl bdev_add_partition add_partition device_add device_add_class_symlinks ioctl may add_partition after del_gendisk() have tried to delete partitions. Then, symlinks file will be created. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yufen Yu <yuyufen@huawei.com> Link: https://lore.kernel.org/r/20210610023241.3646241-1-yuyufen@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-24block: pass a gendisk to bdev_disk_changedChristoph Hellwig1-12/+10
bdev_disk_changed can only operate on whole devices. Make that clear by passing a gendisk instead of the struct block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210624123240.441814-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-24block: move bdev_disk_changedChristoph Hellwig1-1/+54
Move bdev_disk_changed to block/partitions/core.c, together with the rest of the partition scanning code. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210624123240.441814-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: remove bdget_diskChristoph Hellwig1-13/+12
Just opencode the xa_load in the callers, as none of them actually needs a reference to the bdev. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210525061301.2242282-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: move bd_mutex to struct gendiskChristoph Hellwig1-14/+10
Replace the per-block device bd_mutex with a per-gendisk open_mutex, thus simplifying locking wherever we deal with partitions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210525061301.2242282-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: automatically enable GENHD_FL_EXT_DEVTChristoph Hellwig1-4/+0
Automatically set the GENHD_FL_EXT_DEVT flag for all disks allocated without an explicit number of minors. This is what all new block drivers should do, so make sure it is the default without boilerplate code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: refactor device number setup in __device_add_diskChristoph Hellwig1-4/+11
Untangle the mess around blk_alloc_devt by moving the check for the used allocation scheme into the callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-28Merge tag 'for-5.13/block-2021-04-27' of git://git.kernel.dk/linux-blockLinus Torvalds1-31/+23
Pull block updates from Jens Axboe: "Pretty quiet round this time, which is nice. In detail: - Series revamping bounce buffer support (Christoph) - Dead code removal (Christoph, Bart) - Partition iteration revamp, now using xarray (Christoph) - Passthrough request scheduler improvements (Lin) - Series of BFQ improvements (Paolo) - Fix ioprio task iteration (Peter) - Various little tweaks and fixes (Tejun, Saravanan, Bhaskar, Max, Nikolay)" * tag 'for-5.13/block-2021-04-27' of git://git.kernel.dk/linux-block: (41 commits) blk-iocost: don't ignore vrate_min on QD contention blk-mq: Fix spurious debugfs directory creation during initialization bfq/mq-deadline: remove redundant check for passthrough request blk-mq: bypass IO scheduler's limit_depth for passthrough request block: Remove an obsolete comment from sg_io() block: move bio_list_copy_data to pktcdvd block: remove zero_fill_bio_iter block: add queue_to_disk() to get gendisk from request_queue block: remove an incorrect check from blk_rq_append_bio block: initialize ret in bdev_disk_changed block: Fix sys_ioprio_set(.which=IOPRIO_WHO_PGRP) task iteration block: remove disk_part_iter block: simplify diskstats_show block: simplify show_partition block: simplify printk_all_partitions block: simplify partition_overlaps block: simplify partition removal block: take bd_mutex around delete_partitions in del_gendisk block: refactor blk_drop_partitions block: move more syncing and invalidation to delete_partition ...
2021-04-08block: simplify partition_overlapsChristoph Hellwig1-10/+10
Just use xa_for_each to iterate over the partitions as there is no need to grab a reference to each partition. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210406062303.811835-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-08block: simplify partition removalChristoph Hellwig1-4/+6
Always look up the first available entry instead of the complicated stateful traversal. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210406062303.811835-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-08block: take bd_mutex around delete_partitions in del_gendiskChristoph Hellwig1-0/+2
There is nothing preventing an ioctl from trying do delete partition concurrenly with del_gendisk, so take open_mutex to serialize against that. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210406062303.811835-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-08block: refactor blk_drop_partitionsChristoph Hellwig1-11/+3
Move the busy check and disk-wide sync into the only caller, so that the remainder can be shared with del_gendisk. Also pass the gendisk instead of the bdev as that is all that is needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210406062303.811835-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-08block: move more syncing and invalidation to delete_partitionChristoph Hellwig1-3/+3
Move the calls to fsync_bdev and __invalidate_device from del_gendisk to delete_partition. For the other two callers that check that there are no openers for the delete partitions(s) the callouts are a no-op as no file system can be mounted, but this keeps all the cleanup in one place. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210406062303.811835-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-08dasd: use bdev_disk_changed instead of blk_drop_partitionsChristoph Hellwig1-4/+0
Use the more general interface - the behavior is the same except that now a change uevent is sent, which is the right thing to do when the device becomes unusable. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20210406062303.811835-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-27block: don't create too many partitionsMing Lei1-0/+7
Commit a33df75c6328 ("block: use an xarray for disk->part_tbl") drops the check on max supported number of partitionsr, and allows partition with bigger partition numbers to be added. However, ->bd_partno is defined as u8, so partition index of xarray table may not match with ->bd_partno. Then delete_partition() may delete one unmatched partition, and caused use-after-free. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reported-by: syzbot+8fede7e30c7cee0de139@syzkaller.appspotmail.com Fixes: a33df75c6328 ("block: use an xarray for disk->part_tbl") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-28block: revert "block: fix bd_size_lock use"Damien Le Moal1-4/+2
With the removal of the skd driver, using IRQ safe locking of a bdev bd_size_lock spinlock to protect the bdev inode size is not necessary anymore as there is no other known driver using this lock under an IRQ disabled context (e.g. calling set_capacity() with IRQ disabled). Revert commit 0fe37724f8e7 ("block: fix bd_size_lock use") which introduced the IRQ safe change. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-21Merge tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-blockLinus Torvalds1-28/+8
Pull core block updates from Jens Axboe: "Another nice round of removing more code than what is added, mostly due to Christoph's relentless pursuit of tech debt removal/cleanups. This pull request contains: - Two series of BFQ improvements (Paolo, Jan, Jia) - Block iov_iter improvements (Pavel) - bsg error path fix (Pan) - blk-mq scheduler improvements (Jan) - -EBUSY discard fix (Jan) - bvec allocation improvements (Ming, Christoph) - bio allocation and init improvements (Christoph) - Store bdev pointer in bio instead of gendisk + partno (Christoph) - Block trace point cleanups (Christoph) - hard read-only vs read-only split (Christoph) - Block based swap cleanups (Christoph) - Zoned write granularity support (Damien) - Various fixes/tweaks (Chunguang, Guoqing, Lei, Lukas, Huhai)" * tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block: (104 commits) mm: simplify swapdev_block sd_zbc: clear zone resources for non-zoned case block: introduce blk_queue_clear_zone_settings() zonefs: use zone write granularity as block size block: introduce zone_write_granularity limit block: use blk_queue_set_zoned in add_partition() nullb: use blk_queue_set_zoned() to setup zoned devices nvme: cleanup zone information initialization block: document zone_append_max_bytes attribute block: use bi_max_vecs to find the bvec pool md/raid10: remove dead code in reshape_request block: mark the bio as cloned in bio_iov_bvec_set block: set BIO_NO_PAGE_REF in bio_iov_bvec_set block: remove a layer of indentation in bio_iov_iter_get_pages block: turn the nr_iovecs argument to bio_alloc* into an unsigned short block: remove the 1 and 4 vec bvec_slabs entries block: streamline bvec_alloc block: factor out a bvec_alloc_gfp helper block: move struct biovec_slab to bio.c block: reuse BIO_INLINE_VECS for integrity bvecs ...
2021-02-10block: use blk_queue_set_zoned in add_partition()Damien Le Moal1-1/+1
When changing the zoned model of host-aware zoned block devices, use blk_queue_set_zoned() instead of directly assigning the gendisk queue zoned limit. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@edc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-01-28block: fix bd_size_lock useDamien Le Moal1-2/+4
Some block device drivers, e.g. the skd driver, call set_capacity() with IRQ disabled. This results in lockdep ito complain about inconsistent lock states ("inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage") because set_capacity takes a block device bd_size_lock using the functions spin_lock() and spin_unlock(). Ensure a consistent locking state by replacing these calls with spin_lock_irqsave() and spin_lock_irqrestore(). The same applies to bdev_set_nr_sectors(). With this fix, all lockdep complaints are resolved. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-01-24block: Fix an error handling in add_partitionDinghao Liu1-1/+1
Once we have called device_initialize(), we should use put_device() to give up the reference on error, just like what we have done on failure of device_add(). Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-01-24block: use an xarray for disk->part_tblChristoph Hellwig1-25/+6
Now that no fast path lookups in the partition table are left, there is no point in micro-optimizing the data structure for it. Just use a bog standard xarray. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-01-24block: add a hard-readonly flag to struct gendiskChristoph Hellwig1-2/+1
Commit 20bd1d026aac ("scsi: sd: Keep disk read-only when re-reading partition") addressed a long-standing problem with user read-only policy being overridden as a result of a device-initiated revalidate. The commit has since been reverted due to a regression that left some USB devices read-only indefinitely. To fix the underlying problems with revalidate we need to keep track of hardware state and user policy separately. The gendisk has been updated to reflect the current hardware state set by the device driver. This is done to allow returning the device to the hardware state once the user clears the BLKROSET flag. The resulting semantics are as follows: - If BLKROSET sets a given partition read-only, that partition will remain read-only even if the underlying storage stack initiates a revalidate. However, the BLKRRPART ioctl will cause the partition table to be dropped and any user policy on partitions will be lost. - If BLKROSET has not been set, both the whole disk device and any partitions will reflect the current write-protect state of the underlying device. Based on a patch from Martin K. Petersen <martin.petersen@oracle.com>. Reported-by: Oleksii Kurochko <olkuroch@cisco.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201221 Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-22block: update some copyrightsChristoph Hellwig1-0/+1
Update copyrights for files that have gotten some major rewrites lately. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-01block: merge struct block_device and struct hd_structChristoph Hellwig1-72/+44
Instead of having two structures that represent each block device with different life time rules, merge them into a single one. This also greatly simplifies the reference counting rules, as we can use the inode reference count as the main reference count for the new struct block_device, with the device model reference front ending it for device model interaction. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-01block: switch disk_part_iter_* to use a struct block_deviceChristoph Hellwig1-7/+6
Switch the partition iter infrastructure to iterate over block_device references instead of hd_struct ones mostly used to get at the block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-01block: pass a block_device to blk_alloc_devtChristoph Hellwig1-1/+1
Pass the block_device actually needed instead of the hd_struct. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>