summaryrefslogtreecommitdiffstats
path: root/drivers/md
AgeCommit message (Collapse)AuthorFilesLines
2016-07-26Merge tag 'dm-4.8-changes' of ↵Linus Torvalds25-1808/+4394
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - initially based on Jens' 'for-4.8/core' (given all the flag churn) and later merged with 'for-4.8/core' to pickup the QUEUE_FLAG_DAX commits that DM depends on to provide its DAX support - clean up the bio-based vs request-based DM core code by moving the request-based DM core code out to dm-rq.[hc] - reinstate bio-based support in the DM multipath target (done with the idea that fast storage like NVMe over Fabrics could benefit) -- while preserving support for request_fn and blk-mq request-based DM mpath - SCSI and DM multipath persistent reservation fixes that were coordinated with Martin Petersen. - the DM raid target saw the most extensive change this cycle; it now provides reshape and takeover support (by layering ontop of the corresponding MD capabilities) - DAX support for DM core and the linear, stripe and error targets - a DM thin-provisioning block discard vs allocation race fix that addresses potential for corruption - a stable fix for DM verity-fec's block calculation during decode - a few cleanups and fixes to DM core and various targets * tag 'dm-4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (73 commits) dm: allow bio-based table to be upgraded to bio-based with DAX support dm snap: add fake origin_direct_access dm stripe: add DAX support dm error: add DAX support dm linear: add DAX support dm: add infrastructure for DAX support dm thin: fix a race condition between discarding and provisioning a block dm btree: fix a bug in dm_btree_find_next_single() dm raid: fix random optimal_io_size for raid0 dm raid: address checkpatch.pl complaints dm: call PR reserve/unreserve on each underlying device sd: don't use the ALL_TG_PT bit for reservations dm: fix second blk_delay_queue() parameter to be in msec units not jiffies dm raid: change logical functions to actually return bool dm raid: use rdev_for_each in status dm raid: use rs->raid_disks to avoid memory leaks on free dm raid: support delta_disks for raid1, fix table output dm raid: enhance reshape check and factor out reshape setup dm raid: allow resize during recovery dm raid: fix rs_is_recovering() to allow for lvextend ...
2016-07-20dm: allow bio-based table to be upgraded to bio-based with DAX supportToshi Kani1-1/+10
Allow table type DM_TYPE_BIO_BASED to extend with DM_TYPE_DAX_BIO_BASED since DM_TYPE_DAX_BIO_BASED supports bio-based requests. This is needed to allow a snapshot of an LV with DAX support to be removed. One of the intermediate table reloads that lvm2 does switches from DM_TYPE_BIO_BASED to DM_TYPE_DAX_BIO_BASED. No known reason to disallow this so... Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-20dm snap: add fake origin_direct_accessToshi Kani1-0/+8
dax-capable mapped-device is marked as DM_TYPE_DAX_BIO_BASED, which supports both dax and bio-based operations. dm-snap needs to work with dax-capable device when bio-based operation is used. Add fake origin_direct_access() to origin device so that its origin device is also marked as DM_TYPE_DAX_BIO_BASED for dax-capable device. This allows to extend target's DM table. dm-snap works normally when bio-based operation is used. dm-snap does not support dax operation, and mount with dax option to a target device or snapshot device fails. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-20dm stripe: add DAX supportToshi Kani1-1/+25
Change dm-stripe to implement direct_access function, stripe_direct_access(), which maps bdev and sector and calls direct_access function of its physical target device. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-20dm error: add DAX supportMike Snitzer2-2/+10
Allow the error target to replace an existing DAX-enabled target. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-20dm linear: add DAX supportToshi Kani1-1/+20
Change dm-linear to implement direct_access function, linear_direct_access(), which maps sector and calls direct_access function of its physical target device. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-20dm: add infrastructure for DAX supportToshi Kani3-3/+80
Change mapped device to implement direct_access function, dm_blk_direct_access(), which calls a target direct_access function. 'struct target_type' is extended to have target direct_access interface. This function limits direct accessible size to the dm_target's limit with max_io_len(). Add dm_table_supports_dax() to iterate all targets and associated block devices to check for DAX support. To add DAX support to a DM target the target must only implement the direct_access function. Add a new dm type, DM_TYPE_DAX_BIO_BASED, which indicates that mapped device supports DAX and is bio based. This new type is used to assure that all target devices have DAX support and remain that way after QUEUE_FLAG_DAX is set in mapped device. At initial table load, QUEUE_FLAG_DAX is set to mapped device when setting DM_TYPE_DAX_BIO_BASED to the type. Any subsequent table load to the mapped device must have the same type, or else it fails per the check in table_load(). Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-20block: simplify and cleanup bvec pool handlingChristoph Hellwig1-1/+0
Instead of a flag and an index just make sure an index of 0 means no need to free the bvec array. Also move the constants related to the bvec pools together and use a consistent naming scheme for them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-20block: get rid of bio_rw and READAChristoph Hellwig6-19/+23
These two are confusing leftover of the old world order, combining values of the REQ_OP_ and REQ_ namespaces. For callers that don't special case we mostly just replace bi_rw with bio_data_dir or op_is_write, except for the few cases where a switch over the REQ_OP_ values makes more sense. Any check for READA is replaced with an explicit check for REQ_RAHEAD. Also remove the READA alias for REQ_RAHEAD. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-20dm thin: fix a race condition between discarding and provisioning a blockJoe Thornber3-11/+124
The discard passdown was being issued after the block was unmapped, which meant the block could be reprovisioned whilst the passdown discard was still in flight. We can only identify unshared blocks (safe to do a passdown a discard to) once they're unmapped and their ref count hits zero. Block ref counts are now used to guard against concurrent allocation of these blocks that are being discarded. So now we unmap the block, issue passdown discards, and the immediately increment ref counts for regions that have been discarded via passed down (this is safe because allocation occurs within the same thread). We then decrement ref counts once the passdown discard IO is complete -- signaling these blocks may now be allocated. This fixes the potential for corruption that was reported here: https://www.redhat.com/archives/dm-devel/2016-June/msg00311.html Reported-by: Dennis Yang <dennisyang@qnap.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-20dm btree: fix a bug in dm_btree_find_next_single()Joe Thornber1-1/+8
dm_btree_find_next_single() can short-circuit the search for a block with a return of -ENODATA if all entries are higher than the search key passed to lower_bound(). This hasn't been a problem because of the way the btree has been used by DM thinp. But it must be fixed now in preparation for fixing the race in DM thinp's handling of simultaneous block discard vs allocation. Otherwise, once that fix is in place, some of the blocks in a discard would not be unmapped as expected. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-19dm raid: fix random optimal_io_size for raid0Heinz Mauelshagen1-4/+3
raid_io_hints() was retrieving the number of data stripes used for the calculation of io_opt from struct r5conf, which is not defined for raid0 mappings. Base the calculation on the in-core raid_set structure instead. Also, adjust to use to_bytes() for the sector -> bytes conversion throughout. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-19dm raid: address checkpatch.pl complaintsHeinz Mauelshagen1-21/+21
Use 'unsigned int' where appropriate. Return negative errors. Correct an indentation. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm: call PR reserve/unreserve on each underlying deviceChristoph Hellwig1-15/+65
So far we tried to rely on the SCSI 'all target ports' bit to register all path, but for many setups this didn't work properly as the different paths are seen as separate initiators to the target instead of multiple ports of the same initiator. Because of that we'll stop setting the 'all target ports' bit in SCSI, and let device mapper handle iterating over the device for each path and register them manually. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm: fix second blk_delay_queue() parameter to be in msec units not jiffiesTahsin Erdogan1-1/+1
Commit d548b34b062 ("dm: reduce the queue delay used in dm_request_fn from 100ms to 10ms") always intended the value to be 10 msecs -- it just expressed it in jiffies because earlier commit 7eaceaccab ("block: remove per-queue plugging") did. Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Fixes: d548b34b062 ("dm: reduce the queue delay used in dm_request_fn from 100ms to 10ms") Cc: stable@vger.kernel.org # 4.1+ -- stable@ backports must be applied to drivers/md/dm.c
2016-07-18dm raid: change logical functions to actually return boolHeinz Mauelshagen1-15/+14
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: use rdev_for_each in statusHeinz Mauelshagen1-2/+2
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: use rs->raid_disks to avoid memory leaks on freeHeinz Mauelshagen1-6/+5
Also makes code more consistent throughout. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: support delta_disks for raid1, fix table outputHeinz Mauelshagen1-43/+49
Add "delta_disks" constructor argument support to raid1 to allow for consistent userspace disk addition/removal handling. Fix raid_status() to report all raid disks with status and table output on disk adding reshapes, not just the ones listed on the mddev; optimize its rebuild and writemostly output. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: enhance reshape check and factor out reshape setupHeinz Mauelshagen1-61/+106
Enhance rs_reshape_requested() check function to be more transparent and fix its raid10 check. Streamline the constructor by factoring out reshaping preparation into fucntion rs_prepare_reshape(). Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: allow resize during recoveryHeinz Mauelshagen1-15/+19
Resizing a RAID set during recovery can be allowed, because the MD resynchronization thread will either stop any ongoing recovery in case of shrinking below the current recovery position or carry on recovery to the new size if the set is growing. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: fix rs_is_recovering() to allow for lvextendHeinz Mauelshagen1-2/+2
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: fix rebuild and catch bogus sync/resync flagsHeinz Mauelshagen1-3/+16
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: fix ctr memory leaks on error pathsHeinz Mauelshagen1-10/+15
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: fix typo in write_mostly flagHeinz Mauelshagen1-1/+1
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: also reject size change during recoveryHeinz Mauelshagen1-3/+3
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: fix new superblock/bitmap creation on disk additionHeinz Mauelshagen1-1/+2
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: add comments and fix typosHeinz Mauelshagen1-7/+13
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: fix raid10 device size error on out-of-place reshapeHeinz Mauelshagen1-0/+8
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: prohibit 'nosync' on new raid6 and reject resize during reshapeHeinz Mauelshagen1-3/+15
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: clarify and fix recoveryHeinz Mauelshagen1-9/+55
Add function rs_setup_recovery() to allow for defined setup of RAID set recovery in the constructor. Will be called with dev_sectors={0, rdev->sectors, MaxSectors} to recover a new or enforced sync, grown or not to be synhronized RAID set respectively. Prevents recovery on raid0, which doesn't support it. Enforces recovery on raid6 to ensure properly defined Syndromes mandatory for that MD personality are being created. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: fix rs_set_capacity on growing reshapeHeinz Mauelshagen1-6/+3
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: make rs_set_capacity to work on shrinking reshapeHeinz Mauelshagen1-4/+4
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: enhance comments in takeover checksHeinz Mauelshagen1-2/+2
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: remove bogus comment and fix comment typosHeinz Mauelshagen1-4/+2
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: more restricting data_offset value checksHeinz Mauelshagen1-1/+2
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: reject too many write_mostly devicesHeinz Mauelshagen1-1/+7
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: the sync_page_io() metadata_op argument is boolHeinz Mauelshagen1-3/+3
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: prohibit to pass in both sync and nosync ctr flagsHeinz Mauelshagen1-0/+6
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-18dm raid: avoid superfluous memory barriers on static metadataHeinz Mauelshagen1-5/+0
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-06dm rq: check kthread_run return for .request_fn request-based DMMike Snitzer1-0/+2
Check return value of kthread_run() in dm_old_init_request_queue(). Reported-by: Minfei Huang <mnghuan@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-05bcache: Remove redundant block_size assignmentYijing Wang1-1/+0
We have assigned sb->block_size before the switch, so remove the redundant one. Reviewed-by: Coly Li <colyli@suse.de> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Acked-by: Eric Wheeler <bcache@lists.ewheeler.net> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05bcache: update document infoYijing Wang2-2/+3
There is no return in continue_at(), update the documentation. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05bcache: Remove redundant parameter for cache_alloc()Yijing Wang1-2/+2
Cache_sb is not used in cache_alloc, and we have copied sb info to cache->sb already, remove it. Reviewed-by: Coly Li <colyli@suse.de> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-01dm verity fec: fix block calculationSami Tolvanen1-3/+1
do_div was replaced with div64_u64 at some point, causing a bug with block calculation due to incompatible semantics of the two functions. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Fixes: a739ff3f543a ("dm verity: add support for forward error correction") Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-01dm ioctl: Simplify parameter buffer management codeBart Van Assche1-12/+6
Merge the two DM_PARAMS_[KV]MALLOC flags into a single flag. Doing so avoids the crashes seen with previous attempts to consolidate buffer management to use kvfree() without first flagging that memory had actually been allocated. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-07-01dm crypt: Fix sparse complaintsBart Van Assche1-2/+2
Avoid that sparse complains about assigning a __le64 value to a u64 variable. Remove the (u64) casts since these are superfluous. This patch does not change the behavior of the source code. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-16dm raid: don't use 'const' in function returnArnd Bergmann1-1/+1
A newly introduced function has 'const int' as the return type, but as "make W=1" reports, that has no meaning: drivers/md/dm-raid.c:510:18: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers] This changes the return type to plain 'int'. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 33e53f06850f ("dm raid: introduce extended superblock and new raid types to support takeover/reshaping") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: fix failed takeover/reshapes by keeping raid set frozenHeinz Mauelshagen1-29/+56
Superblock updates where bogus causing some takovers/reshapes to fail. Introduce new runtime flag (RT_FLAG_KEEP_RS_FROZEN) to keep a raid set frozen when a layout change was requested. Userpace will immediately reload the table w/o the flags requesting such change once they made it to the superblocks and any change of recovery/reshape offsets has to be avoided until after read. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: support to change bitmap region sizeHeinz Mauelshagen1-0/+11
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>