summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2017-08-30drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_setNeilBrown1-3/+2
Careful analysis shows that this flag is not needed. The RESCUER flag is only needed when a make_request_fn might: - allocate a bio from the bioset - submit it with generic_make_request() or similar - allocate another bio from the bioset The second allocation can block until the first bio is processed, so a rescuer is needed to ensure the first bio does get processed. With a rescuer it will only get processed when the make_request_fn completes. In drbd, allocations from drbd_io_bio_set happen from drbd_new_req() or w_restart_disk_io() which is only called to handle RESTART_FROZEN_DISK_IO. In former is called precisely once from the make_request_fn. The later is never called by within the make_request_fn. So there cannot be two allocations in the same call to the make_request_fn, so a rescuer is not needed. Allocations from drbd_md_io_bio_set are used for IO to the bitmap and the activity log. There are only accessed from worker threads and workqueues, never directly from make_request_fn. Again, the rescuer isn't needed. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-30drbd: Fix allyesconfig build, fix recent commitPhilipp Reisner1-1/+1
Globals where prefixed with drbd_, that was missed in the in #ifdef'nd code when it is built-in. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Fixes: 183ece30053f ("drbd: move global variables to drbd namespace and make some static") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: switch from kmalloc() to kmalloc_array()Roland Kammerer2-2/+2
We had one call to kmalloc that actually allocates an array. Switch that one to the kmalloc_array() function. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: abort drbd_start_resync if there is no connectionRoland Kammerer1-0/+5
This was found by a static analysis tool. While highly unlikely, be sure to return without dereferencing the NULL pointer. Reported-by: Shaobo <shaobo@cs.utah.edu> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: move global variables to drbd namespace and make some staticRoland Kammerer4-49/+43
This is a follow-up to Gregs complaints that drbd clutteres the global namespace. Some of DRBD's module parameters are only used within one compilation unit. Make these static. Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: rename "usermode_helper" to "drbd_usermode_helper"Greg Kroah-Hartman3-13/+13
Nothing like having a very generic global variable in a tiny driver subsystem to make a mess of the global namespace... Note, there are many other "generic" named global variables in the drbd subsystem, someone should fix those up one day before they hit a linking error. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: fix race between handshake and admin disconnect/downLars Ellenberg1-0/+3
conn_try_disconnect() could potentialy hit the BUG_ON() in _conn_set_state() where it iterates over _drbd_set_state() and "asserts" via BUG_ON() that the latter was successful. If the STATE_SENT bit was not yet visible to conn_is_valid_transition() early in _conn_request_state(), but became visible before conn_set_state() later in that call path, we could hit the BUG_ON() after _drbd_set_state(), because it returned SS_IN_TRANSIENT_STATE. To avoid that race, we better protect set_bit(SENT_STATE) with the spinlock. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: fix potential deadlock when trying to detach during handshakeLars Ellenberg3-23/+56
When requesting a detach, we first suspend IO, and also inhibit meta-data IO by means of drbd_md_get_buffer(), because we don't want to "fail" the disk while there is IO in-flight: the transition into D_FAILED for detach purposes may get misinterpreted as actual IO error in a confused endio function. We wrap it all into wait_event(), to retry in case the drbd_req_state() returns SS_IN_TRANSIENT_STATE, as it does for example during an ongoing connection handshake. In that example, the receiver thread may need to grab drbd_md_get_buffer() during the handshake to make progress. To avoid potential deadlock with detach, detach needs to grab and release the meta data buffer inside of that wait_event retry loop. To avoid lock inversion between mutex_lock(&device->state_mutex) and drbd_md_get_buffer(device), introduce a new enum chg_state_flag CS_INHIBIT_MD_IO, and move the call to drbd_md_get_buffer() inside the state_mutex grabbed in drbd_req_state(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: A single dot should be put into a sequence.Markus Elfring1-1/+1
Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: fix rmmod cleanup, remove _all_ debugfs entriesLars Ellenberg1-1/+2
If there are still resources defined, but "empty", no more volumes or connections configured, they don't hold module reference counts, so rmmod is possible. To avoid DRBD leftovers in debugfs, we need to call our global drbd_debugfs_cleanup() only after all resources have been cleaned up. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: Use setup_timer() instead of init_timer() to simplify the code.Geliang Tang1-12/+8
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: fix potential get_ldev/put_ldev refcount imbalance during attachLars Ellenberg1-3/+5
Race: drbd_adm_attach() | async drbd_md_endio() | device->ldev is still NULL. | | drbd_md_read( | .endio = drbd_md_endio; | submit; | .... | wait for done == 1; | done = 1; ); | wake_up(); .. lot of other stuff, | .. includeing taking and | ...giving up locks, | .. doing further IO, | .. stuff that takes "some time" | | while in this context, | this is the next statement. | which means this context was scheduled .. only then, finally, | away for "some time". device->ldev = nbc; | | if (device->ldev) | put_ldev() Unlikely, but possible. I was able to provoke it "reliably" by adding an mdelay(500); after the wake_up(). Fixed by moving the if (!NULL) put_ldev() before done = 1; Impact of the bug was that the resulting refcount imbalance could lead to premature destruction of the object, potentially causing a NULL pointer dereference during a subsequent detach. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: new disk-option disable-write-sameLars Ellenberg3-5/+21
Some backend devices claim to support write-same, but would fail actual write-same requests. Allow to set (or toggle) whether or not DRBD tries to support write-same. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: Fix resource role for newly created resources in events2Philipp Reisner1-1/+1
The conn_higest_role() (a terribly misnamed function) returns the role of the resource. It returned R_UNKNOWN as long as the resource has not a single device. Resources without devices are short living objects. But it matters for the NOTIFY_CREATE netwlink message. It makes a lot more sense to report R_SECONDARY for the newly created resource than R_UNKNOWN. I reviewd all call sites of conn_highest_role(), that change does not matter for the other call sites. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: mark symbols static where possibleBaoyou Xie3-3/+6
We get a few warnings when building kernel with W=1: drbd/drbd_receiver.c:1224:6: warning: no previous prototype for 'one_flush_endio' [-Wmissing-prototypes] drbd/drbd_req.c:1450:6: warning: no previous prototype for 'send_and_submit_pending' [-Wmissing-prototypes] drbd/drbd_main.c:924:6: warning: no previous prototype for 'assign_p_sizes_qlim' [-Wmissing-prototypes] .... In fact, these functions are only used in the file in which they are declared and don't need a declaration, but can be made static. So this patch marks these functions with 'static'. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: Send P_NEG_ACK upon write error in protocol != CLars Ellenberg1-0/+8
In protocol != C, we forgot to send the P_NEG_ACK for failing writes. Once we no longer submit to local disk, because we already "detached", due to the typical "on-io-error detach;" config setting, we already send the neg acks right away. Only those requests that have been submitted, and have been error-completed by the local disk, would forget to send the neg-ack, and only in asynchronous replication (protocol != C). Unless this happened during resync, where we already always send acks, regardless of protocol. The primary side needs the P_NEG_ACK in order to mark the affected block(s) for resync in its out-of-sync bitmap. If the blocks in question are not re-written again, we may miss to resync them later, causing data inconsistencies. This patch will always send the neg-acks, and also at least try to persist the out-of-sync status on the local node already. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: add explicit plugging when submitting batchesLars Ellenberg1-6/+15
When submitting batches of requests which had been queued on the submitter thread, typically because they needed to wait for an activity log transactions, use explicit plugging to help potential merging of requests in the backend io-scheduler. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: change list_for_each_safe to while(list_first_entry_or_null)Lars Ellenberg1-4/+4
Two instances of list_for_each_safe can drop their tmp element, they really just peel off each element in turn from the start of the list. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29drbd: introduce drbd_recv_header_maybe_unplugLars Ellenberg6-9/+139
Recently, drbd_recv_header() was changed to potentially implicitly "unplug" the backend device(s), in case there is currently nothing to receive. Be more explicit about it: re-introduce the original drbd_recv_header(), and introduce a new drbd_recv_header_maybe_unplug() for use by the receiver "main loop". Using explicit plugging via blk_start_plug(); blk_finish_plug(); really helps the io-scheduler of the backend with merging requests. Wrap the receiver "main loop" with such a plug. Also catch unplug events on the Primary, and try to propagate. This is performance relevant. Without this, if the receiving side does not merge requests, number of IOPS on the peer can me significantly higher than IOPS on the Primary, and can easily become the bottleneck. Together, both changes should help to reduce the number of IOPS as seen on the backend of the receiving side, by increasing the chance of merging mergable requests, without trading latency for more throughput. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29bsg: remove #if 0'ed codeChristoph Hellwig1-7/+0
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29mq-deadline: Enable auto-loading when built as moduleBen Hutchings1-0/+1
The block core requests modules with the "-iosched" name suffix, but mq-deadline does not have that suffix. Add an alias. Fixes: 945ffb60c11d ("mq-deadline: add blk-mq adaptation of the deadline ...") Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29bfq: Re-enable auto-loading when built as a moduleBen Hutchings1-0/+1
The block core requests modules with the "-iosched" name suffix, but bfq no longer has that suffix. Add an alias. Fixes: ea25da48086d ("block, bfq: split bfq-iosched.c into multiple ...") Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29block: Make blk_dequeue_request() staticDamien Le Moal2-5/+1
The only caller of this function is blk_start_request() in the same file. Fix blk_start_request() description accordingly. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29skd: Let the block layer core choose .nr_requestsBart Van Assche1-1/+0
Since blk_mq_init_queue() initializes .nr_requests to the tag set size and since that value is a good default for the skd driver, do not overwrite the value set by blk_mq_init_queue(). This change doubles the default value of .nr_requests. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29skd: Remove blk_queue_bounce_limit() callBart Van Assche1-1/+0
Since sTec s1120 devices support 64-bit DMA it is not necessary to request data buffer bouncing. Hence remove the blk_queue_bounce_limit() call. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-28nbd: make device_attribute constBhumika Goyal1-1/+1
Make this const as is is only passed as an argument to the function device_create_file and device_remove_file and the corresponding arguments are of type const. Done using Coccinelle Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-28null_blk: use available 'dev' in nullb_device_power_store()Jens Axboe1-1/+1
We already have this pointer, no need to use to_nullb_device() again. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-28block/nullb: delete unnecessary memory freeShaohua Li1-1/+0
Commit 2984c86(nullb: factor disk parameters) has a typo. The nullb_device allocation/free is done outside of null_add_dev. The commit accidentally frees the nullb_device in error code path. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-28block: fix warning when I/O elevator is changed as request_queue is being ↵David Jeffery2-0/+6
removed There is a race between changing I/O elevator and request_queue removal which can trigger the warning in kobject_add_internal. A program can use sysfs to request a change of elevator at the same time another task is unregistering the request_queue the elevator would be attached to. The elevator's kobject will then attempt to be connected to the request_queue in the object tree when the request_queue has just been removed from sysfs. This triggers the warning in kobject_add_internal as the request_queue no longer has a sysfs directory: kobject_add_internal failed for iosched (error: -2 parent: queue) ------------[ cut here ]------------ WARNING: CPU: 3 PID: 14075 at lib/kobject.c:244 kobject_add_internal+0x103/0x2d0 To fix this warning, we can check the QUEUE_FLAG_REGISTERED flag when changing the elevator and use the request_queue's sysfs_lock to serialize between clearing the flag and the elevator testing the flag. Signed-off-by: David Jeffery <djeffery@redhat.com> Tested-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-28block, scheduler: convert xxx_var_store to voidweiping zhang4-33/+31
The last parameter "count" never be used in xxx_var_store, convert these functions to void. Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25skd: Remove SKD_ID_INCRBart Van Assche1-5/+0
The SKD_ID_INCR flag in skd_request_context.id duplicates information that is already available otherwise, e.g. through the block layer request state and through skd_request_context.state. Hence remove the code that manipulates this flag and also the flag itself. Since skd_isr_completion_posted() only uses the lower bits of skd_request_context.id as hardware tag, this patch does not change the behavior of the skd driver. I'm referring to the following code: tag = req_id & SKD_ID_SLOT_AND_TABLE_MASK; Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25skd: Make it easier for static analyzers to analyze skd_free_disk()Bart Van Assche1-1/+2
Although it is easy to see that skdev->disk != NULL if skdev->queue != NULL, add a test for skdev->disk to avoid that smatch reports the following warning: drivers/block/skd_main.c:3080 skd_free_disk() error: we previously assumed 'disk' could be null (see line 3074) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25skd: Inline skd_end_request()Bart Van Assche1-32/+13
It is not worth to keep the debug statements in skd_end_request(). Without debug statements that function only consists of two statements. Hence inline skd_end_request(). Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25skd: Rename skd_softirq_done() into skd_complete_rq()Bart Van Assche1-2/+2
The latter name follows more closely the function names used in other blk-mq drivers. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25block/nullb: fix NULL dereferenceShaohua Li1-1/+1
Dan reported this: The patch 2984c8684f96: "nullb: factor disk parameters" from Aug 14, 2017, leads to the following Smatch complaint: drivers/block/null_blk.c:1759 null_init_tag_set() error: we previously assumed 'nullb' could be null (see line 1750) 1755 set->cmd_size = sizeof(struct nullb_cmd); 1756 set->flags = BLK_MQ_F_SHOULD_MERGE; 1757 set->driver_data = NULL; 1758 1759 if (nullb->dev->blocking) ^^^^^^^^^^^^^^^^^^^^ And an unchecked dereference. nullb could be NULL here. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25blkcg: avoid free blkcg_root when failed to alloc blkcg policyweiping zhang1-3/+5
this patch fix two errors, firstly avoid kfree blk_root, secondly not free(blkcg) ,if blkcg alloc fail(blkcg == NULL), just unlock that mutex; Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25null_blk: update email adressJens Axboe1-1/+1
Update to a working one, the fusionio address hasn't been valid in 4 years. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25block: update comments to reflect REQ_FLUSH -> REQ_PREFLUSH renameOmar Sandoval1-12/+12
Normally I wouldn't bother with this, but in my opinion the comments are the most important part of this whole file since without them no one would have any clue how this insanity works. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-24compat_hdio_ioctl: Fix a declarationBart Van Assche1-1/+1
This patch avoids that sparse reports the following warning messages: block/compat_ioctl.c:85:11: warning: incorrect type in assignment (different address spaces) block/compat_ioctl.c:85:11: expected unsigned long *[noderef] <asn:1>p block/compat_ioctl.c:85:11: got void [noderef] <asn:1>* block/compat_ioctl.c:91:21: warning: incorrect type in argument 1 (different address spaces) block/compat_ioctl.c:91:21: expected void const volatile [noderef] <asn:1>*<noident> block/compat_ioctl.c:91:21: got unsigned long *[noderef] <asn:1>p block/compat_ioctl.c:87:53: warning: dereference of noderef expression block/compat_ioctl.c:91:21: warning: dereference of noderef expression Fixes: commit d597580d3737 ("generic ...copy_..._user primitives") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-24block: remove blk_free_devt in add_partitionweiping zhang1-1/+0
put_device(pdev) will call pdev->type->release finally, and blk_free_devt has been called in part_release(), so remove it. Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-24bio-integrity: Fix regression if profile verify_fn is NULLMilan Broz1-1/+4
In dm-integrity target we register integrity profile that have both generate_fn and verify_fn callbacks set to NULL. This is used if dm-integrity is stacked under a dm-crypt device for authenticated encryption (integrity payload contains authentication tag and IV seed). In this case the verification is done through own crypto API processing inside dm-crypt; integrity profile is only holder of these data. (And memory is owned by dm-crypt as well.) After the commit (and previous changes) Commit 7c20f11680a441df09de7235206f70115fbf6290 Author: Christoph Hellwig <hch@lst.de> Date: Mon Jul 3 16:58:43 2017 -0600 bio-integrity: stop abusing bi_end_io we get this crash: : BUG: unable to handle kernel NULL pointer dereference at (null) : IP: (null) : *pde = 00000000 ... : : Workqueue: kintegrityd bio_integrity_verify_fn : task: f48ae180 task.stack: f4b5c000 : EIP: (null) : EFLAGS: 00210286 CPU: 0 : EAX: f4b5debc EBX: 00001000 ECX: 00000001 EDX: 00000000 : ESI: 00001000 EDI: ed25f000 EBP: f4b5dee8 ESP: f4b5dea4 : DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 : CR0: 80050033 CR2: 00000000 CR3: 32823000 CR4: 001406d0 : Call Trace: : ? bio_integrity_process+0xe3/0x1e0 : bio_integrity_verify_fn+0xea/0x150 : process_one_work+0x1c7/0x5c0 : worker_thread+0x39/0x380 : kthread+0xd6/0x110 : ? process_one_work+0x5c0/0x5c0 : ? kthread_worker_fn+0x100/0x100 : ? kthread_worker_fn+0x100/0x100 : ret_from_fork+0x19/0x24 : Code: Bad EIP value. : EIP: (null) SS:ESP: 0068:f4b5dea4 : CR2: 0000000000000000 Patch just skip the whole verify workqueue if verify_fn is set to NULL. Fixes: 7c20f116 ("bio-integrity: stop abusing bi_end_io") Signed-off-by: Milan Broz <gmazyland@gmail.com> [hch: trivial whitespace fix] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23block, bfq: fix error handle in bfq_initweiping zhang1-1/+3
if elv_register fail, bfq_pool should be free. Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23block: replace bi_bdev with a gendisk pointer and partitions indexChristoph Hellwig99-357/+358
This way we don't need a block_device structure to submit I/O. The block_device has different life time rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code). For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request. Note that all the block drivers generally want request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23block: cache the partition index in struct block_deviceChristoph Hellwig2-0/+2
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23block: add a __disk_get_part helperChristoph Hellwig2-13/+15
This helper allows looking up a partion under RCU protection without grabbing a reference to it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23block: reject attempts to allocate more than DISK_MAX_PARTS partitionsChristoph Hellwig1-0/+7
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23raid5: remove a call to get_start_sectChristoph Hellwig1-1/+3
The block layer always remaps partitions before calling into the ->make_request methods of drivers. Thus the call to get_start_sect in in_chunk_boundary will always return 0 and can be removed. Reviewed-by: Shaohua Li <shli@fb.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23btrfs: index check-integrity state hash by a dev_tChristoph Hellwig1-18/+13
We won't have the struct block_device available in the bio soon, so switch to the numerical dev_t instead of the block_device pointer for looking up the check-integrity state. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23skd: Change default interrupt mode to MSI-XBart Van Assche1-1/+1
Since MSI support on some motherboards is unreliable, change the default interrupt mode from MSI to MSI-X. This patch avoids that the following message appears sporadially in the kernel logs of my test setup: do_IRQ: 3.193 No irq handler for vector Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23skd: Avoid double completions in case of a timeoutBart Van Assche1-11/+11
Avoid that normal request completion and the timeout handler can run concurrently by calling blk_mq_complete_request() instead of blk_mq_end_request() from skd_end_request(). Avoid that the block layer can reuse a request while the firmware is still processing it. Convert skd_softirq_done() to blk-mq. Pass the pointer to skd_softirq_done() to the block layer core through blk_mq_ops.complete instead of by calling blk_queue_softirq_done(). Pass the pointer to skd_timed_out() to the block layer core through blk_mq_ops.timeout instead of by calling blk_queue_timed_out(). The timeout handler has been tested as follows: echo 1 > /sys/block/skd0/io-timeout-fail && (cd /sys/kernel/debug/fail_io_timeout && echo 100 > probability && echo N > task-filter && echo 1 > times) Fixes: commit a74d5b76fab9 ("skd: Switch to block layer timeout mechanism") Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>