author | Mike Snitzer <snitzer@redhat.com> | 2020-05-27 16:32:51 -0400 |
---|---|---|
committer | Mike Snitzer <snitzer@redhat.com> | 2020-06-05 14:59:56 -0400 |
commit | 553ec94cb4b4937b48f81e27de33f71325d1a227 (patch) | |
tree | a344fd73d12ab0e6fb78fc270832163d314a76ca | |
parent | a862e4e2154289fc12aa9e70f33614d9c70f3be4 (diff) | |
download | linux-553ec94cb4b4937b48f81e27de33f71325d1a227.tar.bz2 |
dm mpath: restrict queue_if_no_path state machine
Do not allow saving a disabled queue_if_no_path if it is already saved as
enabled; doing so implies multiple suspends (which shouldn't ever happen).
Log if this unlikely scenario is ever triggered.
Also, only write MPATHF_SAVED_QUEUE_IF_NO_PATH during presuspend or on a
"fail_if_no_path" message. MPATHF_SAVED_QUEUE_IF_NO_PATH is no longer
always modified, e.g.: even if queue_if_no_path()'s save_old_value
argument wasn't set. This just implies a bit tighter control over
the management of MPATHF_SAVED_QUEUE_IF_NO_PATH. Side-effect is
multipath_resume() doesn't reset MPATHF_QUEUE_IF_NO_PATH unless
MPATHF_SAVED_QUEUE_IF_NO_PATH was set (during presuspend); and at that
time the MPATHF_SAVED_QUEUE_IF_NO_PATH bit gets cleared. So
MPATHF_SAVED_QUEUE_IF_NO_PATH's use is much more narrow in scope.
Last, but not least, do _not_ disable queue_if_no_path during noflush
suspend. There is no need/benefit to saving off queue_if_no_path via
MPATHF_SAVED_QUEUE_IF_NO_PATH and clearing MPATHF_QUEUE_IF_NO_PATH for
noflush suspend -- by avoiding this needless queue_if_no_path flag
churn there is less potential for MPATHF_QUEUE_IF_NO_PATH to get lost,
which avoids the potential for IOs to be errored back up to userspace
during DM multipath's handling of path failures.
That said, this last change papers over a reported issue concerning
request-based dm-multipath's interaction with blk-mq, relative to
suspend and resume: multipath_end_io is being called _before_
multipath_resume. This should never happen if DM suspend's
blk_mq_quiesce_queue() + dm_wait_for_completion() is genuinely waiting
for all inflight blk-mq requests to complete. Similarly:
drivers/md/dm.c:__dm_resume() clearly calls dm_table_resume_targets()
_before_ dm_start_queue()'s blk_mq_unquiesce_queue() is called. If
the queue isn't even restarted until after multipath_resume(), the BIG
question that still needs answering is: how can multipath_end_io beat
multipath_resume in a race!?
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-rw-r--r-- | drivers/md/dm-mpath.c | 38 |
1 files changed, 28 insertions, 10 deletions
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 4c34d037aa35..bc846cf7b0d8 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -695,12 +695,25 @@ static int queue_if_no_path(struct multipath *m, bool queue_if_no_path,
 			    bool save_old_value)
 {
 	unsigned long flags;
+	bool queue_if_no_path_bit, saved_queue_if_no_path_bit;
 
 	spin_lock_irqsave(&m->lock, flags);
-	assign_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags,
-		   (save_old_value && test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) ||
-		   (!save_old_value && queue_if_no_path));
+
+	queue_if_no_path_bit = test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags);
+	saved_queue_if_no_path_bit = test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags);
+
+	if (save_old_value) {
+		if (unlikely(!queue_if_no_path_bit && saved_queue_if_no_path_bit)) {
+			DMERR("%s: QIFNP disabled but saved as enabled, saving again loses state, not saving!",
+			      dm_device_name(dm_table_get_md(m->ti->table)));
+		} else
+			assign_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags, queue_if_no_path_bit);
+	} else if (!queue_if_no_path && saved_queue_if_no_path_bit) {
+		/* due to "fail_if_no_path" message, need to honor it. */
+		clear_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags);
+	}
 	assign_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags, queue_if_no_path);
+
 	spin_unlock_irqrestore(&m->lock, flags);
 
 	if (!queue_if_no_path) {
@@ -1653,16 +1666,19 @@ done:
 }
 
 /*
- * Suspend can't complete until all the I/O is processed so if
- * the last path fails we must error any remaining I/O.
- * Note that if the freeze_bdev fails while suspending, the
- * queue_if_no_path state is lost - userspace should reset it.
+ * Suspend with flush can't complete until all the I/O is processed
+ * so if the last path fails we must error any remaining I/O.
+ * - Note that if the freeze_bdev fails while suspending, the
+ *   queue_if_no_path state is lost - userspace should reset it.
+ * Otherwise, during noflush suspend, queue_if_no_path will not change.
  */
 static void multipath_presuspend(struct dm_target *ti)
 {
 	struct multipath *m = ti->private;
 
-	queue_if_no_path(m, false, true);
+	/* FIXME: bio-based shouldn't need to always disable queue_if_no_path */
+	if (m->queue_mode == DM_TYPE_BIO_BASED || !dm_noflush_suspending(m->ti))
+		queue_if_no_path(m, false, true);
 }
 
 static void multipath_postsuspend(struct dm_target *ti)
@@ -1683,8 +1699,10 @@ static void multipath_resume(struct dm_target *ti)
 	unsigned long flags;
 
 	spin_lock_irqsave(&m->lock, flags);
-	assign_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags,
-		   test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags));
+	if (test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) {
+		set_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags);
+		clear_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags);
+	}
 	spin_unlock_irqrestore(&m->lock, flags);
 }
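The flag transitions described in the commit message can be easier to follow outside the kernel. The following is a minimal userspace sketch, not kernel code: `struct mpath_model` and the `model_*` functions are hypothetical names invented for illustration, plain `bool` fields stand in for the MPATHF_QUEUE_IF_NO_PATH and MPATHF_SAVED_QUEUE_IF_NO_PATH bits, and locking, the DMERR device name, and the work queue_if_no_path() does after dropping the lock are all omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the relevant state in struct multipath. */
struct mpath_model {
	bool queue;       /* models MPATHF_QUEUE_IF_NO_PATH */
	bool saved_queue; /* models MPATHF_SAVED_QUEUE_IF_NO_PATH */
	bool bio_based;   /* models m->queue_mode == DM_TYPE_BIO_BASED */
};

/* Models only the flag handling of the patched queue_if_no_path(). */
static void model_queue_if_no_path(struct mpath_model *m, bool queue,
				   bool save_old_value)
{
	assert(m);

	if (save_old_value) {
		if (!m->queue && m->saved_queue) {
			/* Saving "disabled" over "enabled" would lose state. */
			fprintf(stderr, "QIFNP disabled but saved as enabled, not saving\n");
		} else {
			m->saved_queue = m->queue;
		}
	} else if (!queue && m->saved_queue) {
		/* "fail_if_no_path" message: drop the saved value too. */
		m->saved_queue = false;
	}
	m->queue = queue;
}

/* Models multipath_presuspend(): noflush suspend leaves request-based alone. */
static void model_presuspend(struct mpath_model *m, bool noflush)
{
	assert(m);

	if (m->bio_based || !noflush)
		model_queue_if_no_path(m, false, true);
}

/* Models multipath_resume(): restore only if a value was saved, then clear it. */
static void model_resume(struct mpath_model *m)
{
	assert(m);

	if (m->saved_queue) {
		m->queue = true;
		m->saved_queue = false;
	}
}
```

Under these assumptions, a noflush presuspend on a request-based model leaves both fields untouched, a flush presuspend followed by resume restores `queue`, and a "fail_if_no_path" style call (`model_queue_if_no_path(&m, false, false)`) between presuspend and resume leaves `queue` disabled after resume.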