btrfs: ensure relocation never runs while we have send operations running

Relocation and send do not play well together because while send is
running a block group can be relocated, a transaction committed and
the respective disk extents get re-allocated and written to or discarded
while send is about to do something with the extents.

This was explained in commit 9e967495e0e0ae ("Btrfs: prevent send failures
and crashes due to concurrent relocation"), which prevented balance and
send from running in parallel but it did not address one remaining case
where chunk relocation can happen: shrinking a device (and device deletion
which shrinks a device's size to 0 before deleting the device).

We also have now one more case where relocation is triggered: on zoned
filesystems partially used block groups get relocated by a background
thread, introduced in commit 18bb8bbf13c183 ("btrfs: zoned: automatically
reclaim zones").

So make sure that instead of preventing balance from running when there
are ongoing send operations, we prevent relocation from happening.
This uses the infrastructure recently added by a patch that has the
subject: "btrfs: add cancellable chunk relocation support".

Also it adds a spinlock used exclusively for the exclusivity between
send and relocation, as before fs_info->balance_mutex was used, which
would make an attempt to run send to block waiting for balance to
finish, which can take a lot of time on large filesystems.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
author: Filipe Manana <fdmanana@suse.com> 2021-06-21 11:10:38 +0100
committer: David Sterba <dsterba@suse.com> 2021-06-22 14:11:58 +0200
commit: 1cea5cf0e664290cc917da9a2c1f8df3716891cd (patch)
tree: fa5549809528f500b2971565e3b5ec9efe8f48e9 /fs/btrfs/relocation.c
parent: cbeaae4f6f6e787b7dac6230a31d9ad93d594f95 (diff)
1 files changed, 13 insertions, 0 deletions
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 420a89869889..fc831597cb22 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3789,14 +3789,25 @@ out:
  *   0             success
  *   -EINPROGRESS  operation is already in progress, that's probably a bug
  *   -ECANCELED    cancellation request was set before the operation started
+ *   -EAGAIN       can not start because there are ongoing send operations
  */
 static int reloc_chunk_start(struct btrfs_fs_info *fs_info)
 {
+	spin_lock(&fs_info->send_reloc_lock);
+	if (fs_info->send_in_progress) {
+		btrfs_warn_rl(fs_info,
+"cannot run relocation while send operations are in progress (%d in progress)",
+			      fs_info->send_in_progress);
+		spin_unlock(&fs_info->send_reloc_lock);
+		return -EAGAIN;
+	}
 	if (test_and_set_bit(BTRFS_FS_RELOC_RUNNING, &fs_info->flags)) {
 		/* This should not happen */
+		spin_unlock(&fs_info->send_reloc_lock);
 		btrfs_err(fs_info, "reloc already running, cannot start");
 		return -EINPROGRESS;
 	}
+	spin_unlock(&fs_info->send_reloc_lock);
 
 	if (atomic_read(&fs_info->reloc_cancel_req) > 0) {
 		btrfs_info(fs_info, "chunk relocation canceled on start");
@@ -3818,7 +3829,9 @@ static void reloc_chunk_end(struct btrfs_fs_info *fs_info)
 	/* Requested after start, clear bit first so any waiters can continue */
 	if (atomic_read(&fs_info->reloc_cancel_req) > 0)
 		btrfs_info(fs_info, "chunk relocation canceled during operation");
+	spin_lock(&fs_info->send_reloc_lock);
 	clear_and_wake_up_bit(BTRFS_FS_RELOC_RUNNING, &fs_info->flags);
+	spin_unlock(&fs_info->send_reloc_lock);
 	atomic_set(&fs_info->reloc_cancel_req, 0);
 }
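
The exclusion rule this patch enforces can be modelled in userspace. The following is a minimal sketch, not the kernel code: struct excl_state, excl_init() and the model_* functions are hypothetical names, a pthread mutex stands in for fs_info->send_reloc_lock, and a plain bool stands in for the BTRFS_FS_RELOC_RUNNING bit. The send side does not appear in this diff (which is limited to fs/btrfs/relocation.c), so model_send_start() and model_send_end() are an assumption about how the counterpart might look.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the btrfs_fs_info fields involved. */
struct excl_state {
	pthread_mutex_t lock;	/* models fs_info->send_reloc_lock */
	int send_in_progress;	/* models fs_info->send_in_progress */
	bool reloc_running;	/* models the BTRFS_FS_RELOC_RUNNING bit */
};

static void excl_init(struct excl_state *st)
{
	pthread_mutex_init(&st->lock, NULL);
	st->send_in_progress = 0;
	st->reloc_running = false;
}

/* Mirrors the checks added to reloc_chunk_start() in the diff above. */
static int model_reloc_start(struct excl_state *st)
{
	int ret = 0;

	pthread_mutex_lock(&st->lock);
	if (st->send_in_progress)
		ret = -EAGAIN;		/* ongoing send operations */
	else if (st->reloc_running)
		ret = -EINPROGRESS;	/* relocation already running */
	else
		st->reloc_running = true;
	pthread_mutex_unlock(&st->lock);
	return ret;
}

/* Mirrors reloc_chunk_end(): clear the flag under the same lock. */
static void model_reloc_end(struct excl_state *st)
{
	pthread_mutex_lock(&st->lock);
	assert(st->reloc_running);
	st->reloc_running = false;
	pthread_mutex_unlock(&st->lock);
}

/* Assumed send side: refuse to start while relocation is running. */
static int model_send_start(struct excl_state *st)
{
	int ret = 0;

	pthread_mutex_lock(&st->lock);
	if (st->reloc_running)
		ret = -EAGAIN;
	else
		st->send_in_progress++;
	pthread_mutex_unlock(&st->lock);
	return ret;
}

/* Assumed send side: drop one send from the in-progress count. */
static void model_send_end(struct excl_state *st)
{
	pthread_mutex_lock(&st->lock);
	assert(st->send_in_progress > 0);
	st->send_in_progress--;
	pthread_mutex_unlock(&st->lock);
}
```

In this model neither side ever waits for the other: whichever operation finds the other one active fails immediately with an error, which reflects the commit message's point that send should not block waiting for a long-running balance.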