summaryrefslogtreecommitdiffstats
path: root/fs/btrfs
diff options
context:
space:
mode:
authorQu Wenruo <wqu@suse.com>2021-04-06 08:27:18 +0800
committerDavid Sterba <dsterba@suse.com>2021-06-21 15:19:08 +0200
commit266a258678b9f254647f4297843cfbfbddde220a (patch)
treedb635d27ef532030806f5cef10d29ff4362ed66f /fs/btrfs
parente65f152e43484807b4caf7300e70d882e4652566 (diff)
downloadlinux-266a258678b9f254647f4297843cfbfbddde220a.tar.bz2
btrfs: update comments in btrfs_invalidatepage()
The existing comments in btrfs_invalidatepage() don't really get to the point, especially for what Private2 is really representing and how the race avoidance is done. The truth is, there are only three entrances to do ordered extent accounting: - btrfs_writepage_endio_finish_ordered() - __endio_write_update_ordered() Those two entrance are just endio functions for dio and buffered write. - btrfs_invalidatepage() But there is a pitfall, in endio functions there is no check on whether the ordered extent is already accounted. They just blindly clear the Private2 bit and do the accounting. So it's all btrfs_invalidatepage()'s responsibility to make sure we won't do double account for the same sector. That's why in btrfs_invalidatepage() we have to wait for page writeback, this will ensure all submitted bios have finished, thus their endio functions have finished the accounting on the ordered extent. Then we also check page Private2 to ensure that, we only run ordered extent accounting on pages who has no bio submitted. This patch will rework related comments to make it more clear on the race and how we use wait_on_page_writeback() and Private2 to prevent double accounting on ordered extent. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Diffstat (limited to 'fs/btrfs')
-rw-r--r--fs/btrfs/inode.c21
1 files changed, 15 insertions, 6 deletions
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c6243d242bc9..e86a6113e149 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8329,11 +8329,16 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
bool completed_ordered = false;
/*
- * we have the page locked, so new writeback can't start,
- * and the dirty bit won't be cleared while we are here.
+ * We have page locked so no new ordered extent can be created on this
+ * page, nor bio can be submitted for this page.
*
- * Wait for IO on this page so that we can safely clear
- * the PagePrivate2 bit and do ordered accounting
+ * But already submitted bio can still be finished on this page.
+ * Furthermore, endio function won't skip page which has Private2
+ * already cleared, so it's possible for endio and invalidatepage to do
+ * the same ordered extent accounting twice on one page.
+ *
+ * So here we wait for any submitted bios to finish, so that we won't
+ * do double ordered extent accounting on the same page.
*/
wait_on_page_writeback(page);
@@ -8363,8 +8368,12 @@ again:
EXTENT_LOCKED | EXTENT_DO_ACCOUNTING |
EXTENT_DEFRAG, 1, 0, &cached_state);
/*
- * whoever cleared the private bit is responsible
- * for the finish_ordered_io
+ * A page with Private2 bit means no bio has been submitted
+ * covering the page, thus we have to manually do the ordered
+ * extent accounting.
+ *
+ * For page without Private2, the ordered extent accounting is
+ * done in its endio function of the submitted bio.
*/
if (TestClearPagePrivate2(page)) {
spin_lock_irq(&inode->ordered_tree.lock);