summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2013-11-20Btrfs: fix lockdep error in async commitLiu Bo1-2/+2
Lockdep complains about btrfs's async commit: [ 2372.462171] [ BUG: bad unlock balance detected! ] [ 2372.462191] 3.12.0+ #32 Tainted: G W [ 2372.462209] ------------------------------------- [ 2372.462228] ceph-osd/14048 is trying to release lock (sb_internal) at: [ 2372.462275] [<ffffffffa022cb10>] btrfs_commit_transaction_async+0x1b0/0x2a0 [btrfs] [ 2372.462305] but there are no more locks to release! [ 2372.462324] [ 2372.462324] other info that might help us debug this: [ 2372.462349] no locks held by ceph-osd/14048. [ 2372.462367] [ 2372.462367] stack backtrace: [ 2372.462386] CPU: 2 PID: 14048 Comm: ceph-osd Tainted: G W 3.12.0+ #32 [ 2372.462414] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015 11/09/2011 [ 2372.462455] ffffffffa022cb10 ffff88007490fd28 ffffffff816f094a ffff8800378aa320 [ 2372.462491] ffff88007490fd50 ffffffff810adf4c ffff8800378aa320 ffff88009af97650 [ 2372.462526] ffffffffa022cb10 ffff88007490fd88 ffffffff810b01ee ffff8800898c0000 [ 2372.462562] Call Trace: [ 2372.462584] [<ffffffffa022cb10>] ? btrfs_commit_transaction_async+0x1b0/0x2a0 [btrfs] [ 2372.462619] [<ffffffff816f094a>] dump_stack+0x45/0x56 [ 2372.462642] [<ffffffff810adf4c>] print_unlock_imbalance_bug+0xec/0x100 [ 2372.462677] [<ffffffffa022cb10>] ? btrfs_commit_transaction_async+0x1b0/0x2a0 [btrfs] [ 2372.462710] [<ffffffff810b01ee>] lock_release+0x18e/0x210 [ 2372.462742] [<ffffffffa022cb36>] btrfs_commit_transaction_async+0x1d6/0x2a0 [btrfs] [ 2372.462783] [<ffffffffa025a7ce>] btrfs_ioctl_start_sync+0x3e/0xc0 [btrfs] [ 2372.462822] [<ffffffffa025f1d3>] btrfs_ioctl+0x4c3/0x1f70 [btrfs] [ 2372.462849] [<ffffffff812c0321>] ? avc_has_perm+0x121/0x1b0 [ 2372.462873] [<ffffffff812c0224>] ? avc_has_perm+0x24/0x1b0 [ 2372.462897] [<ffffffff8107ecc8>] ? sched_clock_cpu+0xa8/0x100 [ 2372.462922] [<ffffffff8117b145>] do_vfs_ioctl+0x2e5/0x4e0 [ 2372.462946] [<ffffffff812c19e6>] ? file_has_perm+0x86/0xa0 [ 2372.462969] [<ffffffff8117b3c1>] SyS_ioctl+0x81/0xa0 [ 2372.462991] [<ffffffff817045a4>] tracesys+0xdd/0xe2 ==================================================== It's because that we don't do the right thing when checking if it's ok to tell lockdep that we're trying to release the rwsem. If the trans handle's type is TRANS_ATTACH, we won't acquire the freeze rwsem, but as TRANS_ATTACH fits the check (trans < TRANS_JOIN_NOLOCK), we'll release the freeze rwsem, which makes lockdep complains a lot. Reported-by: Ma Jianpeng <majianpeng@gmail.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-20Btrfs: avoid heavy operations in btrfs_commit_superLiu Bo1-20/+1
The 'git blame' history shows that, the old transaction commit code has to do twice to ensure roots are updated and we have to flush metadata and super block manually, however, right now all of these can be handled well inside the transaction commit code without extra efforts. And the error handling part remains same with the current code, -- 'return to caller once we get error'. This saves us a transaction commit and a flush of super block, which are both heavy operations according to ftrace output analysis. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-20Btrfs: fix __btrfs_start_workers retvalIlya Dryomov1-0/+1
__btrfs_start_workers returns 0 in case it raced with btrfs_stop_workers and lost the race. This is wrong because worker in this case is not allowed to start and is in fact destroyed. Return -EINVAL instead. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-20Btrfs: disable online raid-repair on ro mountsIlya Dryomov1-3/+8
This disables the "if needed, write the good copy back before the read is completed" part of the read sequence for read-only mounts. Cc: Jan Schmidt <list.btrfs@jan-o-sch.net> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-20Btrfs: do not inc uncorrectable_errors counter on ro scrubsIlya Dryomov1-2/+4
Currently if we discover an error when scrubbing in ro mode we a) blindly increment the uncorrectable_errors counter, and b) spam the dmesg with the 'unable to fixup (regular) error at ...' message, even though a) we haven't tried to determine if the error is correctable or not, and b) we haven't tried to fixup anything. Fix this. Cc: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-20Btrfs: only drop modified extents if we logged the whole inodeJosef Bacik1-1/+1
If we fsync, seek and write, rename and then fsync again we will lose the modified hole extent because the rename will drop all of the modified extents since we didn't do the fast search. We need to only drop the modified extents if we didn't do the fast search and we were logging the entire inode as we don't need them anymore, otherwise this is being premature. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-20Btrfs: make sure to copy everything if we renameJosef Bacik1-1/+2
If we rename a file that is already in the log and we fsync again we will lose the new name. This is because we just log the inode update and not the new ref. To fix this we just need to check if we are logging the new name of the inode and copy all the metadata instead of just updating the inode itself. With this patch my testcase now passes. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-20Btrfs: don't BUG_ON() if we get an error walking backrefsJosef Bacik1-1/+2
We can just return false for this so we stop doing the snapshot aware defrag stuff. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-15btrfs: get rid of fdentry()Al Viro2-9/+4
3 of 4 callers actually want file_inode()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-15btrfs: fix empty_zero_page misusageChris Mason1-2/+7
Heiko Carstens noticed that btrfs was using empty_zero_page incorrectly. He explained: The definition of empty_zero_page is architecture specific. It is (currently) either a character array, an unsigned long containing the address of the empty_zero_page, or even worse only the address of the struct page belonging to the empty_zero_page. This commit changes btrfs to use a for-loop instead. On x86 the resulting .ko is smaller, and we're no longer worrying about how each arch builds its zeros. Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: rename btrfs_start_all_delalloc_inodesMiao Xie7-9/+7
rename the function -- btrfs_start_all_delalloc_inodes(), and make its name be compatible to btrfs_wait_ordered_roots(), since they are always used at the same place. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: don't wait for the completion of all the ordered extentsMiao Xie8-18/+31
It is very likely that there are lots of ordered extents in the filesytem, if we wait for the completion of all of them when we want to reclaim some space for the metadata space reservation, we would be blocked for a long time. The performance would drop down suddenly for a long time. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: don't wait for all the async delalloc when shrinking delallocMiao Xie1-2/+12
It was very likely that there were lots of async delalloc pages in the filesystem, if we waited until all the pages were flushed, we would be blocked for a long time, and the performance would also drop down. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: fix the confusion between delalloc bytes and metadata bytesMiao Xie1-0/+6
In shrink_delalloc(), what we need reclaim is the metadata space, so flushing pages by to_reclaim is not reasonable, it is very likely that the pages we flush are not enough. And then we had to invoke the flush function for several times, at the worst, we need call flush_space for several times. It wasted time. We improve this problem by converting the metadata space size we need reserve to the delalloc bytes, By this way, we can flush the pages by a reasonable number. (Now we use a fixed number to do conversion, it is not flexible, maybe we can find a good way to improve it in the future.) Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: pick up the code for the item number calculation in flush_space()Miao Xie1-9/+16
This patch picked up the code that was used to calculate the number of the items for which we need reserve space, and we will use it in the next patch. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: wait for the ordered extent only when we wantMiao Xie1-1/+2
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()Miao Xie1-4/+3
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: avoid unnecessary scrub workers allocationWang Shilong1-13/+10
We only allocate scrub workers if we pass all the necessary checks, for example, there are no operation in progress. Besides, move mutex lock protection outside of scrub_workers_get() /scrub_workers_put(). Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: check file extent type before anything elseJosef Bacik1-5/+5
I hit this problem with my no holes patch and it made me realize what the problem was for bz 60834. If the first item in the leaf is an inline extent and we try to read anything starting from disk_bytenr onward we will read off the end of the leaf. So we need to check to see what it's type is, and if it's not REG we can just break out. This should fix this problem. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: Remove useless variable in write_ctree_super()Rashika1-4/+1
The function write_ctree_super() in disk-io.c uses variable ret to return the result of function write_all_supers(). Since, this variable serves no purpose, hence the patch removes it and returns the call of the called function. Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: Fix checkpatch.pl warning of spacing issuesDulshani Gunawardhana9-19/+19
Fix spacing issues detected via checkpatch.pl in accordance with the kernel style guidelines. Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: Replace kmalloc with kmalloc_arrayDulshani Gunawardhana4-5/+5
Replace kmalloc(size * nr, ) with kmalloc_array(nr, size), thus making it easier to check is that the calculation doesn't wrap or return a smaller allocation Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com> Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: Enclose macros with complex values within parenthesisDulshani Gunawardhana1-4/+4
Enclose macros with complex values within parenthesis in accordance to checkpatch.pl. Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com> Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: Use WARN_ON()'s return value in place of WARN_ON(1)Dulshani Gunawardhana14-70/+28
Use WARN_ON()'s return value in place of WARN_ON(1) for cleaner source code that outputs a more descriptive warnings. Also fix the styling warning of redundant braces that came up as a result of this fix. Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com> Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: Remove redundant local zero structureDulshani Gunawardhana1-3/+2
Remove redundant local zero structure, replacing it by the kernel's global ZERO_PAGE. Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com> Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: Pack struct btrfs_deviceDulshani Gunawardhana1-10/+10
Pack the structure btrfs_device in volumes.h to eliminate holes detected by pahole, thus reducing binary memory footprint. Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com> Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: Replace multiple atomic_inc() with atomic_add()Rashika1-4/+4
This patch replaces multiple atomic_inc() with atomic_add() in delayed-inode.c to reduce source code and have few instructions for compilation. Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: Add helper function for free_root_pointers()Rashika1-41/+19
The function free_root_pointers() in disk-io.h contains redundant code. Therefore, this patch adds a helper function free_root_extent_buffers() to free_root_pointers() to eliminate redundancy. Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: fix a crash when running balance and defrag concurrentlyLiu Bo1-0/+3
Running balance and defrag concurrently can end up with a crash: kernel BUG at fs/btrfs/relocation.c:4528! RIP: 0010:[<ffffffffa01ac33b>] [<ffffffffa01ac33b>] btrfs_reloc_cow_block+ 0x1eb/0x230 [btrfs] Call Trace: [<ffffffffa01398c1>] ? update_ref_for_cow+0x241/0x380 [btrfs] [<ffffffffa0180bad>] ? copy_extent_buffer+0xad/0x110 [btrfs] [<ffffffffa0139da1>] __btrfs_cow_block+0x3a1/0x520 [btrfs] [<ffffffffa013a0b6>] btrfs_cow_block+0x116/0x1b0 [btrfs] [<ffffffffa013ddad>] btrfs_search_slot+0x43d/0x970 [btrfs] [<ffffffffa0153c57>] btrfs_lookup_file_extent+0x37/0x40 [btrfs] [<ffffffffa0172a5e>] __btrfs_drop_extents+0x11e/0xae0 [btrfs] [<ffffffffa013b3fd>] ? generic_bin_search.constprop.39+0x8d/0x1a0 [btrfs] [<ffffffff8117d14a>] ? kmem_cache_alloc+0x1da/0x200 [<ffffffffa0138e7a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs] [<ffffffffa0173ef0>] btrfs_drop_extents+0x60/0x90 [btrfs] [<ffffffffa016b24d>] relink_extent_backref+0x2ed/0x780 [btrfs] [<ffffffffa0162fe0>] ? btrfs_submit_bio_hook+0x1e0/0x1e0 [btrfs] [<ffffffffa01b8ed7>] ? iterate_inodes_from_logical+0x87/0xa0 [btrfs] [<ffffffffa016b909>] btrfs_finish_ordered_io+0x229/0xac0 [btrfs] [<ffffffffa016c3b5>] finish_ordered_fn+0x15/0x20 [btrfs] [<ffffffffa018cbe5>] worker_loop+0x125/0x4e0 [btrfs] [<ffffffffa018cac0>] ? btrfs_queue_worker+0x300/0x300 [btrfs] [<ffffffff81075ea0>] kthread+0xc0/0xd0 [<ffffffff81075de0>] ? insert_kthread_work+0x40/0x40 [<ffffffff8164796c>] ret_from_fork+0x7c/0xb0 [<ffffffff81075de0>] ? insert_kthread_work+0x40/0x40 ---------------------------------------------------------------------- It turns out to be that balance operation will bump root's @last_snapshot, which enables snapshot-aware defrag path, and backref walking stuff will find data reloc tree as refs' parent, and hit the BUG_ON() during COW. As data reloc tree's data is just for relocation purpose, and will be deleted right after relocation is done, it's unnecessary to walk those refs belonged to data reloc tree, it'd be better to skip them. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: do not run snapshot-aware defragment on errorLiu Bo1-19/+28
If something wrong happens in write endio, running snapshot-aware defragment can end up with undefined results, maybe a crash, so we should avoid it. In order to share similar code, this also adds a helper to free the struct for snapshot-aware defrag. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: log recovery, don't unlink inode always on errorFilipe David Borba Manana1-1/+4
If we get any error while doing a dir index/item lookup in the log tree, we were always unlinking the corresponding inode in the subvolume. It makes sense to unlink only if the lookup failed to find the dir index/item, which corresponds to NULL or -ENOENT, and not when other errors happen (like a transient -ENOMEM or -EIO). Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: fix csum search offset/length calculation in log treeFilipe David Borba Manana1-7/+7
We were setting the csums search offset and length to the right values if the extent is compressed, but later on right before doing the csums lookup we were overriding these two parameters regardless of compression being set or not for the extent. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: fix verification of dir_itemFilipe David Borba Manana1-2/+4
We were ignoring the name component of the dir_item. Both the name and data must fit within BTRFS_MAX_XATTR_SIZE(root). Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: remove scrub_super_lock holding in btrfs_sync_log()Wang Shilong5-20/+8
Originally, we introduced scrub_super_lock to synchronize tree log code with scrubbing super. However we can replace scrub_super_lock with device_list_mutex, because writing super will hold this mutex, this will reduce an extra lock holding when writing supers in sync log code. Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: use 'u64' rather than 'int' to get extent's generationWang Shilong1-1/+1
We define a 'int' to get extent's generation by mistake,fix it. Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: fix the free space write out failure when there is no data spaceMiao Xie1-3/+12
After running space balance on a new fs, the fs check program outputed the following warning message: free space inode generation (0) did not match free space cache generation (20) Steps to reproduce: # mkfs.btrfs -f <dev> # mount <dev> <mnt> # btrfs balance start <mnt> # umount <mnt> # btrfs check <dev> It was because there was no data space after the space balance, and the free space write out task didn't try to allocate a new data chunk for the free space inode when doing the reservation. So the data space reservation failed, and in order to tell the free space loader that this free space inode could not be trusted, the generation of the free space inode wasn't updated. Then the check program found this problem and outputed the above message. But in fact, it is safe that we try to allocate a new data chunk when we find the data space is not enough. The patch fixes the above problem by this way. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: stop committing the transaction so much during relocateJosef Bacik1-20/+13
I noticed with my horrible snapshot excercisor that we were taking forever to relocate the larger the file system got. This appeared to be because we were committing the transaction _constantly_. There were a few places where we do braindead things with metadata reservation, like start a transaction and then try to refill the block rsv, which not only keeps us from committing a transaction during the enospc stuff, but keeps us from doing some of the harder flushing work which will make us more likely to need to commit the transaction. We also were checking the block rsv and committing the transaction if the block rsv was below a certain threshold, but we were doing this in a place where we don't actually keep anything in the block rsv so this was always ending up false so we always committed the transaction in this case. I tested this to make sure it didn't break anything, but it takes about 10 hours to get the box to this state so I don't know how much of an impact it will really make. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: make sure the delalloc workers actually flush compressed writesJosef Bacik1-6/+12
When using delalloc workers in a non-waiting way (like for enospc handling) we can end up not actually waiting for the dirty pages to be started if we have compression. We need to add an extra filemap flush to make sure any async extents that have started are actually moved along before returning. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: take ordered root lock when removing ordered operations inodeJosef Bacik1-0/+2
A user reported a list corruption warning from btrfs_remove_ordered_extent, it is because we aren't taking the ordered_root_lock when we remove the inode from the ordered operations list. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: don't abort transaction in run_delalloc_nocowJosef Bacik1-15/+5
This is just the write path, the only reason we start a transaction is so we can check cross references, we don't make any actual changes, so there is no reason to abort the transaction if we fail. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: do not bug_on if we try to cow a free space cache inodeJosef Bacik1-1/+4
We can just return an error and we'll bail out properly. We still want to catch this case to make sure we don't have a bug somewhere, so just warn if this pops up. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: return an error from btrfs_wait_ordered_rangeJosef Bacik6-41/+78
I noticed that if the free space cache has an error writing out it's data it won't actually error out, it will just carry on. This is because it doesn't check the return value of btrfs_wait_ordered_range, which didn't actually return anything. So fix this in order to keep us from making free space cache look valid when it really isnt. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: stop using vfs_read in sendJosef Bacik1-103/+72
Apparently we don't actually close the files until we return to userspace, so stop using vfs_read in send. This is actually better for us since we can avoid all the extra logic of holding the file we're sending open and making sure to clean it up. This will fix people who have been hitting too many files open errors when trying to send. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: check_int, remove warning for mixed-modeStefan Behrens1-4/+5
In mixed-mode, when a data-block was later reused for metadata, a warning was printed. This condition is now filtered out and the warning is eliminated in this case. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: fix check_int 'leaf item out of bounce' regressionStefan Behrens1-1/+1
Yet another cleanup patch broke code for which no xfstest exists. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11Btrfs: optimize extent item search in run_delayed_extent_opFilipe David Borba Manana1-7/+20
Instead of doing another extent tree search if the first search failed to find a metadata item, check if the previous item in the leaf is an extent item and use it if it is, otherwise do the second tree search for an extent item. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: add tracing for failed reservationsJeff Mahoney1-0/+7
When debugging ENOSPC issues, it's nice to be able to see which reservations failed as well as the ones which succeeded. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: remove fs/btrfs/compat.hZach Brown14-26/+6
fs/btrfs/compat.h only contained trivial macro wrappers of drop_nlink() and inc_nlink(). This doesn't belong in mainline. Signed-off-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: remove move_pages()Zach Brown1-18/+1
move_pages() has an inefficient backwards byte copy of regions of two different pages. They're different pages so the regions won't overlap and it could use memcpy(). At that point, though, move_pages() would be a slightly dimmer re-implementation of copy_pages() that lacked the test for overlapping page regions. So remove move_pages() and just call copy_pages(). Signed-off-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-11-11btrfs: use get_seconds() instead of btrfs wrapperZach Brown1-12/+4
Signed-off-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>