summaryrefslogtreecommitdiffstats
path: root/fs/xfs/libxfs
AgeCommit message (Collapse)AuthorFilesLines
2020-01-26xfs: remove unnecessary null pointer checks from _read_agf callersDarrick J. Wong1-6/+0
Drop the null buffer pointer checks in all code that calls xfs_alloc_read_agf and doesn't pass XFS_ALLOC_FLAG_TRYLOCK because they're no longer necessary. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_*read_agf return EAGAIN to ALLOC_FLAG_TRYLOCK callersDarrick J. Wong2-27/+20
Refactor xfs_read_agf and xfs_alloc_read_agf to return EAGAIN if the caller passed TRYLOCK and we weren't able to get the lock; and change the callers to recognize this. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: remove the xfs_btree_get_buf[ls] functionsDarrick J. Wong4-79/+18
Remove the xfs_btree_get_bufs and xfs_btree_get_bufl functions, since they're pretty trivial oneliners. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_trans_get_buf return an error codeDarrick J. Wong3-17/+27
Convert xfs_trans_get_buf() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_trans_get_buf_map return an error codeDarrick J. Wong1-6/+2
Convert xfs_trans_get_buf_map() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_buf_read return an error codeDarrick J. Wong1-4/+4
Convert xfs_buf_read() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_buf_get_uncached return an error codeDarrick J. Wong1-9/+12
Convert xfs_buf_get_uncached() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_buf_get return an error codeDarrick J. Wong2-7/+7
Convert xfs_buf_get() to return numeric error codes like most everywhere else in xfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: make xfs_buf_read_map return an error codeDarrick J. Wong2-14/+7
Convert xfs_buf_read_map() to return numeric error codes like most everywhere else in xfs. This involves moving the open-coded logic that reports metadata IO read / corruption errors and stales the buffer into xfs_buf_read_map so that the logic is all in one place. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-16xfs: make struct xfs_buf_log_format have a consistent sizeDarrick J. Wong1-5/+14
Increase XFS_BLF_DATAMAP_SIZE by 1 to fill in the implied padding at the end of struct xfs_buf_log_format. This makes the size consistent so that we can check it in xfs_ondisk.h, and will be needed once we start logging attribute values. On amd64 we get the following pahole: struct xfs_buf_log_format { short unsigned int blf_type; /* 0 2 */ short unsigned int blf_size; /* 2 2 */ short unsigned int blf_flags; /* 4 2 */ short unsigned int blf_len; /* 6 2 */ long long int blf_blkno; /* 8 8 */ unsigned int blf_map_size; /* 16 4 */ unsigned int blf_data_map[16]; /* 20 64 */ /* --- cacheline 1 boundary (64 bytes) was 20 bytes ago --- */ /* size: 88, cachelines: 2, members: 7 */ /* padding: 4 */ /* last cacheline: 24 bytes */ }; But on i386 we get the following: struct xfs_buf_log_format { short unsigned int blf_type; /* 0 2 */ short unsigned int blf_size; /* 2 2 */ short unsigned int blf_flags; /* 4 2 */ short unsigned int blf_len; /* 6 2 */ long long int blf_blkno; /* 8 8 */ unsigned int blf_map_size; /* 16 4 */ unsigned int blf_data_map[16]; /* 20 64 */ /* --- cacheline 1 boundary (64 bytes) was 20 bytes ago --- */ /* size: 84, cachelines: 2, members: 7 */ /* last cacheline: 20 bytes */ }; Notice how the amd64 compiler inserts 4 bytes of padding to the end of the structure to ensure 8-byte alignment. Prior to "xfs: fix memory corruption during remote attr value buffer invalidation" we would try to write to blf_data_map[17], which is harmless on amd64 but really bad on i386. This shouldn't cause any changes in the ondisk logging formats because the log code writes out the log vectors with the appropriate size for the log item's map_size, and log recovery treats the data_map array as a VLA. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-16xfs: streamline xfs_attr3_leaf_inactiveDarrick J. Wong1-9/+0
Now that we know we don't have to take a transaction to stale the incore buffers for a remote value, get rid of the unnecessary memory allocation in the leaf walker and call the rmt_stale function directly. Flatten the loop while we're at it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-16xfs: fix memory corruption during remote attr value buffer invalidationDarrick J. Wong1-6/+31
While running generic/103, I observed what looks like memory corruption and (with slub debugging turned on) a slub redzone warning on i386 when inactivating an inode with a 64k remote attr value. On a v5 filesystem, maximally sized remote attr values require one block more than 64k worth of space to hold both the remote attribute value header (64 bytes). On a 4k block filesystem this results in a 68k buffer; on a 64k block filesystem, this would be a 128k buffer. Note that even though we'll never use more than 65,600 bytes of this buffer, XFS_MAX_BLOCKSIZE is 64k. This is a problem because the definition of struct xfs_buf_log_format allows for XFS_MAX_BLOCKSIZE worth of dirty bitmap (64k). On i386 when we invalidate a remote attribute, xfs_trans_binval zeroes all 68k worth of the dirty map, writing right off the end of the log item and corrupting memory. We've gotten away with this on x86_64 for years because the compiler inserts a u32 padding on the end of struct xfs_buf_log_format. Fortunately for us, remote attribute values are written to disk with xfs_bwrite(), which is to say that they are not logged. Fix the problem by removing all places where we could end up creating a buffer log item for a remote attribute value and leave a note explaining why. Next, replace the open-coded buffer invalidation with a call to the helper we created in the previous patch that does better checking for bad metadata before marking the buffer stale. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-16xfs: refactor remote attr value buffer invalidationDarrick J. Wong2-20/+34
Hoist the code that invalidates remote extended attribute value buffers into a separate helper function. This prepares us for a memory corruption fix in the next patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-15xfs: Add __packed to xfs_dir2_sf_entry_t definitionVincenzo Frascino1-1/+1
xfs_check_ondisk_structs() verifies that the sizes of the data types used by xfs are correct via the XFS_CHECK_STRUCT_SIZE() macro. Since the structures padding can vary depending on the ABI (e.g. on ARM OABI structures are padded to multiple of 32 bits), it may happen that xfs_dir2_sf_entry_t size check breaks the compilation with the assertion below: In file included from linux/include/linux/string.h:6, from linux/include/linux/uuid.h:12, from linux/fs/xfs/xfs_linux.h:10, from linux/fs/xfs/xfs.h:22, from linux/fs/xfs/xfs_super.c:7: In function ‘xfs_check_ondisk_structs’, inlined from ‘init_xfs_fs’ at linux/fs/xfs/xfs_super.c:2025:2: linux/include/linux/compiler.h:350:38: error: call to ‘__compiletime_assert_107’ declared with attribute error: XFS: sizeof(xfs_dir2_sf_entry_t) is wrong, expected 3 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) Restore the correct behavior adding __packed to the structure definition. Cc: Darrick J. Wong <darrick.wong@oracle.com> Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-14xfs: introduce XFS_MAX_FILEOFFDarrick J. Wong1-0/+7
Introduce a new #define for the maximum supported file block offset. We'll use this in the next patch to make it more obvious that we're doing some operation for all possible inode fork mappings after a given offset. We can't use ULLONG_MAX here because bunmapi uses that to detect when it's done. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-01-09xfs: Remove all strlen in all xfs_attr_* functions for attr names.Allison Henderson2-7/+13
This helps to pre-simplify the extra handling of the null terminator in delayed operations which use memcpy rather than strlen. Later when we introduce parent pointers, attribute names will become binary, so strlen will not work at all. Removing uses of strlen now will help reduce complexities later Signed-off-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-09xfs: fix misuse of the XFS_ATTR_INCOMPLETE flagChristoph Hellwig4-6/+6
XFS_ATTR_INCOMPLETE is a flag in the on-disk attribute format, and thus in a different namespace as the ATTR_* flags in xfs_da_args.flags. Switch to using a XFS_DA_OP_INCOMPLETE flag in op_flags instead. Without this users might be able to inject this flag into operations using the attr by handle ioctl. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-09xfs: clear kernel only flags in XFS_IOC_ATTRMULTI_BY_HANDLEChristoph Hellwig1-2/+5
Don't allow passing arbitrary flags as they change behavior including memory allocation that the call stack is not prepared for. Fixes: ddbca70cc45c ("xfs: allocate xattr buffer on demand") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-07xfs: remove shadow variable in xfs_btree_lshiftEric Sandeen1-2/+0
Sparse warns about a shadow variable in this function after the Fixed: commit added another int i; with larger scope. It's safe to remove the one with the smaller scope to fix this shadow, although the shadow itself is harmless. Fixes: 2c813ad66a72 ("xfs: support btrees with overlapping intervals for keys") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-20xfs: Make the symbol 'xfs_rtalloc_log_count' staticChen Wandun1-1/+1
Fix the following sparse warning: fs/xfs/libxfs/xfs_trans_resv.c:206:1: warning: symbol 'xfs_rtalloc_log_count' was not declared. Should it be static? Fixes: b1de6fc7520f ("xfs: fix log reservation overflows when allocating large rt extents") Signed-off-by: Chen Wandun <chenwandun@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-19xfs: don't commit sunit/swidth updates to disk if that would cause repair ↵Darrick J. Wong2-0/+65
failures Alex Lyakas reported[1] that mounting an xfs filesystem with new sunit and swidth values could cause xfs_repair to fail loudly. The problem here is that repair calculates the where mkfs should have allocated the root inode, based on the superblock geometry. The allocation decisions depend on sunit, which means that we really can't go updating sunit if it would lead to a subsequent repair failure on an otherwise correct filesystem. Port from xfs_repair some code that computes the location of the root inode and teach mount to skip the ondisk update if it would cause problems for repair. Along the way we'll update the documentation, provide a function for computing the minimum AGFL size instead of open-coding it, and cut down some indenting in the mount code. Note that we allow the mount to proceed (and new allocations will reflect this new geometry) because we've never screened this kind of thing before. We'll have to wait for a new future incompat feature to enforce correct behavior, alas. Note that the geometry reporting always uses the superblock values, not the incore ones, so that is what xfs_info and xfs_growfs will report. [1] https://lore.kernel.org/linux-xfs/20191125130744.GA44777@bfoster/T/#m00f9594b511e076e2fcdd489d78bc30216d72a7d Reported-by: Alex Lyakas <alex@zadara.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-19xfs: refactor agfl length computation functionDarrick J. Wong1-5/+13
Refactor xfs_alloc_min_freelist to accept a NULL @pag argument, in which case it returns the largest possible minimum length. This will be used in an upcoming patch to compute the length of the AGFL at mkfs time. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-19libxfs: resync with the userspace libxfsDarrick J. Wong4-24/+34
Prepare to resync the userspace libxfs with the kernel libxfs. There were a few things I missed -- a couple of static inline directory functions that have to be exported for xfs_repair; a couple of directory naming functions that make porting much easier if they're /not/ static inline; and a u16 usage that should have been uint16_t. None of these things are bugs in their own right; this just makes porting xfsprogs easier. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2019-12-17xfs: fix log reservation overflows when allocating large rt extentsDarrick J. Wong1-19/+77
Omar Sandoval reported that a 4G fallocate on the realtime device causes filesystem shutdowns due to a log reservation overflow that happens when we log the rtbitmap updates. Factor rtbitmap/rtsummary updates into the the tr_write and tr_itruncate log reservation calculation. "The following reproducer results in a transaction log overrun warning for me: mkfs.xfs -f -r rtdev=/dev/vdc -d rtinherit=1 -m reflink=0 /dev/vdb mount -o rtdev=/dev/vdc /dev/vdb /mnt fallocate -l 4G /mnt/foo Reported-by: Omar Sandoval <osandov@osandov.com> Tested-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-11xfs: stabilize insert range start boundary to avoid COW writeback raceBrian Foster1-2/+1
generic/522 (fsx) occasionally fails with a file corruption due to an insert range operation. The primary characteristic of the corruption is a misplaced insert range operation that differs from the requested target offset. The reason for this behavior is a race between the extent shift sequence of an insert range and a COW writeback completion that causes a front merge with the first extent in the shift. The shift preparation function flushes and unmaps from the target offset of the operation to the end of the file to ensure no modifications can be made and page cache is invalidated before file data is shifted. An insert range operation then splits the extent at the target offset, if necessary, and begins to shift the start offset of each extent starting from the end of the file to the start offset. The shift sequence operates at extent level and so depends on the preparation sequence to guarantee no changes can be made to the target range during the shift. If the block immediately prior to the target offset was dirty and shared, however, it can undergo writeback and move from the COW fork to the data fork at any point during the shift. If the block is contiguous with the block at the start offset of the insert range, it can front merge and alter the start offset of the extent. Once the shift sequence reaches the target offset, it shifts based on the latest start offset and silently changes the target offset of the operation and corrupts the file. To address this problem, update the shift preparation code to stabilize the start boundary along with the full range of the insert. Also update the existing corruption check to fail if any extent is shifted with a start offset behind the target offset of the insert range. This prevents insert from racing with COW writeback completion and fails loudly in the event of an unexpected extent shift. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-07Merge tag 'xfs-5.5-merge-17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-12/+15
Pull xfs fixes from Darrick Wong: "Fix a couple of resource management errors and a hang: - fix a crash in the log setup code when log mounting fails - fix a hang when allocating space on the realtime device - fix a block leak when freeing space on the realtime device" * tag 'xfs-5.5-merge-17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix mount failure crash on invalid iclog memory access xfs: don't check for AG deadlock for realtime files in bunmapi xfs: fix realtime file data space leak
2019-12-02xfs: don't check for AG deadlock for realtime files in bunmapiOmar Sandoval1-1/+1
Commit 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi") added a check in __xfs_bunmapi() to stop early if we would touch multiple AGs in the wrong order. However, this check isn't applicable for realtime files. In most cases, it just makes us do unnecessary commits. However, without the fix from the previous commit ("xfs: fix realtime file data space leak"), if the last and second-to-last extents also happen to have different "AG numbers", then the break actually causes __xfs_bunmapi() to return without making any progress, which sends xfs_itruncate_extents_flags() into an infinite loop. Fixes: 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi") Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-02xfs: fix realtime file data space leakOmar Sandoval1-11/+14
Realtime files in XFS allocate extents in rextsize units. However, the written/unwritten state of those extents is still tracked in blocksize units. Therefore, a realtime file can be split up into written and unwritten extents that are not necessarily aligned to the realtime extent size. __xfs_bunmapi() has some logic to handle these various corner cases. Consider how it handles the following case: 1. The last extent is unwritten. 2. The last extent is smaller than the realtime extent size. 3. startblock of the last extent is not aligned to the realtime extent size, but startblock + blockcount is. In this case, __xfs_bunmapi() calls xfs_bmap_add_extent_unwritten_real() to set the second-to-last extent to unwritten. This should merge the last and second-to-last extents, so __xfs_bunmapi() moves on to the second-to-last extent. However, if the size of the last and second-to-last extents combined is greater than MAXEXTLEN, xfs_bmap_add_extent_unwritten_real() does not merge the two extents. When that happens, __xfs_bunmapi() skips past the last extent without unmapping it, thus leaking the space. Fix it by only unwriting the minimum amount needed to align the last extent to the realtime extent size, which is guaranteed to merge with the last extent. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-02Merge tag 'xfs-5.5-merge-16' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds42-3271/+3324
Pull XFS updates from Darrick Wong: "For this release, we changed quite a few things. Highlights: - Fixed some long tail latency problems in the block allocator - Removed some long deprecated (and for the past several years no-op) mount options and ioctls - Strengthened the extended attribute and directory verifiers - Audited and fixed all the places where we could return EFSCORRUPTED without logging anything - Refactored the old SGI space allocation ioctls to make the equivalent fallocate calls - Fixed a race between fallocate and directio - Fixed an integer overflow when files have more than a few billion(!) extents - Fixed a longstanding bug where quota accounting could be incorrect when performing unwritten extent conversion on a freshly mounted fs - Fixed various complaints in scrub about soft lockups and unresponsiveness to signals - De-vtable'd the directory handling code, which should make it faster - Converted to the new mount api, for better or for worse - Cleaned up some memory leaks and quite a lot of other smaller fixes and cleanups. A more detailed summary: - Fill out the build string - Prevent inode fork extent count overflows - Refactor the allocator to reduce long tail latency - Rework incore log locking a little to reduce spinning - Break up the xfs_iomap_begin functions into smaller more cohesive parts - Fix allocation alignment being dropped too early when the allocation request is for more blocks than an AG is large - Other small cleanups - Clean up file buftarg retrieval helpers - Hoist the resvsp and unresvsp ioctls to the vfs - Remove the undocumented biosize mount option, since it has never been mentioned as existing or supported on linux - Clean up some of the mount option printing and parsing - Enhance attr leaf verifier to check block structure - Check dirent and attr names for invalid characters before passing them to the vfs - Refactor open-coded bmbt walking - Fix a few places where we return EIO instead of EFSCORRUPTED after failing metadata sanity checks - Fix a synchronization problem between fallocate and aio dio corrupting the file length - Clean up various loose ends in the iomap and bmap code - Convert to the new mount api - Make sure we always log something when returning EFSCORRUPTED - Fix some problems where long running scrub loops could trigger soft lockup warnings and/or fail to exit due to fatal signals pending - Fix various Coverity complaints - Remove most of the function pointers from the directory code to reduce indirection penalties - Ensure that dquots are attached to the inode when performing unwritten extent conversion after io - Deuglify incore projid and crtime types - Fix another AGI/AGF locking order deadlock when renaming - Clean up some quota typedefs - Remove the FSSETDM ioctls which haven't done anything in 20 years - Fix some memory leaks when mounting the log fails - Fix an underflow when updating an xattr leaf freemap - Remove some trivial wrappers - Report metadata corruption as an error, not a (potentially) fatal assertion - Clean up the dir/attr buffer mapping code - Allow fatal signals to kill scrub during parent pointer checks" * tag 'xfs-5.5-merge-16' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (198 commits) xfs: allow parent directory scans to be interrupted with fatal signals xfs: remove the mappedbno argument to xfs_da_get_buf xfs: remove the mappedbno argument to xfs_da_read_buf xfs: split xfs_da3_node_read xfs: remove the mappedbno argument to xfs_dir3_leafn_read xfs: remove the mappedbno argument to xfs_dir3_leaf_read xfs: remove the mappedbno argument to xfs_attr3_leaf_read xfs: remove the mappedbno argument to xfs_da_reada_buf xfs: improve the xfs_dabuf_map calling conventions xfs: refactor xfs_dabuf_map xfs: simplify mappedbno handling in xfs_da_{get,read}_buf xfs: report corruption only as a regular error xfs: Remove kmem_zone_free() wrapper xfs: Remove kmem_zone_destroy() wrapper xfs: Remove slab init wrappers xfs: fix attr leaf header freemap.size underflow xfs: fix some memory leaks in log recovery xfs: fix another missing include xfs: remove XFS_IOC_FSSETDM and XFS_IOC_FSSETDM_BY_HANDLE xfs: remove duplicated include from xfs_dir2_data.c ...
2019-11-30Merge tag 'iomap-5.5-merge-11' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds2-6/+11
Pull iomap updates from Darrick Wong: "In this release, we hoisted as much of XFS' writeback code into iomap as was practicable, refactored the unshare file data function, added the ability to perform buffered io copy on write, and tweaked various parts of the directio implementation as needed to port ext4's directio code (that will be a separate pull). Summary: - Make iomap_dio_rw callers explicitly tell us if they want us to wait - Port the xfs writeback code to iomap to complete the buffered io library functions - Refactor the unshare code to share common pieces - Add support for performing copy on write with buffered writes - Other minor fixes - Fix unchecked return in iomap_bmap - Fix a type casting bug in a ternary statement in iomap_dio_bio_actor - Improve tracepoints for easier diagnostic ability - Fix pipe page leakage in directio reads" * tag 'iomap-5.5-merge-11' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (31 commits) iomap: Fix pipe page leakage during splicing iomap: trace iomap_appply results iomap: fix return value of iomap_dio_bio_actor on 32bit systems iomap: iomap_bmap should check iomap_apply return value iomap: Fix overflow in iomap_page_mkwrite fs/iomap: remove redundant check in iomap_dio_rw() iomap: use a srcmap for a read-modify-write I/O iomap: renumber IOMAP_HOLE to 0 iomap: use write_begin to read pages to unshare iomap: move the zeroing case out of iomap_read_page_sync iomap: ignore non-shared or non-data blocks in xfs_file_dirty iomap: always use AOP_FLAG_NOFS in iomap_write_begin iomap: remove the unused iomap argument to __iomap_write_end iomap: better document the IOMAP_F_* flags iomap: enhance writeback error message iomap: pass a struct page to iomap_finish_page_writeback iomap: cleanup iomap_ioend_compare iomap: move struct iomap_page out of iomap.h iomap: warn on inline maps in iomap_writepage_map iomap: lift the xfs writeback code to iomap ...
2019-11-22xfs: remove the mappedbno argument to xfs_da_get_bufChristoph Hellwig6-22/+9
Use the xfs_da_get_buf_daddr function directly for the two callers that pass a mapped disk address, and then remove the mappedbno argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_da_read_bufChristoph Hellwig8-43/+39
Move the code for reading an already mapped block into xfs_da3_node_read_mapped, which is the only caller ever passing a block number in the mappedbno argument and replace the mappedbno argument with the simple xfs_dabuf_get flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: split xfs_da3_node_readChristoph Hellwig3-56/+75
Split xfs_da3_node_read into one variant that always looks up the daddr and doesn't accept holes, and one that already has a daddr at hand. This is in preparation of splitting up xfs_da_read_buf in a similar way. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_dir3_leafn_readChristoph Hellwig3-7/+5
This argument is always hard coded to -1, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_dir3_leaf_readChristoph Hellwig2-7/+6
This argument is always hard coded to -1, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_attr3_leaf_readChristoph Hellwig3-16/+14
This argument is always hard coded to -1, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: remove the mappedbno argument to xfs_da_reada_bufChristoph Hellwig4-15/+9
Replace the mappedbno argument with the simple flags for xfs_da_reada_buf and xfs_dir3_data_readahead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: improve the xfs_dabuf_map calling conventionsChristoph Hellwig2-29/+17
Use a flags argument with the XFS_DABUF_MAP_HOLE_OK flag to signal that a hole is okay and not corruption, and return 0 with *nmap set to 0 to signal that case in the return value instead of a nameless -1 return code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: refactor xfs_dabuf_mapChristoph Hellwig1-102/+54
Merge xfs_buf_map_from_irec and xfs_da_map_covers_blocks into a single loop in the caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-22xfs: simplify mappedbno handling in xfs_da_{get,read}_bufChristoph Hellwig1-52/+51
Shortcut the creation of xfs_bmbt_irec and xfs_buf_map for the case where the callers passed an already mapped xfs_daddr_t. This is in preparation for splitting these cases out entirely later. Also reject the mappedbno case for xfs_da_reada_buf as no callers currently uses it and it will be removed soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-18xfs: Remove kmem_zone_free() wrapperCarlos Maiolino3-6/+6
We can remove it now, without needing to rework the KM_ flags. Use kmem_cache_free() directly. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-15xfs: fix attr leaf header freemap.size underflowBrian Foster1-1/+3
The leaf format xattr addition helper xfs_attr3_leaf_add_work() adjusts the block freemap in a couple places. The first update drops the size of the freemap that the caller had already selected to place the xattr name/value data. Before the function returns, it also checks whether the entries array has encroached on a freemap range by virtue of the new entry addition. This is necessary because the entries array grows from the start of the block (but end of the block header) towards the end of the block while the name/value data grows from the end of the block in the opposite direction. If the associated freemap is already empty, however, size is zero and the subtraction underflows the field and causes corruption. This is reproduced rarely by generic/070. The observed behavior is that a smaller sized freemap is aligned to the end of the entries list, several subsequent xattr additions land in larger freemaps and the entries list expands into the smaller freemap until it is fully consumed and then underflows. Note that it is not otherwise a corruption for the entries array to consume an empty freemap because the nameval list (i.e. the firstused pointer in the xattr header) starts beyond the end of the corrupted freemap. Update the freemap size modification to account for the fact that the freemap entry can be empty and thus stale. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: remove duplicated include from xfs_dir2_data.cYueHaibing1-1/+0
Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: remove unused structure members & simple typedefsEric Sandeen1-2/+0
Remove some unused typedef'd simple types, and some unused structure members. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: remove unused typedef definitionsEric Sandeen2-4/+4
Remove some typdefs for type_t's that are no longer referred to by their typedef'd types. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: remove the xfs_qoff_logitem_t typedefPavel Reichl1-2/+2
Signed-off-by: Pavel Reichl <preichl@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix a comment] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: remove the xfs_disk_dquot_t and xfs_dquot_tPavel Reichl3-10/+10
Signed-off-by: Pavel Reichl <preichl@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix some of the comments] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: avoid time_t in user apiArnd Bergmann1-1/+1
The ioctl definitions for XFS_IOC_SWAPEXT, XFS_IOC_FSBULKSTAT and XFS_IOC_FSBULKSTAT_SINGLE are part of libxfs and based on time_t. The definition for time_t differs between current kernels and coming 32-bit libc variants that define it as 64-bit. For most ioctls, that means the kernel has to be able to handle two different command codes based on the different structure sizes. The same solution could be applied for XFS_IOC_SWAPEXT, but it would not work for XFS_IOC_FSBULKSTAT and XFS_IOC_FSBULKSTAT_SINGLE because the structure with the time_t is passed through an indirect pointer, and the command number itself is based on struct xfs_fsop_bulkreq, which does not differ based on time_t. This means any solution that can be applied requires a change of the ABI definition in the xfs_fs.h header file, as well as doing the same change in any user application that contains a copy of this header. The usual solution would be to define a replacement structure and use conditional compilation for the ioctl command codes to use one or the other, such as #define XFS_IOC_FSBULKSTAT_OLD _IOWR('X', 101, struct xfs_fsop_bulkreq) #define XFS_IOC_FSBULKSTAT_NEW _IOWR('X', 129, struct xfs_fsop_bulkreq) #define XFS_IOC_FSBULKSTAT ((sizeof(time_t) == sizeof(__kernel_long_t)) ? \ XFS_IOC_FSBULKSTAT_OLD : XFS_IOC_FSBULKSTAT_NEW) After this, the kernel would be able to implement both XFS_IOC_FSBULKSTAT_OLD and XFS_IOC_FSBULKSTAT_NEW handlers on 32-bit architectures with the correct ABI for either definition of time_t. However, as long as two observations are true, a much simpler solution can be used: 1. xfsprogs is the only user space project that has a copy of this header 2. xfsprogs already has a replacement for all three affected ioctl commands, based on the xfs_bulkstat structure to pass 64-bit timestamps regardless of the architecture Based on those assumptions, changing xfs_bstime to use __kernel_long_t instead of time_t in both the kernel and in xfsprogs preserves the current ABI for any libc definition of time_t and solves the problem of passing 64-bit timestamps to 32-bit user space. If either of the two assumptions is invalid, more discussion is needed for coming up with a way to fix as much of the affected user space code as possible. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: Fix deadlock between AGI and AGF when target_ip exists in xfs_rename()kaixuxia2-5/+25
When target_ip exists in xfs_rename(), the xfs_dir_replace() call may need to hold the AGF lock to allocate more blocks, and then invoking the xfs_droplink() call to hold AGI lock to drop target_ip onto the unlinked list, so we get the lock order AGF->AGI. This would break the ordering constraint on AGI and AGF locking - inode allocation locks the AGI, then can allocate a new extent for new inodes, locking the AGF after the AGI. In this patch we check whether the replace operation need more blocks firstly. If so, acquire the agi lock firstly to preserve locking order(AGI/AGF). Actually, the locking order problem only occurs when we are locking the AGI/AGF of the same AG. For multiple AGs the AGI lock will be released after the transaction committed. Signed-off-by: kaixuxia <kaixuxia@tencent.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: reword the comment] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: don't reset the "inode core" in xfs_ireadChristoph Hellwig1-2/+0
We have the exact same memset in xfs_inode_alloc, which is always called just before xfs_iread. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>