summaryrefslogtreecommitdiffstats
path: root/fs/f2fs
AgeCommit message (Collapse)AuthorFilesLines
2021-07-13f2fs: compress: fix to set zstd compress level correctlyChao Yu1-1/+2
As 5kft reported in [1]: set_compress_context() should set compress level into .i_compress_flag for zstd as well as lz4hc, otherwise, zstd compressor will still use default zstd compress level during compression, fix it. [1] https://lore.kernel.org/linux-f2fs-devel/8e29f52b-6b0d-45ec-9520-e63eb254287a@www.fastmail.com/T/#u Fixes: 3fde13f817e2 ("f2fs: compress: support compress level") Reported-by: 5kft <5kft@5kft.org> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-07-13f2fs: add sysfs nodes to get GC info for each GC modeDaeho Jeong4-0/+43
Added gc_reclaimed_segments and gc_segment_mode sysfs nodes. 1) "gc_reclaimed_segments" shows how many segments have been reclaimed by GC during a specific GC mode. 2) "gc_segment_mode" is used to control for which gc mode the "gc_reclaimed_segments" node shows. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-07-13f2fs: Convert to using invalidate_lockJan Kara4-38/+34
Use invalidate_lock instead of f2fs' private i_mmap_sem. The intended purpose is exactly the same. By this conversion we fix a long standing race between hole punching and read(2) / readahead(2) paths that can lead to stale page cache contents. CC: Jaegeuk Kim <jaegeuk@kernel.org> CC: Chao Yu <yuchao0@huawei.com> CC: linux-f2fs-devel@lists.sourceforge.net Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-06f2fs: drop dirty node pages when cp is in error statusJaegeuk Kim1-7/+4
Otherwise, writeback is going to fall in a loop to flush dirty inode forever before getting SBI_CLOSING. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-07-05f2fs: initialize page->private when using for our internal useJaegeuk Kim2-1/+6
We need to guarantee it's initially zero. Otherwise, it'll hurt entire flag operations. Fixes: b763f3bedc2d ("f2fs: restructure f2fs page.private layout") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-07-01f2fs: compress: add nocompress extensions supportFengnan Chang3-6/+95
When we create a directory with enable compression, all file write into directory will try to compress.But sometimes we may know, new file cannot meet compression ratio requirements. We need a nocompress extension to skip those files to avoid unnecessary compress page test. After add nocompress_extension, the priority should be: dir_flag < comp_extention,nocompress_extension < comp_file_flag, no_comp_file_flag. Priority in between FS_COMPR_FL, FS_NOCOMP_FS, extensions: * compress_extension=so; nocompress_extension=zip; chattr +c dir; touch dir/foo.so; touch dir/bar.zip; touch dir/baz.txt; then foo.so and baz.txt should be compresse, bar.zip should be non-compressed. chattr +c dir/bar.zip can enable compress on bar.zip. * compress_extension=so; nocompress_extension=zip; chattr -c dir; touch dir/foo.so; touch dir/bar.zip; touch dir/baz.txt; then foo.so should be compresse, bar.zip and baz.txt should be non-compressed. chattr+c dir/bar.zip; chattr+c dir/baz.txt; can enable compress on bar.zip and baz.txt. Signed-off-by: Fengnan Chang <changfengnan@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-28f2fs: remove false alarm on iget failure during GCJaegeuk Kim1-3/+1
This patch removes setting SBI_NEED_FSCK when GC gets an error on f2fs_iget, since f2fs_iget can give ENOMEM and others by race condition. If we set this critical fsck flag, we'll get EIO during fsync via the below code path. In f2fs_inplace_write_data(), if (is_sbi_flag_set(sbi, SBI_NEED_FSCK) || f2fs_cp_error(sbi)) { err = -EIO; goto drop_bio; } Fixes: 9557727876674 ("f2fs: drop inplace IO if fs status is abnormal") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-28f2fs: enable extent cache for compression files in read-onlyDaeho Jeong1-19/+20
Let's allow extent cache for RO partition. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: introduce f2fs_casefolded_name slab cacheChao Yu3-7/+40
Add a slab cache: "f2fs_casefolded_name" for memory allocation of casefold name. Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: swap: support migrating swapfile in aligned write modeChao Yu4-14/+101
This patch supports to migrate swapfile in aligned write mode during swapon in order to keep swapfile being aligned to section as much as possible, then pinned swapfile will locates fully filled section which may not affected by GC. However, for the case that swapfile's size is not aligned to section size, it will still leave last extent in file's tail as unaligned due to its size is smaller than section size, like case #2. case #1 xfs_io -f /mnt/f2fs/file -c "pwrite 0 4M" -c "fsync" Before swapon: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..3047]: 1123352..1126399 3048 0x1000 1: [3048..7143]: 237568..241663 4096 0x1000 2: [7144..8191]: 245760..246807 1048 0x1001 After swapon: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..8191]: 249856..258047 8192 0x1001 Kmsg: F2FS-fs (zram0): Swapfile (2) is not align to section: 1) creat(), 2) ioctl(F2FS_IOC_SET_PIN_FILE), 3) fallocate(2097152 * n) case #2 xfs_io -f /mnt/f2fs/file -c "pwrite 0 3M" -c "fsync" Before swapon: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..3047]: 246808..249855 3048 0x1000 1: [3048..6143]: 237568..240663 3096 0x1001 After swapon: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..4095]: 258048..262143 4096 0x1000 1: [4096..6143]: 238616..240663 2048 0x1001 Kmsg: F2FS-fs (zram0): Swapfile: last extent is not aligned to section F2FS-fs (zram0): Swapfile (2) is not align to section: 1) creat(), 2) ioctl(F2FS_IOC_SET_PIN_FILE), 3) fallocate(2097152 * n) Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: swap: remove dead codesChao Yu1-167/+1
After commit af4b6b8edf6a ("f2fs: introduce check_swap_activate_fast()"), we will never run into original logic of check_swap_activate() before f2fs supports non 4k-sized page, so let's delete those dead codes. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: compress: add compress_inode to cache compressed blocksChao Yu10-14/+357
Support to use address space of inner inode to cache compressed block, in order to improve cache hit ratio of random read. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: clean up /sys/fs/f2fs/<disk>/featuresJaegeuk Kim2-64/+134
Let's create /sys/fs/f2fs/<disk>/feature_list/ to meet sysfs rule. Note that there are three feature list entries: 1) /sys/fs/f2fs/features : shows runtime features supported by in-kernel f2fs along with Kconfig. - ref. F2FS_FEATURE_RO_ATTR() 2) /sys/fs/f2fs/$s_id/features <deprecated> : shows on-disk features enabled by mkfs.f2fs, used for old kernels. This won't add new feature anymore, and thus, users should check entries in 3) instead of this 2). 3) /sys/fs/f2fs/$s_id/feature_list : shows on-disk features enabled by mkfs.f2fs per instance, which follows sysfs entry rule where each entry should expose single value. This list covers old feature list provided by 2) and beyond. Therefore, please add new on-disk feature in this list only. - ref. F2FS_SB_FEATURE_RO_ATTR() Reviewed-by: Chao Yu <yuchao0@huawei.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: add pin_file in feature listJaegeuk Kim1-0/+2
This patch adds missing pin_file feature supported by kernel. Fixes: f5a53edcf01e ("f2fs: support aligned pinned file") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: Advertise encrypted casefolding in sysfsDaniel Rosenberg1-0/+8
Older kernels don't support encryption with casefolding. This adds the sysfs entry encrypted_casefold to show support for those combined features. Support for this feature was originally added by commit 7ad08a58bf67 ("f2fs: Handle casefolding with Encryption") Fixes: 7ad08a58bf67 ("f2fs: Handle casefolding with Encryption") Cc: stable@vger.kernel.org # v5.11+ Signed-off-by: Daniel Rosenberg <drosen@google.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: Show casefolding support only when supportedDaniel Rosenberg1-0/+4
The casefolding feature is only supported when CONFIG_UNICODE is set. This modifies the feature list f2fs presents under sysfs accordingly. Fixes: 5aba54302a46 ("f2fs: include charset encoding information in the superblock") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Daniel Rosenberg <drosen@google.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: support RO featureJaegeuk Kim4-6/+46
Given RO feature in superblock, we don't need to check provisioning/reserve spaces and SSA area. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: logging neateningJoe Perches5-21/+17
Update the logging uses that have unnecessary newlines as the f2fs_printk function and so its f2fs_<level> macro callers already adds one. This allows searching single line logging entries with an easier grep and also avoids unnecessary blank lines in the logging. Miscellanea: o Coalesce formats o Align to open parenthesis Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: introduce FI_COMPRESS_RELEASED instead of using IMMUTABLE bitJaegeuk Kim3-7/+20
Once we release compressed blocks, we used to set IMMUTABLE bit. But it turned out it disallows every fs operations which we don't need for compression. Let's just prevent writing data only. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-23f2fs: compress: remove unneeded preallocationChao Yu2-28/+3
We will reserve iblocks for compression saved, so during compressed cluster overwrite, we don't need to preallocate blocks for later write. In addition, it adds a bug_on to detect wrong reserved iblock number in __f2fs_cluster_blocks(). Bug fix in the original patch by Jaegeuk: If we released compressed blocks having an immutable bit, we can see less number of compressed block addresses. Let's fix wrong BUG_ON. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-26f2fs: atgc: export entries for better tunability via sysfsChao Yu1-0/+27
This patch export below sysfs entries for better ATGC tunability. /sys/fs/f2fs/<disk>/atgc_candidate_ratio /sys/fs/f2fs/<disk>/atgc_candidate_count /sys/fs/f2fs/<disk>/atgc_age_weight /sys/fs/f2fs/<disk>/atgc_age_threshold Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-26f2fs: compress: fix to disallow temp extensionChao Yu1-4/+12
This patch restricts to configure compress extension as format of: [filename + '.' + extension] rather than: [filename + '.' + extension + (optional: '.' + temp extension)] in order to avoid to enable compression incorrectly: 1. compress_extension=so 2. touch file.soa 3. touch file.so.tmp Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-26f2fs: let's allow compression for mmap filesJaegeuk Kim1-2/+0
This patch allows to compress mmap files. E.g., for so files. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-26f2fs: add MODULE_SOFTDEP to ensure crc32 is included in the initramfsChao Yu1-0/+1
As marcosfrm reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213089 Initramfs generators rely on "pre" softdeps (and "depends") to include additional required modules. F2FS does not declare "pre: crc32" softdep. Then every generator (dracut, mkinitcpio...) has to maintain a hardcoded list for this purpose. Hence let's use MODULE_SOFTDEP("pre: crc32") in f2fs code. Fixes: 43b6573bac95 ("f2fs: use cryptoapi crc32 functions") Reported-by: marcosfrm <marcosfrm@gmail.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-26f2fs: return success if there is no work to doTom Rix1-1/+1
Static analysis reports this problem file.c:3206:2: warning: Undefined or garbage value returned to caller return err; ^~~~~~~~~~ err is only set if there is some work to do. Because the loop returns immediately on an error, if all the work was done, a 0 would be returned. Instead of checking the unlikely case that there was no work to do, change the return of err to 0. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: compress: clean up parameter of __f2fs_cluster_blocks()Chao Yu1-20/+13
Previously, in order to reuse __f2fs_cluster_blocks(), f2fs_is_compressed_cluster() assigned a compress_ctx type variable, which is used to pass few parameters (cc.inode, cc.cluster_size, cc.cluster_idx), it's wasteful to allocate such large space in stack. Let's clean up parameters of __f2fs_cluster_blocks() to avoid that. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: compress: remove unneeded f2fs_put_dnode()Chao Yu1-1/+0
If we don't initialize dn.inode_page for f2fs_get_block(), f2fs_get_block() will call f2fs_put_dnode() itself, so let's remove unneeded f2fs_put_dnode() in f2fs_vm_page_mkwrite(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: atgc: fix to set default age thresholdChao Yu1-0/+1
Default age threshold value is missed to set, fix it. Fixes: 093749e296e2 ("f2fs: support age threshold based garbage collection") Reported-by: Sahitya Tummala <stummala@codeaurora.org> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: Prevent swap file in LFS modeShin'ichiro Kawasaki1-0/+6
The kernel writes to swap files on f2fs directly without the assistance of the filesystem. This direct write by kernel can be non-sequential even when the f2fs is in LFS mode. Such non-sequential write conflicts with the LFS semantics. Especially when f2fs is set up on zoned block devices, the non-sequential write causes unaligned write command errors. To avoid the non-sequential writes to swap files, prevent swap file activation when the filesystem is in LFS mode. Fixes: 4969c06a0d83 ("f2fs: support swap file w/ DIO") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org # v5.10+ Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: fix to avoid racing on fsync_entry_slab by multi filesystem instancesChao Yu3-10/+23
As syzbot reported, there is an use-after-free issue during f2fs recovery: Use-after-free write at 0xffff88823bc16040 (in kfence-#10): kmem_cache_destroy+0x1f/0x120 mm/slab_common.c:486 f2fs_recover_fsync_data+0x75b0/0x8380 fs/f2fs/recovery.c:869 f2fs_fill_super+0x9393/0xa420 fs/f2fs/super.c:3945 mount_bdev+0x26c/0x3a0 fs/super.c:1367 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x86/0x270 fs/super.c:1497 do_new_mount fs/namespace.c:2905 [inline] path_mount+0x196f/0x2be0 fs/namespace.c:3235 do_mount fs/namespace.c:3248 [inline] __do_sys_mount fs/namespace.c:3456 [inline] __se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3433 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae The root cause is multi f2fs filesystem instances can race on accessing global fsync_entry_slab pointer, result in use-after-free issue of slab cache, fixes to init/destroy this slab cache only once during module init/destroy procedure to avoid this issue. Reported-by: syzbot+9d90dad32dd9727ed084@syzkaller.appspotmail.com Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: restructure f2fs page.private layoutChao Yu11-109/+146
Restruct f2fs page private layout for below reasons: There are some cases that f2fs wants to set a flag in a page to indicate a specified status of page: a) page is in transaction list for atomic write b) page contains dummy data for aligned write c) page is migrating for GC d) page contains inline data for inline inode flush e) page belongs to merkle tree, and is verified for fsverity f) page is dirty and has filesystem/inode reference count for writeback g) page is temporary and has decompress io context reference for compression There are existed places in page structure we can use to store f2fs private status/data: - page.flags: PG_checked, PG_private - page.private However it was a mess when we using them, which may cause potential confliction: page.private PG_private PG_checked page._refcount (+1 at most) a) -1 set +1 b) -2 set c), d), e) set f) 0 set +1 g) pointer set The other problem is page.flags has no free slot, if we can avoid set zero to page.private and set PG_private flag, then we use non-zero value to indicate PG_private status, so that we may have chance to reclaim PG_private slot for other usage. [1] The other concern is f2fs has bad scalability in aspect of indicating more page status. So in this patch, let's restructure f2fs' page.private as below to solve above issues: Layout A: lowest bit should be 1 | bit0 = 1 | bit1 | bit2 | ... | bit MAX | private data .... | bit 0 PAGE_PRIVATE_NOT_POINTER bit 1 PAGE_PRIVATE_ATOMIC_WRITE bit 2 PAGE_PRIVATE_DUMMY_WRITE bit 3 PAGE_PRIVATE_ONGOING_MIGRATION bit 4 PAGE_PRIVATE_INLINE_INODE bit 5 PAGE_PRIVATE_REF_RESOURCE bit 6- f2fs private data Layout B: lowest bit should be 0 page.private is a wrapped pointer. After the change: page.private PG_private PG_checked page._refcount (+1 at most) a) 11 set +1 b) 101 set +1 c) 1001 set +1 d) 10001 set +1 e) set f) 100001 set +1 g) pointer set +1 [1] https://lore.kernel.org/linux-f2fs-devel/20210422154705.GO3596236@casper.infradead.org/T/#u Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: add cp_error check in f2fs_write_compressed_pagesChao Yu1-0/+6
This patch adds cp_error check in f2fs_write_compressed_pages() like we did in f2fs_write_single_data_page() Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: compress: rename __cluster_may_compressChao Yu1-4/+4
This patch renames __cluster_may_compress() to cluster_has_invalid_data() for better readability. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-12f2fs: return EINVAL for hole cases in swap fileJaegeuk Kim1-2/+2
This tries to fix xfstests/generic/495. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: avoid swapon failure by giving a warning firstJaegeuk Kim1-6/+23
The final solution can be migrating blocks to form a section-aligned file internally. Meanwhile, let's ask users to do that when preparing the swap file initially like: 1) create() 2) ioctl(F2FS_IOC_SET_PIN_FILE) 3) fallocate() Reported-by: kernel test robot <oliver.sang@intel.com> Fixes: 36e4d95891ed ("f2fs: check if swapfile is section-alligned") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: compress: fix to assign cc.cluster_idx correctlyChao Yu3-12/+13
In f2fs_destroy_compress_ctx(), after f2fs_destroy_compress_ctx(), cc.cluster_idx will be cleared w/ NULL_CLUSTER, f2fs_cluster_blocks() may check wrong cluster metadata, fix it. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: compress: fix race condition of overwrite vs truncateChao Yu1-23/+12
pos_fsstress testcase complains a panic as belew: ------------[ cut here ]------------ kernel BUG at fs/f2fs/compress.c:1082! invalid opcode: 0000 [#1] SMP PTI CPU: 4 PID: 2753477 Comm: kworker/u16:2 Tainted: G OE 5.12.0-rc1-custom #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Workqueue: writeback wb_workfn (flush-252:16) RIP: 0010:prepare_compress_overwrite+0x4c0/0x760 [f2fs] Call Trace: f2fs_prepare_compress_overwrite+0x5f/0x80 [f2fs] f2fs_write_cache_pages+0x468/0x8a0 [f2fs] f2fs_write_data_pages+0x2a4/0x2f0 [f2fs] do_writepages+0x38/0xc0 __writeback_single_inode+0x44/0x2a0 writeback_sb_inodes+0x223/0x4d0 __writeback_inodes_wb+0x56/0xf0 wb_writeback+0x1dd/0x290 wb_workfn+0x309/0x500 process_one_work+0x220/0x3c0 worker_thread+0x53/0x420 kthread+0x12f/0x150 ret_from_fork+0x22/0x30 The root cause is truncate() may race with overwrite as below, so that one reference count left in page can not guarantee the page attaching in mapping tree all the time, after truncation, later find_lock_page() may return NULL pointer. - prepare_compress_overwrite - f2fs_pagecache_get_page - unlock_page - f2fs_setattr - truncate_setsize - truncate_inode_page - delete_from_page_cache - find_lock_page Fix this by avoiding referencing updated page. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: compress: fix to free compress page correctlyChao Yu1-1/+2
In error path of f2fs_write_compressed_pages(), it needs to call f2fs_compress_free_page() to release temporary page. Fixes: 5e6bbde95982 ("f2fs: introduce mempool for {,de}compress intermediate page allocation") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: support iflag change given the maskJaegeuk Kim1-1/+2
In f2fs_fileattr_set(), if (!fa->flags_valid) mask &= FS_COMMON_FL; In this case, we can set supported flags by mask only instead of BUG_ON. /* Flags shared betwen flags/xflags */ (FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | \ FS_NODUMP_FL | FS_NOATIME_FL | FS_DAX_FL | \ FS_PROJINHERIT_FL) Fixes: 9b1bb01c8ae7 ("f2fs: convert to fileattr") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: avoid null pointer access when handling IPU errorJaegeuk Kim1-2/+2
Unable to handle kernel NULL pointer dereference at virtual address 000000000000001a pc : f2fs_inplace_write_data+0x144/0x208 lr : f2fs_inplace_write_data+0x134/0x208 Call trace: f2fs_inplace_write_data+0x144/0x208 f2fs_do_write_data_page+0x270/0x770 f2fs_write_single_data_page+0x47c/0x830 __f2fs_write_data_pages+0x444/0x98c f2fs_write_data_pages.llvm.16514453770497736882+0x2c/0x38 do_writepages+0x58/0x118 __writeback_single_inode+0x44/0x300 writeback_sb_inodes+0x4b8/0x9c8 wb_writeback+0x148/0x42c wb_do_writeback+0xc8/0x390 wb_workfn+0xb0/0x2f4 process_one_work+0x1fc/0x444 worker_thread+0x268/0x4b4 kthread+0x13c/0x158 ret_from_fork+0x10/0x18 Fixes: 955772787667 ("f2fs: drop inplace IO if fs status is abnormal") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-04Merge tag 'f2fs-for-5.13-rc1' of ↵Linus Torvalds24-219/+615
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we added a new mount option, "checkpoint_merge", which introduces a kernel thread dealing with the f2fs checkpoints. Once we start to manage the IO priority along with blk-cgroup, the checkpoint operation can be processed in a lower priority under the process context. Since the checkpoint holds all the filesystem operations, we give a higher priority to the checkpoint thread all the time. Enhancements: - introduce gc_merge mount option to introduce a checkpoint thread - improve to run discard thread efficiently - allow modular compression algorithms - expose # of overprivision segments to sysfs - expose runtime compression stat to sysfs Bug fixes: - fix OOB memory access by the node id lookup - avoid touching checkpointed data in the checkpoint-disabled mode - fix the resizing flow to avoid kernel panic and race conditions - fix block allocation issues on pinned files - address some swapfile issues - fix hugtask problem and kernel panic during atomic write operations - don't start checkpoint thread in RO And, we've cleaned up some kernel coding style and build warnings. In addition, we fixed some minor race conditions and error handling routines" * tag 'f2fs-for-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (48 commits) f2fs: drop inplace IO if fs status is abnormal f2fs: compress: remove unneed check condition f2fs: clean up left deprecated IO trace codes f2fs: avoid using native allocate_segment_by_default() f2fs: remove unnecessary struct declaration f2fs: fix to avoid NULL pointer dereference f2fs: avoid duplicated codes for cleanup f2fs: document: add description about compressed space handling f2fs: clean up build warnings f2fs: fix the periodic wakeups of discard thread f2fs: fix to avoid accessing invalid fio in f2fs_allocate_data_block() f2fs: fix to avoid GC/mmap race with f2fs_truncate() f2fs: set checkpoint_merge by default f2fs: Fix a hungtask problem in atomic write f2fs: fix to restrict mount condition on readonly block device f2fs: introduce gc_merge mount option f2fs: fix to cover __allocate_new_section() with curseg_lock f2fs: fix wrong alloc_type in f2fs_do_replace_block f2fs: delete empty compress.h f2fs: fix a typo in inode.c ...
2021-05-02Merge branch 'work.misc' of ↵Linus Torvalds2-5/+2
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Assorted stuff all over the place" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: useful constants: struct qstr for ".." hostfs_open(): don't open-code file_dentry() whack-a-mole: kill strlen_user() (again) autofs: should_expire() argument is guaranteed to be positive apparmor:match_mn() - constify devpath argument buffer: a small optimization in grow_buffers get rid of autofs_getpath() constify dentry argument of dentry_path()/dentry_path_raw()
2021-04-27Merge tag 'netfs-lib-20210426' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull network filesystem helper library updates from David Howells: "Here's a set of patches for 5.13 to begin the process of overhauling the local caching API for network filesystems. This set consists of two parts: (1) Add a helper library to handle the new VM readahead interface. This is intended to be used unconditionally by the filesystem (whether or not caching is enabled) and provides a common framework for doing caching, transparent huge pages and, in the future, possibly fscrypt and read bandwidth maximisation. It also allows the netfs and the cache to align, expand and slice up a read request from the VM in various ways; the netfs need only provide a function to read a stretch of data to the pagecache and the helper takes care of the rest. (2) Add an alternative fscache/cachfiles I/O API that uses the kiocb facility to do async DIO to transfer data to/from the netfs's pages, rather than using readpage with wait queue snooping on one side and vfs_write() on the other. It also uses less memory, since it doesn't do buffered I/O on the backing file. Note that this uses SEEK_HOLE/SEEK_DATA to locate the data available to be read from the cache. Whilst this is an improvement from the bmap interface, it still has a problem with regard to a modern extent-based filesystem inserting or removing bridging blocks of zeros. Fixing that requires a much greater overhaul. This is a step towards overhauling the fscache API. The change is opt-in on the part of the network filesystem. A netfs should not try to mix the old and the new API because of conflicting ways of handling pages and the PG_fscache page flag and because it would be mixing DIO with buffered I/O. Further, the helper library can't be used with the old API. This does not change any of the fscache cookie handling APIs or the way invalidation is done at this time. In the near term, I intend to deprecate and remove the old I/O API (fscache_allocate_page{,s}(), fscache_read_or_alloc_page{,s}(), fscache_write_page() and fscache_uncache_page()) and eventually replace most of fscache/cachefiles with something simpler and easier to follow. This patchset contains the following parts: - Some helper patches, including provision of an ITER_XARRAY iov iterator and a function to do readahead expansion. - Patches to add the netfs helper library. - A patch to add the fscache/cachefiles kiocb API. - A pair of patches to fix some review issues in the ITER_XARRAY and read helpers as spotted by Al and Willy. Jeff Layton has patches to add support in Ceph for this that he intends for this merge window. I have a set of patches to support AFS that I will post a separate pull request for. With this, AFS without a cache passes all expected xfstests; with a cache, there's an extra failure, but that's also there before these patches. Fixing that probably requires a greater overhaul. Ceph also passes the expected tests. I also have patches in a separate branch to tidy up the handling of PG_fscache/PG_private_2 and their contribution to page refcounting in the core kernel here, but I haven't included them in this set and will route them separately" Link: https://lore.kernel.org/lkml/3779937.1619478404@warthog.procyon.org.uk/ * tag 'netfs-lib-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: netfs: Miscellaneous fixes iov_iter: Four fixes for ITER_XARRAY fscache, cachefiles: Add alternate API to use kiocb for read/write to cache netfs: Add a tracepoint to log failures that would be otherwise unseen netfs: Define an interface to talk to a cache netfs: Add write_begin helper netfs: Gather stats netfs: Add tracepoints netfs: Provide readahead and readpage netfs helpers netfs, mm: Add set/end/wait_on_page_fscache() aliases netfs, mm: Move PG_fscache helper funcs to linux/netfs.h netfs: Documentation for helper library netfs: Make a netfs helper module mm: Implement readahead_control pageset expansion mm/readahead: Handle ractl nr_pages being modified fs: Document file_ra_state mm/filemap: Pass the file_ra_state in the ractl mm: Add set/end/wait functions for PG_private_2 iov_iter: Add ITER_XARRAY
2021-04-26f2fs: drop inplace IO if fs status is abnormalChao Yu1-1/+16
If filesystem has cp_error or need_fsck status, let's drop inplace IO to avoid further corruption of fs data. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-04-26f2fs: compress: remove unneed check conditionChao Yu1-7/+1
In only call path of __cluster_may_compress(), __f2fs_write_data_pages() has checked SBI_POR_DOING condition, and also cluster_may_compress() has checked CP_ERROR_FLAG condition, so remove redundant check condition in __cluster_may_compress() for cleanup. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-04-24f2fs: clean up left deprecated IO trace codesChao Yu2-14/+0
Commit d5f7bc0064e0 ("f2fs: deprecate f2fs_trace_io") left some dead codes, delete them. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-04-23mm/filemap: Pass the file_ra_state in the ractlMatthew Wilcox (Oracle)2-2/+2
For readahead_expand(), we need to modify the file ra_state, so pass it down by adding it to the ractl. We have to do this because it's not always the same as f_ra in the struct file that is already being passed. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> Link: https://lore.kernel.org/r/20210407201857.3582797-2-willy@infradead.org/ Link: https://lore.kernel.org/r/161789067431.6155.8063840447229665720.stgit@warthog.procyon.org.uk/ # v6
2021-04-21f2fs: avoid using native allocate_segment_by_default()Chao Yu3-10/+12
As we did for other cases, in fix_curseg_write_pointer(), let's use wrapped f2fs_allocate_new_section() instead of native allocate_segment_by_default(), by this way, it fixes to cover segment allocation with curseg_lock and sentry_lock. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-04-19f2fs: remove unnecessary struct declarationWan Jiabing1-1/+0
struct dnode_of_data is defined at 897th line. The declaration here is unnecessary. Remove it. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-04-15useful constants: struct qstr for ".."Al Viro2-5/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>