summaryrefslogtreecommitdiffstats
path: root/fs/f2fs/f2fs.h
AgeCommit message (Collapse)AuthorFilesLines
2022-12-12f2fs: add block_age-based extent cacheJaegeuk Kim1-0/+38
This patch introduces a runtime hot/cold data separation method for f2fs, in order to improve the accuracy for data temperature classification, reduce the garbage collection overhead after long-term data updates. Enhanced hot/cold data separation can record data block update frequency as "age" of the extent per inode, and take use of the age info to indicate better temperature type for data block allocation: - It records total data blocks allocated since mount; - When file extent has been updated, it calculate the count of data blocks allocated since last update as the age of the extent; - Before the data block allocated, it searches for the age info and chooses the suitable segment for allocation. Test and result: - Prepare: create about 30000 files * 3% for cold files (with cold file extension like .apk, from 3M to 10M) * 50% for warm files (with random file extension like .FcDxq, from 1K to 4M) * 47% for hot files (with hot file extension like .db, from 1K to 256K) - create(5%)/random update(90%)/delete(5%) the files * total write amount is about 70G * fsync will be called for .db files, and buffered write will be used for other files The storage of test device is large enough(128G) so that it will not switch to SSR mode during the test. Benefit: dirty segment count increment reduce about 14% - before: Dirty +21110 - after: Dirty +18286 Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com> Signed-off-by: xiongping1 <xiongping1@xiaomi.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12f2fs: allocate the extent_cache by defaultJaegeuk Kim1-1/+2
Let's allocate it to remove the runtime complexity. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12f2fs: refactor extent_cache to support for read and moreJaegeuk Kim1-42/+77
This patch prepares extent_cache to be ready for addition. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12f2fs: move internal functions into extent_cache.cJaegeuk Kim1-67/+2
No functional change. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12f2fs: specify extent cache for read explicitlyJaegeuk Kim1-5/+5
Let's descrbie it's read extent cache. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12f2fs: introduce f2fs_is_readonly() for readabilityYangtao Li1-0/+5
Introduce f2fs_is_readonly() and use it to simplify code. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12f2fs: remove F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() macroYangtao Li1-5/+1
F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() have never been used since they were introduced by this commit 76f105a2dbcd("f2fs: add feature facility in superblock"). So let's remove them. BTW, convert f2fs_sb_has_##name to return bool. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-28f2fs: introduce discard_urgent_util sysfs nodeYangtao Li1-0/+1
Through this node, you can control the background discard to run more aggressively or not aggressively when reach the utilization rate of the space. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-28f2fs: define MIN_DISCARD_GRANULARITY macroYangtao Li1-0/+2
Do cleanup in f2fs_tuning_parameters() and __init_discard_policy(), let's use macro instead of number. Suggested-by: Chao Yu <chao@kernel.org> Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-28f2fs: fix to enable compress for newly created file if extension matchesSheng Yong1-1/+1
If compress_extension is set, and a newly created file matches the extension, the file could be marked as compression file. However, if inline_data is also enabled, there is no chance to check its extension since f2fs_should_compress() always returns false. This patch moves set_compress_inode(), which do extension check, in f2fs_should_compress() to check extensions before setting inline data flag. Fixes: 7165841d578e ("f2fs: fix to check inline_data during compressed inode conversion") Signed-off-by: Sheng Yong <shengyong@oppo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-28f2fs: change type for 'sbi->readdir_ra'Yuwei Guan1-1/+1
Before this patch, the varibale 'readdir_ra' takes effect if it's equal to '1' or not, so we can change type for it from 'int' to 'bool'. Signed-off-by: Yuwei Guan <Yuwei.Guan@zeekrlife.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-28f2fs: introduce F2FS_IOC_START_ATOMIC_REPLACEDaeho Jeong1-0/+1
introduce a new ioctl to replace the whole content of a file atomically, which means it induces truncate and content update at the same time. We can start it with F2FS_IOC_START_ATOMIC_REPLACE and complete it with F2FS_IOC_COMMIT_ATOMIC_WRITE. Or abort it with F2FS_IOC_ABORT_ATOMIC_WRITE. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-11f2fs: optimize iteration over sparse directoriesChao Yu1-2/+3
Wei Chen reports a kernel bug as blew: INFO: task syz-executor.0:29056 blocked for more than 143 seconds. Not tainted 5.15.0-rc5 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor.0 state:D stack:14632 pid:29056 ppid: 6574 flags:0x00000004 Call Trace: __schedule+0x4a1/0x1720 schedule+0x36/0xe0 rwsem_down_write_slowpath+0x322/0x7a0 fscrypt_ioctl_set_policy+0x11f/0x2a0 __f2fs_ioctl+0x1a9f/0x5780 f2fs_ioctl+0x89/0x3a0 __x64_sys_ioctl+0xe8/0x140 do_syscall_64+0x34/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Eric did some investigation on this issue, quoted from reply of Eric: "Well, the quality of this bug report has a lot to be desired (not on upstream kernel, reproducer is full of totally irrelevant stuff, not sent to the mailing list of the filesystem whose disk image is being fuzzed, etc.). But what is going on is that f2fs_empty_dir() doesn't consider the case of a directory with an extremely large i_size on a malicious disk image. Specifically, the reproducer mounts an f2fs image with a directory that has an i_size of 14814520042850357248, then calls FS_IOC_SET_ENCRYPTION_POLICY on it. That results in a call to f2fs_empty_dir() to check whether the directory is empty. f2fs_empty_dir() then iterates through all 3616826182336513 blocks the directory allegedly contains to check whether any contain anything. i_rwsem is held during this, so anything else that tries to take it will hang." In order to solve this issue, let's use f2fs_get_next_page_offset() to speed up iteration by skipping holes for all below functions: - f2fs_empty_dir - f2fs_readdir - find_in_level The way why we can speed up iteration was described in 'commit 3cf4574705b4 ("f2fs: introduce get_next_page_offset to speed up SEEK_DATA")'. Meanwhile, in f2fs_empty_dir(), let's use f2fs_find_data_page() instead f2fs_get_lock_data_page(), due to i_rwsem was held in caller of f2fs_empty_dir(), there shouldn't be any races, so it's fine to not lock dentry page during lookuping dirents in the page. Link: https://lore.kernel.org/lkml/536944df-a0ae-1dd8-148f-510b476e1347@kernel.org/T/ Reported-by: Wei Chen <harperchen1110@gmail.com> Cc: Eric Biggers <ebiggers@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-11f2fs: correct i_size change for atomic writesDaeho Jeong1-0/+8
We need to make sure i_size doesn't change until atomic write commit is successful and restore it when commit is failed. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01f2fs: replace gc_urgent_high_remaining with gc_remaining_trialsYangtao Li1-2/+3
The user can set the trial count limit for GC urgent and idle mode with replaced gc_remaining_trials.. If GC thread gets to the limit, the mode will turn back to GC normal mode finally. It was applied only to GC_URGENT, while this patch expands it for GC_IDLE. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01f2fs: introduce gc_mode sysfs nodeYangtao Li1-0/+1
Revert "f2fs: make gc_urgent and gc_segment_mode sysfs node readable". Add a gc_mode sysfs node to show the current gc_mode as a string. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01f2fs: introduce max_ordered_discard sysfs nodeYangtao Li1-0/+3
The current max_ordered_discard is a fixed value, change it to be configurable through the sys node. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01f2fs: remove batched_trim_sections nodeYangtao Li1-3/+0
commit 377224c47118("f2fs: don't split checkpoint in fstrim") obsolete batch mode and related sysfs entry. Since this testing sysfs node has been deprecated for a long time, let's remove it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01f2fs: support fault injection for f2fs_is_valid_blkaddr()Chao Yu1-0/+1
This patch supports to inject fault into f2fs_is_valid_blkaddr() to simulate accessing inconsistent data/meta block addressses from caller. Usage: a) echo 262144 > /sys/fs/f2fs/<dev>/inject_type or b) mount -o fault_type=262144 <dev> <mountpoint> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-10Merge tag 'f2fs-for-6.1-rc1' of ↵Linus Torvalds1-9/+39
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This round looks fairly small comparing to the previous updates and includes mostly minor bug fixes. Nevertheless, as we've still interested in improving the stability, Chao added some debugging methods to diagnoze subtle runtime inconsistency problem. Enhancements: - store all the corruption or failure reasons in superblock - detect meta inode, summary info, and block address inconsistency - increase the limit for reserve_root for low-end devices - add the number of compressed IO in iostat Bug fixes: - DIO write fix for zoned devices - do out-of-place writes for cold files - fix some stat updates (FS_CP_DATA_IO, dirty page count) - fix race condition on setting FI_NO_EXTENT flag - fix data races when freezing super - fix wrong continue condition check in GC - do not allow ATGC for LFS mode In addition, there're some code enhancement and clean-ups as usual" * tag 'f2fs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits) f2fs: change to use atomic_t type form sbi.atomic_files f2fs: account swapfile inodes f2fs: allow direct read for zoned device f2fs: support recording errors into superblock f2fs: support recording stop_checkpoint reason into super_block f2fs: remove the unnecessary check in f2fs_xattr_fiemap f2fs: introduce cp_status sysfs entry f2fs: fix to detect corrupted meta ino f2fs: fix to account FS_CP_DATA_IO correctly f2fs: code clean and fix a type error f2fs: add "c_len" into trace_f2fs_update_extent_tree_range for compressed file f2fs: fix to do sanity check on summary info f2fs: port to vfs{g,u}id_t and associated helpers f2fs: fix to do sanity check on destination blkaddr during recovery f2fs: let FI_OPU_WRITE override FADVISE_COLD_BIT f2fs: fix race condition on setting FI_NO_EXTENT flag f2fs: remove redundant check in f2fs_sanity_check_cluster f2fs: add static init_idisk_time function to reduce the code f2fs: fix typo f2fs: fix wrong dirty page count when race between mmap and fallocate. ...
2022-10-07f2fs: change to use atomic_t type form sbi.atomic_filesChao Yu1-3/+8
inode_lock[ATOMIC_FILE] was used for protecting sbi->atomic_files, update atomic_files variable's type to atomic_t instead of unsigned int, then inode_lock[ATOMIC_FILE] can be obsoleted. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-07f2fs: account swapfile inodesChao Yu1-1/+8
In order to check count of opened swapfile inodes. Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-05f2fs: allow direct read for zoned deviceJaegeuk Kim1-1/+6
This reverts dbf8e63f48af ("f2fs: remove device type check for direct IO"), and apply the below first version, since it contributed out-of-order DIO writes. For zoned devices, f2fs forbids direct IO and forces buffered IO to serialize write IOs. However, the constraint does not apply to read IOs. Cc: stable@vger.kernel.org Fixes: dbf8e63f48af ("f2fs: remove device type check for direct IO") Signed-off-by: Eunhee Rho <eunhee83.rho@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-04f2fs: support recording errors into superblockChao Yu1-0/+5
This patch supports to record detail reason of FSCORRUPTED error into f2fs_super_block.s_errors[]. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-04f2fs: support recording stop_checkpoint reason into super_blockChao Yu1-1/+3
This patch supports to record stop_checkpoint error into f2fs_super_block.s_stop_reason[]. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-04f2fs: fix to account FS_CP_DATA_IO correctlyChao Yu1-1/+3
f2fs_inode_info.cp_task was introduced for FS_CP_DATA_IO accounting since commit b0af6d491a6b ("f2fs: add app/fs io stat"). However, cp_task usage coverage has been increased due to below commits: commit 040d2bb318d1 ("f2fs: fix to avoid deadloop if data_flush is on") commit 186857c5a14a ("f2fs: fix potential recursive call when enabling data_flush") So that, if data_flush mountoption is on, when data flush was triggered from background, the IO from data flush will be accounted as checkpoint IO type incorrectly. In order to fix this issue, this patch splits cp_task into two: a) cp_task: used for IO accounting b) wb_task: used to avoid deadlock Fixes: 040d2bb318d1 ("f2fs: fix to avoid deadloop if data_flush is on") Fixes: 186857c5a14a ("f2fs: fix potential recursive call when enabling data_flush") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-04f2fs: fix to do sanity check on destination blkaddr during recoveryChao Yu1-0/+4
As Wenqing Liu reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216456 loop5: detected capacity change from 0 to 131072 F2FS-fs (loop5): recover_inode: ino = 6, name = hln, inline = 1 F2FS-fs (loop5): recover_data: ino = 6 (i_size: recover) err = 0 F2FS-fs (loop5): recover_inode: ino = 6, name = hln, inline = 1 F2FS-fs (loop5): recover_data: ino = 6 (i_size: recover) err = 0 F2FS-fs (loop5): recover_inode: ino = 6, name = hln, inline = 1 F2FS-fs (loop5): recover_data: ino = 6 (i_size: recover) err = 0 F2FS-fs (loop5): Bitmap was wrongly set, blk:5634 ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1013 at fs/f2fs/segment.c:2198 RIP: 0010:update_sit_entry+0xa55/0x10b0 [f2fs] Call Trace: <TASK> f2fs_do_replace_block+0xa98/0x1890 [f2fs] f2fs_replace_block+0xeb/0x180 [f2fs] recover_data+0x1a69/0x6ae0 [f2fs] f2fs_recover_fsync_data+0x120d/0x1fc0 [f2fs] f2fs_fill_super+0x4665/0x61e0 [f2fs] mount_bdev+0x2cf/0x3b0 legacy_get_tree+0xed/0x1d0 vfs_get_tree+0x81/0x2b0 path_mount+0x47e/0x19d0 do_mount+0xce/0xf0 __x64_sys_mount+0x12c/0x1a0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd If we enable CONFIG_F2FS_CHECK_FS config, it will trigger a kernel panic instead of warning. The root cause is: in fuzzed image, SIT table is inconsistent with inode mapping table, result in triggering such warning during SIT table update. This patch introduces a new flag DATA_GENERIC_ENHANCE_UPDATE, w/ this flag, data block recovery flow can check destination blkaddr's validation in SIT table, and skip f2fs_replace_block() to avoid inconsistent status. Cc: stable@vger.kernel.org Reported-by: Wenqing Liu <wenqingliu0120@gmail.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-04f2fs: fix typoYonggil Song1-1/+1
Fix typo in f2fs.h Detected by Jaeyoon Choi Signed-off-by: Yonggil Song <yonggil.song@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-04f2fs: return the tmp_ptr directly in __bitmap_ptrZhang Qilong1-1/+1
Just return tmp_ptr here, it's no need to dereference checkpoint pointer again. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-09-12f2fs: flush pending checkpoints when freezing superJaegeuk Kim1-0/+1
This avoids -EINVAL when trying to freeze f2fs. Cc: stable@vger.kernel.org Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-09-11f2fs: move f2fs_force_buffered_io() into file.cEric Biggers1-40/+0
f2fs_force_buffered_io() is only used in file.c, so move it into there. No behavior change. This makes it easier to review later patches. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Link: https://lore.kernel.org/r/20220827065851.135710-6-ebiggers@kernel.org
2022-09-11fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGNEric Biggers1-1/+1
To prepare for STATX_DIOALIGN support, make two changes to fscrypt_dio_supported(). First, remove the filesystem-block-alignment check and make the filesystems handle it instead. It previously made sense to have it in fs/crypto/; however, to support STATX_DIOALIGN the alignment restriction would have to be returned to filesystems. It ends up being simpler if filesystems handle this part themselves, especially for f2fs which only allows fs-block-aligned DIO in the first place. Second, make fscrypt_dio_supported() work on inodes whose encryption key hasn't been set up yet, by making it set up the key if needed. This is required for statx(), since statx() doesn't require a file descriptor. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220827065851.135710-4-ebiggers@kernel.org
2022-08-29f2fs: remove gc_urgent_high_limited for cleanupChao Yu1-1/+0
Remove redundant sbi->gc_urgent_high_limited. Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-08-29f2fs: iostat: support accounting compressed IOChao Yu1-0/+5
Previously, we supported to account FS_CDATA_READ_IO type IO only, in this patch, it adds to account more type IO for compressed file: - APP_BUFFERED_CDATA_IO - APP_MAPPED_CDATA_IO - FS_CDATA_IO - APP_BUFFERED_CDATA_READ_IO - APP_MAPPED_CDATA_READ_IO Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-08-08Merge tag 'f2fs-for-6.0' of ↵Linus Torvalds1-29/+73
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this cycle, we mainly fixed some corner cases that manipulate a per-file compression flag inappropriately. And, we found f2fs counted valid blocks in a section incorrectly when zone capacity is set, and thus, fixed it with additional sysfs entry to check it easily. Lastly, this series includes several patches with respect to the new atomic write support such as a couple of bug fixes and re-adding atomic_write_abort support that we removed by mistake in the previous release. Enhancements: - add sysfs entries to understand atomic write operations and zone capacity - introduce memory mode to get a hint for low-memory devices - adjust the waiting time of foreground GC - decompress clusters under softirq to avoid non-deterministic latency - do not skip updating inode when retrying to flush node page - enforce single zone capacity Bug fixes: - set the compression/no-compression flags correctly - revive F2FS_IOC_ABORT_VOLATILE_WRITE - check inline_data during compressed inode conversion - understand zone capacity when calculating valid block count As usual, the series includes several minor clean-ups and sanity checks" * tag 'f2fs-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits) f2fs: use onstack pages instead of pvec f2fs: intorduce f2fs_all_cluster_page_ready f2fs: clean up f2fs_abort_atomic_write() f2fs: handle decompress only post processing in softirq f2fs: do not allow to decompress files have FI_COMPRESS_RELEASED f2fs: do not set compression bit if kernel doesn't support f2fs: remove device type check for direct IO f2fs: fix null-ptr-deref in f2fs_get_dnode_of_data f2fs: revive F2FS_IOC_ABORT_VOLATILE_WRITE f2fs: fix to do sanity check on segment type in build_sit_entries() f2fs: obsolete unused MAX_DISCARD_BLOCKS f2fs: fix to avoid use f2fs_bug_on() in f2fs_new_node_page() f2fs: fix to remove F2FS_COMPR_FL and tag F2FS_NOCOMP_FL at the same time f2fs: introduce sysfs atomic write statistics f2fs: don't bother wait_ms by foreground gc f2fs: invalidate meta pages only for post_read required inode f2fs: allow compression of files without blocks f2fs: fix to check inline_data during compressed inode conversion f2fs: Delete f2fs_copy_page() and replace with memcpy_page() f2fs: fix to invalidate META_MAPPING before DIO write ...
2022-08-05f2fs: use onstack pages instead of pvecFengnan Chang1-1/+3
Since pvec have 15 pages, it not a multiple of 4, when write compressed pages, write in 64K as a unit, it will call pagevec_lookup_range_tag agagin, sometimes this will take a lot of time. Use onstack pages instead of pvec to mitigate this problem. Signed-off-by: Fengnan Chang <fengnanchang@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-08-05f2fs: intorduce f2fs_all_cluster_page_readyFengnan Chang1-2/+2
When write total cluster, all pages is uptodate, there is not need to call f2fs_prepare_compress_overwrite, intorduce f2fs_all_cluster_page_ready to avoid this. Signed-off-by: Fengnan Chang <changfengnan@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-08-05f2fs: handle decompress only post processing in softirqDaeho Jeong1-7/+10
Now decompression is being handled in workqueue and it makes read I/O latency non-deterministic, because of the non-deterministic scheduling nature of workqueues. So, I made it handled in softirq context only if possible, not in low memory devices, since this modification will maintain decompresion related memory a little longer. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-08-05f2fs: do not set compression bit if kernel doesn't supportJaegeuk Kim1-1/+6
If kernel doesn't have CONFIG_F2FS_FS_COMPRESSION, a file having FS_COMPR_FL via ioctl(FS_IOC_SETFLAGS) is unaccessible due to f2fs_is_compress_backend_ready(). Let's avoid it. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-08-05f2fs: remove device type check for direct IOEunhee Rho1-6/+1
To ensure serialized IOs, f2fs allows only LFS mode for zoned device. Remove redundant check for direct IO. Signed-off-by: Eunhee Rho <eunhee83.rho@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-08-05f2fs: fix null-ptr-deref in f2fs_get_dnode_of_dataYe Bin1-0/+6
There is issue as follows when test f2fs atomic write: F2FS-fs (loop0): Can't find valid F2FS filesystem in 2th superblock F2FS-fs (loop0): invalid crc_offset: 0 F2FS-fs (loop0): f2fs_check_nid_range: out-of-range nid=1, run fsck to fix. F2FS-fs (loop0): f2fs_check_nid_range: out-of-range nid=2, run fsck to fix. ================================================================== BUG: KASAN: null-ptr-deref in f2fs_get_dnode_of_data+0xac/0x16d0 Read of size 8 at addr 0000000000000028 by task rep/1990 CPU: 4 PID: 1990 Comm: rep Not tainted 5.19.0-rc6-next-20220715 #266 Call Trace: <TASK> dump_stack_lvl+0x6e/0x91 print_report.cold+0x49a/0x6bb kasan_report+0xa8/0x130 f2fs_get_dnode_of_data+0xac/0x16d0 f2fs_do_write_data_page+0x2a5/0x1030 move_data_page+0x3c5/0xdf0 do_garbage_collect+0x2015/0x36c0 f2fs_gc+0x554/0x1d30 f2fs_balance_fs+0x7f5/0xda0 f2fs_write_single_data_page+0xb66/0xdc0 f2fs_write_cache_pages+0x716/0x1420 f2fs_write_data_pages+0x84f/0x9a0 do_writepages+0x130/0x3a0 filemap_fdatawrite_wbc+0x87/0xa0 file_write_and_wait_range+0x157/0x1c0 f2fs_do_sync_file+0x206/0x12d0 f2fs_sync_file+0x99/0xc0 vfs_fsync_range+0x75/0x140 f2fs_file_write_iter+0xd7b/0x1850 vfs_write+0x645/0x780 ksys_write+0xf1/0x1e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd As 3db1de0e582c commit changed atomic write way which new a cow_inode for atomic write file, and also mark cow_inode as FI_ATOMIC_FILE. When f2fs_do_write_data_page write cow_inode will use cow_inode's cow_inode which is NULL. Then will trigger null-ptr-deref. To solve above issue, introduce FI_COW_FILE flag for COW inode. Fiexes: 3db1de0e582c("f2fs: change the current atomic write way") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-08-03Merge tag 'folio-6.0' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds1-4/+0
Pull folio updates from Matthew Wilcox: - Fix an accounting bug that made NR_FILE_DIRTY grow without limit when running xfstests - Convert more of mpage to use folios - Remove add_to_page_cache() and add_to_page_cache_locked() - Convert find_get_pages_range() to filemap_get_folios() - Improvements to the read_cache_page() family of functions - Remove a few unnecessary checks of PageError - Some straightforward filesystem conversions to use folios - Split PageMovable users out from address_space_operations into their own movable_operations - Convert aops->migratepage to aops->migrate_folio - Remove nobh support (Christoph Hellwig) * tag 'folio-6.0' of git://git.infradead.org/users/willy/pagecache: (78 commits) fs: remove the NULL get_block case in mpage_writepages fs: don't call ->writepage from __mpage_writepage fs: remove the nobh helpers jfs: stop using the nobh helper ext2: remove nobh support ntfs3: refactor ntfs_writepages mm/folio-compat: Remove migration compatibility functions fs: Remove aops->migratepage() secretmem: Convert to migrate_folio hugetlb: Convert to migrate_folio aio: Convert to migrate_folio f2fs: Convert to filemap_migrate_folio() ubifs: Convert to filemap_migrate_folio() btrfs: Convert btrfs_migratepage to migrate_folio mm/migrate: Add filemap_migrate_folio() mm/migrate: Convert migrate_page() to migrate_folio() nfs: Convert to migrate_folio btrfs: Convert btree_migratepage to migrate_folio mm/migrate: Convert expected_page_refs() to folio_expected_refs() mm/migrate: Convert buffer_migrate_page() to buffer_migrate_folio() ...
2022-08-02f2fs: Convert to filemap_migrate_folio()Matthew Wilcox (Oracle)1-4/+0
filemap_migrate_folio() fits f2fs's needs perfectly. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Chao Yu <chao@kernel.org>
2022-07-30f2fs: obsolete unused MAX_DISCARD_BLOCKSChao Yu1-1/+0
After commit a7eeb823854c ("f2fs: use bitmap in discard_entry"), MAX_DISCARD_BLOCKS became obsolete, remove it. Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: introduce sysfs atomic write statisticsDaeho Jeong1-0/+30
introduce the below 4 new sysfs node for atomic write statistics. - current_atomic_write: the total current atomic write block count, which is not committed yet. - peak_atomic_write: the peak value of total current atomic write block count after boot. - committed_atomic_block: the accumulated total committed atomic write block count after boot. - revoked_atomic_block: the accumulated total revoked atomic write block count after boot. Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: invalidate meta pages only for post_read required inodeChao Yu1-0/+1
After commit e3b49ea36802 ("f2fs: invalidate META_MAPPING before IPU/DIO write"), invalidate_mapping_pages() will be called to avoid race condition in between IPU/DIO and readahead for GC. However, readahead flow is only used for post_read required inode, so this patch adds check condition to avoids unnecessary page cache invalidating for non-post_read inode. Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: fix to check inline_data during compressed inode conversionChao Yu1-1/+1
When converting inode to compressed one via ioctl, it needs to check inline_data, since inline_data flag and compressed flag are incompatible. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: Delete f2fs_copy_page() and replace with memcpy_page()Fabio M. De Francesco1-10/+0
f2fs_copy_page() is a wrapper around two kmap() + one memcpy() from/to the mapped pages. It unnecessarily duplicates a kernel API and it makes use of kmap(), which is being deprecated in favor of kmap_local_page(). Two main problems with kmap(): (1) It comes with an overhead as mapping space is restricted and protected by a global lock for synchronization and (2) it also requires global TLB invalidation when the kmap’s pool wraps and it might block when the mapping space is fully utilized until a slot becomes available. With kmap_local_page() the mappings are per thread, CPU local, can take page faults, and can be called from any context (including interrupts). It is faster than kmap() in kernels with HIGHMEM enabled. Therefore, its use in __clone_blkaddrs() is safe and should be preferred. Delete f2fs_copy_page() and use a plain memcpy_page() in the only one site calling the removed function. memcpy_page() avoids open coding two kmap_local_page() + one memcpy() between the two kernel virtual addresses. Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: enforce single zone capacityJaegeuk Kim1-1/+1
In order to simplify the complicated per-zone capacity, let's support only one capacity for entire zoned device. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: introduce memory modeDaeho Jeong1-0/+13
Introduce memory mode to supports "normal" and "low" memory modes. "low" mode is to support low memory devices. Because of the nature of low memory devices, in this mode, f2fs will try to save memory sometimes by sacrificing performance. "normal" mode is the default mode and same as before. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>