summaryrefslogtreecommitdiffstats
path: root/fs/ext4
AgeCommit message (Collapse)AuthorFilesLines
2020-06-03ext4: fix EXT_MAX_EXTENT/INDEX to check for zeroed eh_maxHarshad Shirwadkar1-3/+6
If eh->eh_max is 0, EXT_MAX_EXTENT/INDEX would evaluate to unsigned (-1) resulting in illegal memory accesses. Although there is no consistent repro, we see that generic/019 sometimes crashes because of this bug. Ran gce-xfstests smoke and verified that there were no regressions. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20200421023959.20879-2-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2020-06-03ext4: remove unnecessary comparisons to boolJason Yan2-2/+2
Fix the following coccicheck warning: fs/ext4/extents_status.c:1057:5-28: WARNING: Comparison to bool fs/ext4/inode.c:2314:18-24: WARNING: Comparison to bool Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/20200420042918.19459-1-yanaijie@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03ext4: remove EXT4_GET_BLOCKS_KEEP_SIZE flagEric Whitney3-14/+4
The eofblocks code was removed in the 5.7 release by "ext4: remove EOFBLOCKS_FL and associated code" (4337ecd1fe99). The ext4_map_blocks() flag used to trigger it can now be removed as well. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200415203140.30349-2-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03ext4: fix a style issue in fs/ext4/acl.cCarlos Guerrero Álvarez1-2/+1
Fixed an if statement where braces were not needed. Link: https://lore.kernel.org/r/20200416141456.1089-1-carlosteniswarrior@gmail.com Signed-off-by: Carlos Guerrero Álvarez <carlosteniswarrior@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
2020-06-02Merge tag 'for-5.8/block-2020-06-01' of git://git.kernel.dk/linux-blockLinus Torvalds3-3/+3
Pull block updates from Jens Axboe: "Core block changes that have been queued up for this release: - Remove dead blk-throttle and blk-wbt code (Guoqing) - Include pid in blktrace note traces (Jan) - Don't spew I/O errors on wouldblock termination (me) - Zone append addition (Johannes, Keith, Damien) - IO accounting improvements (Konstantin, Christoph) - blk-mq hardware map update improvements (Ming) - Scheduler dispatch improvement (Salman) - Inline block encryption support (Satya) - Request map fixes and improvements (Weiping) - blk-iocost tweaks (Tejun) - Fix for timeout failing with error injection (Keith) - Queue re-run fixes (Douglas) - CPU hotplug improvements (Christoph) - Queue entry/exit improvements (Christoph) - Move DMA drain handling to the few drivers that use it (Christoph) - Partition handling cleanups (Christoph)" * tag 'for-5.8/block-2020-06-01' of git://git.kernel.dk/linux-block: (127 commits) block: mark bio_wouldblock_error() bio with BIO_QUIET blk-wbt: rename __wbt_update_limits to wbt_update_limits blk-wbt: remove wbt_update_limits blk-throttle: remove tg_drain_bios blk-throttle: remove blk_throtl_drain null_blk: force complete for timeout request blk-mq: drain I/O when all CPUs in a hctx are offline blk-mq: add blk_mq_all_tag_iter blk-mq: open code __blk_mq_alloc_request in blk_mq_alloc_request_hctx blk-mq: use BLK_MQ_NO_TAG in more places blk-mq: rename BLK_MQ_TAG_FAIL to BLK_MQ_NO_TAG blk-mq: move more request initialization to blk_mq_rq_ctx_init blk-mq: simplify the blk_mq_get_request calling convention blk-mq: remove the bio argument to ->prepare_request nvme: force complete cancelled requests blk-mq: blk-mq: provide forced completion method block: fix a warning when blkdev.h is included for !CONFIG_BLOCK builds block: blk-crypto-fallback: remove redundant initialization of variable err block: reduce part_stat_lock() scope block: use __this_cpu_add() instead of access by smp_processor_id() ...
2020-06-02ext4: pass the inode to ext4_mpage_readpagesMatthew Wilcox (Oracle)3-5/+4
This function now only uses the mapping argument to look up the inode, and both callers already have the inode, so just pass the inode instead of the mapping. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-22-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02ext4: convert from readpages to readaheadMatthew Wilcox (Oracle)3-28/+18
Use the new readahead operation in ext4 Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-21-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02mm: add page_cache_readahead_unboundedMatthew Wilcox (Oracle)1-33/+2
ext4 and f2fs have duplicated the guts of the readahead code so they can read past i_size. Instead, separate out the guts of the readahead code so they can call it directly. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-14-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-01fs: ext4: default KUNIT_* fragments to KUNIT_ALL_TESTSAnders Roxell1-1/+2
This makes it easier to enable all KUnit fragments. Adding 'if !KUNIT_ALL_TESTS' so individual tests can not be turned off. Therefore if KUNIT_ALL_TESTS is enabled that will hide the prompt in menuconfig. Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-06-01Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds3-17/+60
Pull fscrypt updates from Eric Biggers: - Add the IV_INO_LBLK_32 encryption policy flag which modifies the encryption to be optimized for eMMC inline encryption hardware. - Make the test_dummy_encryption mount option for ext4 and f2fs support v2 encryption policies. - Fix kerneldoc warnings and some coding style inconsistencies. * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: add support for IV_INO_LBLK_32 policies fscrypt: make test_dummy_encryption use v2 by default fscrypt: support test_dummy_encryption=v2 fscrypt: add fscrypt_add_test_dummy_key() linux/parser.h: add include guards fscrypt: remove unnecessary extern keywords fscrypt: name all function parameters fscrypt: fix all kerneldoc warnings
2020-05-31vfs, afs, ext4: Make the inode hash table RCU searchableDavid Howells1-22/+22
Make the inode hash table RCU searchable so that searches that want to access or modify an inode without taking a ref on that inode can do so without taking the inode hash table lock. The main thing this requires is some RCU annotation on the list manipulation operations. Inodes are already freed by RCU in most cases. Users of this interface must take care as the inode may be still under construction or may be being torn down around them. There are at least three instances where this can be of use: (1) Testing whether the inode number iunique() is going to return is currently unique (the iunique_lock is still held). (2) Ext4 date stamp updating. (3) AFS callback breaking. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> cc: linux-ext4@vger.kernel.org cc: linux-afs@lists.infradead.org
2020-05-22block: remove the error_sector argument to blkdev_issue_flushChristoph Hellwig3-3/+3
The argument isn't used by any caller, and drivers don't fill out bi_sector for flush requests either. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-19ext4: fix fiemap size checks for bitmap filesChristoph Hellwig2-31/+33
Add an extra validation of the len parameter, as for ext4 some files might have smaller file size limits than others. This also means the redundant size check in ext4_ioctl_get_es_cache can go away, as all size checking is done in the shared fiemap handler. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200505154324.3226743-3-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-05-19ext4: fix EXT4_MAX_LOGICAL_BLOCK macroRitesh Harjani1-1/+1
ext4 supports max number of logical blocks in a file to be 0xffffffff. (This is since ext4_extent's ee_block is __le32). This means that EXT4_MAX_LOGICAL_BLOCK should be 0xfffffffe (starting from 0 logical offset). This patch fixes this. The issue was seen when ext4 moved to iomap_fiemap API and when overlayfs was mounted on top of ext4. Since overlayfs was missing filemap_check_ranges(), so it could pass a arbitrary huge length which lead to overflow of map.m_len logic. This patch fixes that. Fixes: d3b6f23f7167 ("ext4: move ext4_fiemap to use iomap framework") Reported-by: syzbot+77fa5bdb65cc39711820@syzkaller.appspotmail.com Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20200505154324.3226743-2-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-05-18fscrypt: support test_dummy_encryption=v2Eric Biggers3-17/+60
v1 encryption policies are deprecated in favor of v2, and some new features (e.g. encryption+casefolding) are only being added for v2. Therefore, the "test_dummy_encryption" mount option (which is used for encryption I/O testing with xfstests) needs to support v2 policies. To do this, extend its syntax to be "test_dummy_encryption=v1" or "test_dummy_encryption=v2". The existing "test_dummy_encryption" (no argument) also continues to be accepted, to specify the default setting -- currently v1, but the next patch changes it to v2. To cleanly support both v1 and v2 while also making it easy to support specifying other encryption settings in the future (say, accepting "$contents_mode:$filenames_mode:v2"), make ext4 and f2fs maintain a pointer to the dummy fscrypt_context rather than using mount flags. To avoid concurrency issues, don't allow test_dummy_encryption to be set or changed during a remount. (The former restriction is new, but xfstests doesn't run into it, so no one should notice.) Tested with 'gce-xfstests -c {ext4,f2fs}/encrypt -g auto'. On ext4, there are two regressions, both of which are test bugs: ext4/023 and ext4/028 fail because they set an xattr and expect it to be stored inline, but the increase in size of the fscrypt_context from 24 to 40 bytes causes this xattr to be spilled into an external block. Link: https://lore.kernel.org/r/20200512233251.118314-4-ebiggers@kernel.org Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-05-14ext4: remove redundant variable has_bigalloc in ext4_fill_superKaixu Xia1-3/+2
We can use the ext4_has_feature_bigalloc() function directly to check bigalloc feature and the variable has_bigalloc is reduncant, so remove it. Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/1586935542-29588-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-05-07ext4: remove unnecessary test_opt for DIOREAD_NOLOCKKaixu Xia1-5/+1
The DIOREAD_NOLOCK flag has been cleared when doing the test_opt that is meaningless, so remove the unnecessary test_opt for DIOREAD_NOLOCK. Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Link: https://lore.kernel.org/r/1586751862-19437-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-15ext4: convert BUG_ON's to WARN_ON's in mballoc.cTheodore Ts'o1-2/+4
If the in-core buddy bitmap gets corrupted (or out of sync with the block bitmap), issue a WARN_ON and try to recover. In most cases this involves skipping trying to allocate out of a particular block group. We can end up declaring the file system corrupted, which is fair, since the file system probably should be checked before we proceed any further. Link: https://lore.kernel.org/r/20200414035649.293164-1-tytso@mit.edu Google-Bug-Id: 34811296 Google-Bug-Id: 34639169 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-15ext4: increase wait time needed before reuse of deleted inode numbersTheodore Ts'o1-1/+1
Current wait times have proven to be too short to protect against inode reuses that lead to metadata inconsistencies. Now that we will retry the inode allocation if we can't find any recently deleted inodes, it's a lot safer to increase the recently deleted time from 5 seconds to a minute. Link: https://lore.kernel.org/r/20200414023925.273867-1-tytso@mit.edu Google-Bug-Id: 36602237 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-15ext4: remove set but not used variable 'es' in ext4_jbd2.cJason Yan1-3/+0
Fix the following gcc warning: fs/ext4/ext4_jbd2.c:341:30: warning: variable 'es' set but not used [-Wunused-but-set-variable] struct ext4_super_block *es; ^~ Fixes: 2ea2fc775321 ("ext4: save all error info in save_error_info() and drop ext4_set_errno()") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20200402034759.29957-1-yanaijie@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-15ext4: remove set but not used variable 'es'Jason Yan1-2/+0
Fix the following gcc warning: fs/ext4/super.c:599:27: warning: variable 'es' set but not used [-Wunused-but-set-variable] struct ext4_super_block *es; ^~ Fixes: 2ea2fc775321 ("ext4: save all error info in save_error_info() and drop ext4_set_errno()") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20200402033939.25303-1-yanaijie@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-15ext4: do not zeroout extents beyond i_disksizeJan Kara1-4/+4
We do not want to create initialized extents beyond end of file because for e2fsck it is impossible to distinguish them from a case of corrupted file size / extent tree and so it complains like: Inode 12, i_size is 147456, should be 163840. Fix? no Code in ext4_ext_convert_to_initialized() and ext4_split_convert_extents() try to make sure it does not create initialized extents beyond inode size however they check against inode->i_size which is wrong. They should instead check against EXT4_I(inode)->i_disksize which is the current inode size on disk. That's what e2fsck is going to see in case of crash before all dirty data is written. This bug manifests as generic/456 test failure (with recent enough fstests where fsx got fixed to properly pass FALLOC_KEEP_SIZE_FL flags to the kernel) when run with dioread_lock mount option. CC: stable@vger.kernel.org Fixes: 21ca087a3891 ("ext4: Do not zero out uninitialized extents beyond i_size") Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20200331105016.8674-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-15ext4: fix return-value types in several function commentsJosh Triplett2-3/+3
The documentation comments for ext4_read_block_bitmap_nowait and ext4_read_inode_bitmap describe them as returning NULL on error, but they return an ERR_PTR on error; update the documentation to match. The documentation comment for ext4_wait_block_bitmap describes it as returning 1 on error, but it returns -errno on error; update the documentation to match. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Reviewed-by: Ritesh Harani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/60a3f4996f4932c45515aaa6b75ca42f2a78ec9b.1585512514.git.josh@joshtriplett.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-15ext4: use non-movable memory for superblock readaheadRoman Gushchin2-2/+2
Since commit a8ac900b8163 ("ext4: use non-movable memory for the superblock") buffers for ext4 superblock were allocated using the sb_bread_unmovable() helper which allocated buffer heads out of non-movable memory blocks. It was necessarily to not block page migrations and do not cause cma allocation failures. However commit 85c8f176a611 ("ext4: preload block group descriptors") broke this by introducing pre-reading of the ext4 superblock. The problem is that __breadahead() is using __getblk() underneath, which allocates buffer heads out of movable memory. It resulted in page migration failures I've seen on a machine with an ext4 partition and a preallocated cma area. Fix this by introducing sb_breadahead_unmovable() and __breadahead_gfp() helpers which use non-movable memory for buffer head allocations and use them for the ext4 superblock readahead. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Fixes: 85c8f176a611 ("ext4: preload block group descriptors") Signed-off-by: Roman Gushchin <guro@fb.com> Link: https://lore.kernel.org/r/20200229001411.128010-1-guro@fb.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-15ext4: use matching invalidatepage in ext4_writepageyangerkun1-1/+1
Run generic/388 with journal data mode sometimes may trigger the warning in ext4_invalidatepage. Actually, we should use the matching invalidatepage in ext4_writepage. Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200226041002.13914-1-yangerkun@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-05Merge tag 'ext4_for_linus' of ↵Linus Torvalds20-670/+384
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: - Replace ext4's bmap and iopoll implementations to use iomap. - Clean up extent tree handling. - Other cleanups and miscellaneous bug fixes * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (31 commits) ext4: save all error info in save_error_info() and drop ext4_set_errno() ext4: fix incorrect group count in ext4_fill_super error message ext4: fix incorrect inodes per group in error message ext4: don't set dioread_nolock by default for blocksize < pagesize ext4: disable dioread_nolock whenever delayed allocation is disabled ext4: do not commit super on read-only bdev ext4: avoid ENOSPC when avoiding to reuse recently deleted inodes ext4: unregister sysfs path before destroying jbd2 journal ext4: check for non-zero journal inum in ext4_calculate_overhead ext4: remove map_from_cluster from ext4_ext_map_blocks ext4: clean up ext4_ext_insert_extent() call in ext4_ext_map_blocks() ext4: mark block bitmap corrupted when found instead of BUGON ext4: use flexible-array member for xattr structs ext4: use flexible-array member in struct fname Documentation: correct the description of FIEMAP_EXTENT_LAST ext4: move ext4_fiemap to use iomap framework ext4: make ext4_ind_map_blocks work with fiemap ext4: move ext4 bmap to use iomap infrastructure ext4: optimize ext4_ext_precache for 0 depth ext4: add IOMAP_F_MERGED for non-extent based mapping ...
2020-04-01ext4: save all error info in save_error_info() and drop ext4_set_errno()Theodore Ts'o15-215/+199
Using a separate function, ext4_set_errno() to set the errno is problematic because it doesn't do the right thing once s_last_error_errorcode is non-zero. It's also less racy to set all of the error information all at once. (Also, as a bonus, it shrinks code size slightly.) Link: https://lore.kernel.org/r/20200329020404.686965-1-tytso@mit.edu Fixes: 878520ac45f9 ("ext4: save the error code which triggered...") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-31Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds1-0/+6
Pull fscrypt updates from Eric Biggers: "Add an ioctl FS_IOC_GET_ENCRYPTION_NONCE which retrieves a file's encryption nonce. This makes it easier to write automated tests which verify that fscrypt is doing the encryption correctly" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: ubifs: wire up FS_IOC_GET_ENCRYPTION_NONCE f2fs: wire up FS_IOC_GET_ENCRYPTION_NONCE ext4: wire up FS_IOC_GET_ENCRYPTION_NONCE fscrypt: add FS_IOC_GET_ENCRYPTION_NONCE ioctl
2020-03-30Merge branch 'locking-core-for-linus' of ↵Linus Torvalds1-5/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Continued user-access cleanups in the futex code. - percpu-rwsem rewrite that uses its own waitqueue and atomic_t instead of an embedded rwsem. This addresses a couple of weaknesses, but the primary motivation was complications on the -rt kernel. - Introduce raw lock nesting detection on lockdep (CONFIG_PROVE_RAW_LOCK_NESTING=y), document the raw_lock vs. normal lock differences. This too originates from -rt. - Reuse lockdep zapped chain_hlocks entries, to conserve RAM footprint on distro-ish kernels running into the "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!" depletion of the lockdep chain-entries pool. - Misc cleanups, smaller fixes and enhancements - see the changelog for details" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits) fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t thermal/x86_pkg_temp: Make pkg_temp_lock a raw_spinlock_t Documentation/locking/locktypes: Minor copy editor fixes Documentation/locking/locktypes: Further clarifications and wordsmithing m68knommu: Remove mm.h include from uaccess_no.h x86: get rid of user_atomic_cmpxchg_inatomic() generic arch_futex_atomic_op_inuser() doesn't need access_ok() x86: don't reload after cmpxchg in unsafe_atomic_op2() loop x86: convert arch_futex_atomic_op_inuser() to user_access_begin/user_access_end() objtool: whitelist __sanitizer_cov_trace_switch() [parisc, s390, sparc64] no need for access_ok() in futex handling sh: no need of access_ok() in arch_futex_atomic_op_inuser() futex: arch_futex_atomic_op_inuser() calling conventions change completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all() lockdep: Add posixtimer context tracing bits lockdep: Annotate irq_work lockdep: Add hrtimer context tracing bits lockdep: Introduce wait-type checks completion: Use simple wait queues sched/swait: Prepare usage in completions ...
2020-03-28ext4: fix incorrect group count in ext4_fill_super error messageJosh Triplett1-2/+2
ext4_fill_super doublechecks the number of groups before mounting; if that check fails, the resulting error message prints the group count from the ext4_sb_info sbi, which hasn't been set yet. Print the freshly computed group count instead (which at that point has just been computed in "blocks_count"). Signed-off-by: Josh Triplett <josh@joshtriplett.org> Fixes: 4ec1102813798 ("ext4: Add sanity checks for the superblock before mounting the filesystem") Link: https://lore.kernel.org/r/8b957cd1513fcc4550fe675c10bcce2175c33a49.1585431964.git.josh@joshtriplett.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-28ext4: fix incorrect inodes per group in error messageJosh Triplett1-1/+1
If ext4_fill_super detects an invalid number of inodes per group, the resulting error message printed the number of blocks per group, rather than the number of inodes per group. Fix it to print the correct value. Fixes: cd6bb35bf7f6d ("ext4: use more strict checks for inodes_per_block on mount") Link: https://lore.kernel.org/r/8be03355983a08e5d4eed480944613454d7e2550.1585434649.git.josh@joshtriplett.org Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-28ext4: don't set dioread_nolock by default for blocksize < pagesizeRitesh Harjani1-1/+12
Currently on calling echo 3 > drop_caches on host machine, we see FS corruption in the guest. This happens on Power machine where blocksize < pagesize. So as a temporary workaound don't enable dioread_nolock by default for blocksize < pagesize until we identify the root cause. Also emit a warning msg in case if this mount option is manually enabled for blocksize < pagesize. Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/20200327200744.12473-1-riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-28fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_tThomas Gleixner1-5/+3
Bit spinlocks are problematic if PREEMPT_RT is enabled, because they disable preemption, which is undesired for latency reasons and breaks when regular spinlocks are taken within the bit_spinlock locked region because regular spinlocks are converted to 'sleeping spinlocks' on RT. PREEMPT_RT replaced the bit spinlocks with regular spinlocks to avoid this problem. The replacement was done conditionaly at compile time, but Christoph requested to do an unconditional conversion. Jan suggested to move the spinlock into a existing padding hole which avoids a size increase of struct buffer_head on production kernels. As a benefit the lock gains lockdep coverage. [ bigeasy: Remove the wrapper and use always spinlock_t and move it into the padding hole ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Link: https://lkml.kernel.org/r/20191118132824.rclhrbujqh4b4g4d@linutronix.de
2020-03-26ext4: disable dioread_nolock whenever delayed allocation is disabledEric Whitney1-0/+3
The patch "ext4: make dioread_nolock the default" (244adf6426ee) causes generic/422 to fail when run in kvm-xfstests' ext3conv test case. This applies both the dioread_nolock and nodelalloc mount options, a combination not previously tested by kvm-xfstests. The failure occurs because the dioread_nolock code path splits a previously fallocated multiblock extent into a series of single block extents when overwriting a portion of that extent. That causes allocation of an extent tree leaf node and a reshuffling of extents. Once writeback is completed, the individual extents are recombined into a single extent, the extent is moved again, and the leaf node is deleted. The difference in block utilization before and after writeback due to the leaf node triggers the failure. The original reason for this behavior was to avoid ENOSPC when handling I/O completions during writeback in the dioread_nolock code paths when delayed allocation is disabled. It may no longer be necessary, because code was added in the past to reserve extra space to solve this problem when delayed allocation is enabled, and this code may also apply when delayed allocation is disabled. Until this can be verified, don't use the dioread_nolock code paths if delayed allocation is disabled. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20200319150028.24592-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-26ext4: do not commit super on read-only bdevEric Sandeen1-1/+2
Under some circumstances we may encounter a filesystem error on a read-only block device, and if we try to save the error info to the superblock and commit it, we'll wind up with a noisy error and backtrace, i.e.: [ 3337.146838] EXT4-fs error (device pmem1p2): ext4_get_journal_inode:4634: comm mount: inode #0: comm mount: iget: illegal inode # ------------[ cut here ]------------ generic_make_request: Trying to write to read-only block-device pmem1p2 (partno 2) WARNING: CPU: 107 PID: 115347 at block/blk-core.c:788 generic_make_request_checks+0x6b4/0x7d0 ... To avoid this, commit the error info in the superblock only if the block device is writable. Reported-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/4b6e774d-cc00-3469-7abb-108eb151071a@sandeen.net Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-26ext4: avoid ENOSPC when avoiding to reuse recently deleted inodesJan Kara1-5/+18
When ext4 is running on a filesystem without a journal, it tries not to reuse recently deleted inodes to provide better chances for filesystem recovery in case of crash. However this logic forbids reuse of freed inodes for up to 5 minutes and especially for filesystems with smaller number of inodes can lead to ENOSPC errors returned when allocating new inodes. Fix the problem by allowing to reuse recently deleted inode if there's no other inode free in the scanned range. Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200318121317.31941-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-26ext4: unregister sysfs path before destroying jbd2 journalRitesh Harjani1-1/+7
Call ext4_unregister_sysfs(), before destroying jbd2 journal, since below might cause, NULL pointer dereference issue. This got reported with LTP tests. ext4_put_super() cat /sys/fs/ext4/loop2/journal_task | ext4_attr_show(); ext4_jbd2_journal_destroy(); | | journal_task_show() | | | task_pid_vnr(NULL); sbi->s_journal = NULL; Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200318061301.4320-1-riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-26ext4: check for non-zero journal inum in ext4_calculate_overheadRitesh Harjani1-1/+2
While calculating overhead for internal journal, also check that j_inum shouldn't be 0. Otherwise we get below error with xfstests generic/050 with external journal (XXX_LOGDEV config) enabled. It could be simply reproduced with loop device with an external journal and marking blockdev as RO before mounting. [ 3337.146838] EXT4-fs error (device pmem1p2): ext4_get_journal_inode:4634: comm mount: inode #0: comm mount: iget: illegal inode # ------------[ cut here ]------------ generic_make_request: Trying to write to read-only block-device pmem1p2 (partno 2) WARNING: CPU: 107 PID: 115347 at block/blk-core.c:788 generic_make_request_checks+0x6b4/0x7d0 CPU: 107 PID: 115347 Comm: mount Tainted: G L --------- -t - 4.18.0-167.el8.ppc64le #1 NIP: c0000000006f6d44 LR: c0000000006f6d40 CTR: 0000000030041dd4 <...> NIP [c0000000006f6d44] generic_make_request_checks+0x6b4/0x7d0 LR [c0000000006f6d40] generic_make_request_checks+0x6b0/0x7d0 <...> Call Trace: generic_make_request_checks+0x6b0/0x7d0 (unreliable) generic_make_request+0x3c/0x420 submit_bio+0xd8/0x200 submit_bh_wbc+0x1e8/0x250 __sync_dirty_buffer+0xd0/0x210 ext4_commit_super+0x310/0x420 [ext4] __ext4_error+0xa4/0x1e0 [ext4] __ext4_iget+0x388/0xe10 [ext4] ext4_get_journal_inode+0x40/0x150 [ext4] ext4_calculate_overhead+0x5a8/0x610 [ext4] ext4_fill_super+0x3188/0x3260 [ext4] mount_bdev+0x778/0x8f0 ext4_mount+0x28/0x50 [ext4] mount_fs+0x74/0x230 vfs_kern_mount.part.6+0x6c/0x250 do_mount+0x2fc/0x1280 sys_mount+0x158/0x180 system_call+0x5c/0x70 EXT4-fs (pmem1p2): no journal found EXT4-fs (pmem1p2): can't get journal size EXT4-fs (pmem1p2): mounted filesystem without journal. Opts: dax,norecovery Fixes: 3c816ded78bb ("ext4: use journal inode to determine journal overhead") Reported-by: Harish Sriram <harish@linux.ibm.com> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200316093038.25485-1-riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-25block: move the part_stat* helpers from genhd.h to a new headerChristoph Hellwig2-1/+2
These macros are just used by a few files. Move them out of genhd.h, which is included everywhere into a new standalone header. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-24block: remove __bdevnameChristoph Hellwig1-3/+3
There is no good reason for __bdevname to exist. Just open code printing the string in the callers. For three of them the format string can be trivially merged into existing printk statements, and in init/do_mounts.c we can at least do the scnprintf once at the start of the function, and unconditional of CONFIG_BLOCK to make the output for tiny configfs a little more helpful. Acked-by: Theodore Ts'o <tytso@mit.edu> # for ext4 Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-19ext4: wire up FS_IOC_GET_ENCRYPTION_NONCEEric Biggers1-0/+6
This new ioctl retrieves a file's encryption nonce, which is useful for testing. See the corresponding fs/crypto/ patch for more details. Link: https://lore.kernel.org/r/20200314205052.93294-3-ebiggers@kernel.org Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-03-14ext4: remove map_from_cluster from ext4_ext_map_blocksEric Whitney1-4/+1
We can use the variable allocated_clusters rather than map_from_clusters to control reserved block/cluster accounting in ext4_ext_map_blocks. This eliminates a variable and associated code and improves readability a little. Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20200311205125.25061-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: clean up ext4_ext_insert_extent() call in ext4_ext_map_blocks()Eric Whitney1-13/+17
Now that the eofblocks code has been removed, we don't need to assign 0 to err before calling ext4_ext_insert_extent() since it will assign a return value to ret anyway. The variable free_on_err can be eliminated and replaced by a reference to allocated_clusters which clearly conveys the idea that newly allocated blocks should be freed when recovering from an extent insertion failure. The error handling code itself should be restructured so that it errors out immediately on an insertion failure in the case where no new blocks have been allocated (bigalloc) rather than proceeding further into the mapping code. The initializer for fb_flags can also be rearranged for improved readability. Finally, insert a missing space in nearby code. No known bugs are addressed by this patch - it's simply a cleanup. Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20200311205033.25013-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: mark block bitmap corrupted when found instead of BUGONDmitry Monakhov1-2/+9
We already has similar code in ext4_mb_complex_scan_group(), but ext4_mb_simple_scan_group() still affected. Other reports: https://www.spinics.net/lists/linux-ext4/msg60231.html Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Dmitry Monakhov <dmonakhov@gmail.com> Link: https://lore.kernel.org/r/20200310150156.641-1-dmonakhov@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: use flexible-array member for xattr structsGustavo A. R. Silva1-2/+2
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200309180813.GA3347@embeddedor Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: use flexible-array member in struct fnameGustavo A. R. Silva1-1/+1
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200309154838.GA31559@embeddedor Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: move ext4_fiemap to use iomap frameworkRitesh Harjani2-283/+48
This patch moves ext4_fiemap to use iomap framework. For xattr a new 'ext4_iomap_xattr_ops' is added. Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/b9f45c885814fcdd0631747ff0fe08886270828c.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: make ext4_ind_map_blocks work with fiemapRitesh Harjani1-0/+16
For indirect block mapping if the i_block > max supported block in inode then ext4_ind_map_blocks() returns a -EIO error. But in case of fiemap this could be a valid query to ->iomap_begin call. So check if the offset >= s_bitmap_maxbytes in ext4_iomap_begin_report(), then simply skip calling ext4_map_blocks(). Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/87fa0ddc5967fa707656212a3b66a7233425325c.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: move ext4 bmap to use iomap infrastructureRitesh Harjani1-1/+1
ext4_iomap_begin is already implemented which provides ext4_map_blocks, so just move the API from generic_block_bmap to iomap_bmap for iomap conversion. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/8bbd53bd719d5ccfecafcce93f2bf1d7955a44af.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: optimize ext4_ext_precache for 0 depthRitesh Harjani1-3/+6
This patch avoids the memory alloc & free path when depth is 0, since anyway there is no extra caching done in that case. So on checking depth 0, simply return early. Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/93da0d0f073c73358e85bb9849d8a5378d1da539.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>