summaryrefslogtreecommitdiffstats
path: root/fs/reiserfs/super.c
AgeCommit message (Collapse)AuthorFilesLines
2022-10-10Merge tag 'mm-stable-2022-10-08' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
2022-09-11reiserfs: replace ll_rw_block()Zhang Yi1-3/+1
ll_rw_block() is not safe for the sync read/write path because it cannot guarantee that submitting read/write IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() in read path if the buffer has been locked by others. So stop using ll_rw_block() in reiserfs. We also switch to new bh_readahead_batch() helper for the buffer array readahead path. Link: https://lkml.kernel.org/r/20220901133505.2510834-10-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-19fs/reiserfs: replace ternary operator with min() and min_t()Jiangshan Yi1-5/+2
Fix the following coccicheck warning: fs/reiserfs/prints.c:459: WARNING opportunity for min(). fs/reiserfs/resize.c:100: WARNING opportunity for min(). fs/reiserfs/super.c:2508: WARNING opportunity for min(). fs/reiserfs/super.c:2557: WARNING opportunity for min(). min() and min_t() macro is defined in include/linux/minmax.h. It avoids multiple evaluations of the arguments when non-constant and performs strict type-checking. Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220819075240.3199477-1-13667453960@163.com
2022-07-14fs/buffer: Combine two submit_bh() and ll_rw_block() argumentsBart Van Assche1-1/+1
Both submit_bh() and ll_rw_block() accept a request operation type and request flags as their first two arguments. Micro-optimize these two functions by combining these first two arguments into a single argument. This patch does not change the behavior of any of the modified code. Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jan Kara <jack@suse.cz> Acked-by: Song Liu <song@kernel.org> (for the md changes) Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220714180729.1065367-48-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-25Merge tag 'fs_for_v5.18-rc1' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull reiserfs updates from Jan Kara: "The biggest change in this pull is the addition of a deprecation message about reiserfs with the outlook that we'd eventually be able to remove it from the kernel. Because it is practically unmaintained and untested and odd enough that people don't want to bother with it anymore... Otherwise there are small udf and ext2 fixes" * tag 'fs_for_v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: remove redundant assignment of variable etype reiserfs: Deprecate reiserfs ext2: correct max file size computing reiserfs: get rid of AOP_FLAG_CONT_EXPAND flag
2022-03-22fs: allocate inode by using alloc_inode_sb()Muchun Song1-1/+1
The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb(). Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4] Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Alex Shi <alexs@kernel.org> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kari Argillander <kari.argillander@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-02reiserfs: Deprecate reiserfsJan Kara1-0/+2
Reiserfs is relatively old filesystem and its development has ceased quite some years ago. Linux distributions moved away from it towards other filesystems such as btrfs, xfs, or ext4. To reduce maintenance burden on cross filesystem changes (such as new mount API, iomap, folios ...) let's add a deprecation notice when the filesystem is mounted and schedule its removal to 2025. Link: https://lore.kernel.org/r/20220225125445.29942-1-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz>
2021-11-06Merge tag 'fs_for_v5.16-rc1' of ↵Linus Torvalds1-6/+0
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota, isofs, and reiserfs updates from Jan Kara: "Fixes for handling of corrupted quota files, fix for handling of corrupted isofs filesystem, and a small cleanup for reiserfs" * tag 'fs_for_v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fs: reiserfs: remove useless new_opts in reiserfs_remount isofs: Fix out of bound access for corrupted isofs image quota: correct error number in free_dqentry() quota: check block number when reading the block in quota file
2021-10-27fs: reiserfs: remove useless new_opts in reiserfs_remountDongliang Mu1-6/+0
Since the commit c3d98ea08291 ("VFS: Don't use save/replace_mount_options if not using generic_show_options") eliminates replace_mount_options in reiserfs_remount, but does not handle the allocated new_opts, it will cause memory leak in the reiserfs_remount. Because new_opts is useless in reiserfs_mount, so we fix this bug by removing the useless new_opts in reiserfs_remount. Fixes: c3d98ea08291 ("VFS: Don't use save/replace_mount_options if not using generic_show_options") Link: https://lore.kernel.org/r/20211027143445.4156459-1-mudongliangabcd@gmail.com Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-10-18reiserfs: use sb_bdev_nr_blocksChristoph Hellwig1-3/+1
Use the sb_bdev_nr_blocks helper instead of open coding it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211018101130.1838532-30-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18reiserfs: use bdev_nr_bytes instead of open coding itChristoph Hellwig1-3/+1
Use the proper helper to read the block device size and remove two cargo culted checks that can't be false. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20211018101130.1838532-23-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-16reiserfs: add check for root_inode in reiserfs_fill_superYu Kuai1-0/+8
Our syzcaller report a NULL pointer dereference: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 116e95067 P4D 116e95067 PUD 1080b5067 PMD 0 Oops: 0010 [#1] SMP KASAN CPU: 7 PID: 592 Comm: a.out Not tainted 5.13.0-next-20210629-dirty #67 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-p4 RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. RSP: 0018:ffff888114e779b8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 1ffff110229cef39 RCX: ffffffffaa67e1aa RDX: 0000000000000000 RSI: ffff88810a58ee00 RDI: ffff8881233180b0 RBP: ffffffffac38e9c0 R08: ffffffffaa67e17e R09: 0000000000000001 R10: ffffffffb91c5557 R11: fffffbfff7238aaa R12: ffff88810a58ee00 R13: ffff888114e77aa0 R14: 0000000000000000 R15: ffff8881233180b0 FS: 00007f946163c480(0000) GS:ffff88839f1c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000001099c1000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __lookup_slow+0x116/0x2d0 ? page_put_link+0x120/0x120 ? __d_lookup+0xfc/0x320 ? d_lookup+0x49/0x90 lookup_one_len+0x13c/0x170 ? __lookup_slow+0x2d0/0x2d0 ? reiserfs_schedule_old_flush+0x31/0x130 reiserfs_lookup_privroot+0x64/0x150 reiserfs_fill_super+0x158c/0x1b90 ? finish_unfinished+0xb10/0xb10 ? bprintf+0xe0/0xe0 ? __mutex_lock_slowpath+0x30/0x30 ? __kasan_check_write+0x20/0x30 ? up_write+0x51/0xb0 ? set_blocksize+0x9f/0x1f0 mount_bdev+0x27c/0x2d0 ? finish_unfinished+0xb10/0xb10 ? reiserfs_kill_sb+0x120/0x120 get_super_block+0x19/0x30 legacy_get_tree+0x76/0xf0 vfs_get_tree+0x49/0x160 ? capable+0x1d/0x30 path_mount+0xacc/0x1380 ? putname+0x97/0xd0 ? finish_automount+0x450/0x450 ? kmem_cache_free+0xf8/0x5a0 ? putname+0x97/0xd0 do_mount+0xe2/0x110 ? path_mount+0x1380/0x1380 ? copy_mount_options+0x69/0x140 __x64_sys_mount+0xf0/0x190 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae This is because 'root_inode' is initialized with wrong mode, and it's i_op is set to 'reiserfs_special_inode_operations'. Thus add check for 'root_inode' to fix the problem. Link: https://lore.kernel.org/r/20210702040743.1918552-1-yukuai3@huawei.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-04-12reiserfs: convert to fileattrMiklos Szeredi1-1/+1
Use the fileattr API to let the VFS handle locking, permission checking and conversion. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: Jan Kara <jack@suse.cz>
2020-08-28reiserfs: Fix memory leak in reiserfs_parse_options()Jan Kara1-4/+4
When a usrjquota or grpjquota mount option is used multiple times, we will leak memory allocated for the file name. Make sure the last setting is used and all the previous ones are properly freed. Reported-by: syzbot+c9e294bbe0333a6b7640@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz>
2019-12-16reiserfs: Fix spurious unlock in reiserfs_fill_super() error handlingJan Kara1-1/+1
When we fail to allocate string for journal device name we jump to 'error' label which tries to unlock reiserfs write lock which is not held. Jump to 'error_unlocked' instead. Fixes: f32485be8397 ("reiserfs: delay reiserfs lock until journal initialization") Signed-off-by: Jan Kara <jack@suse.cz>
2019-12-16reiserfs: Fix memory leak of journal device stringJan Kara1-0/+2
When a filesystem is mounted with jdev mount option, we store the journal device name in an allocated string in superblock. However we fail to ever free that string. Fix it. Reported-by: syzbot+1c6756baf4b16b94d2a6@syzkaller.appspotmail.com Fixes: c3aa077648e1 ("reiserfs: Properly display mount options in /proc/mounts") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2019-10-31reiserfs: fix extended attributes on the root directoryJeff Mahoney1-0/+2
Since commit d0a5b995a308 (vfs: Add IOP_XATTR inode operations flag) extended attributes haven't worked on the root directory in reiserfs. This is due to reiserfs conditionally setting the sb->s_xattrs handler array depending on whether it located or create the internal privroot directory. It necessarily does this after the root inode is already read in. The IOP_XATTR flag is set during inode initialization, so it never gets set on the root directory. This commit unconditionally assigns sb->s_xattrs and clears IOP_XATTR on internal inodes. The old return values due to the conditional assignment are handled via open_xa_root, which now returns EOPNOTSUPP as the VFS would have done. Link: https://lore.kernel.org/r/20191024143127.17509-1-jeffm@suse.com CC: stable@vger.kernel.org Fixes: d0a5b995a308 ("vfs: Add IOP_XATTR inode operations flag") Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-08-30fs: Fill in max and min timestamps in superblockDeepa Dinamani1-0/+3
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range. Even though some filesystems are read-only, fill in the timestamps to reflect the on-disk representation. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-By: Tigran Aivazian <aivazian.tigran@gmail.com> Acked-by: Jeff Layton <jlayton@kernel.org> Cc: aivazian.tigran@gmail.com Cc: al@alarsen.net Cc: coda@cs.cmu.edu Cc: darrick.wong@oracle.com Cc: dushistov@mail.ru Cc: dwmw2@infradead.org Cc: hch@infradead.org Cc: jack@suse.com Cc: jaharkes@cs.cmu.edu Cc: luisbg@kernel.org Cc: nico@fluxnic.net Cc: phillip@squashfs.org.uk Cc: richard@nod.at Cc: salah.triki@gmail.com Cc: shaggy@kernel.org Cc: linux-xfs@vger.kernel.org Cc: codalist@coda.cs.cmu.edu Cc: linux-ext4@vger.kernel.org Cc: linux-mtd@lists.infradead.org Cc: jfs-discussion@lists.sourceforge.net Cc: reiserfs-devel@vger.kernel.org
2019-05-01reiserfs: convert to ->free_inode()Al Viro1-8/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-11-30Merge branch 'for_linus' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota & reiserfs changes from Jan Kara: - two error checking improvements for quota - remove bogus i_version increase for reiserfs * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: Check for register_shrinker() failure. quota: propagate error from __dquot_initialize reiserfs: remove unneeded i_version bump
2017-11-27Rename superblock flags (MS_xyz -> SB_xyz)Linus Torvalds1-9/+9
This is a pure automated search-and-replace of the internal kernel superblock flags. The s_flags are now called SB_*, with the names and the values for the moment mirroring the MS_* flags that they're equivalent to. Note how the MS_xyz flags are the ones passed to the mount system call, while the SB_xyz flags are what we then use in sb->s_flags. The script to do this was: # places to look in; re security/*: it generally should *not* be # touched (that stuff parses mount(2) arguments directly), but # there are two places where we really deal with superblock flags. FILES="drivers/mtd drivers/staging/lustre fs ipc mm \ include/linux/fs.h include/uapi/linux/bfs_fs.h \ security/apparmor/apparmorfs.c security/apparmor/include/lib.h" # the list of MS_... constants SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \ DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \ POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \ I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \ ACTIVE NOUSER" SED_PROG= for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done # we want files that contain at least one of MS_..., # with fs/namespace.c and fs/pnode.c excluded. L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c') for f in $L; do sed -i $f $SED_PROG; done Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-27reiserfs: remove unneeded i_version bumpJeff Layton1-1/+0
The i_version field in reiserfs is not initialized and is only ever updated here. Nothing ever views it, so just remove it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
2017-07-17VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)David Howells1-9/+9
Firstly by applying the following with coccinelle's spatch: @@ expression SB; @@ -SB->s_flags & MS_RDONLY +sb_rdonly(SB) to effect the conversion to sb_rdonly(sb), then by applying: @@ expression A, SB; @@ ( -(!sb_rdonly(SB)) && A +!sb_rdonly(SB) && A | -A != (sb_rdonly(SB)) +A != sb_rdonly(SB) | -A == (sb_rdonly(SB)) +A == sb_rdonly(SB) | -!(sb_rdonly(SB)) +!sb_rdonly(SB) | -A && (sb_rdonly(SB)) +A && sb_rdonly(SB) | -A || (sb_rdonly(SB)) +A || sb_rdonly(SB) | -(sb_rdonly(SB)) != A +sb_rdonly(SB) != A | -(sb_rdonly(SB)) == A +sb_rdonly(SB) == A | -(sb_rdonly(SB)) && A +sb_rdonly(SB) && A | -(sb_rdonly(SB)) || A +sb_rdonly(SB) || A ) @@ expression A, B, SB; @@ ( -(sb_rdonly(SB)) ? 1 : 0 +sb_rdonly(SB) | -(sb_rdonly(SB)) ? A : B +sb_rdonly(SB) ? A : B ) to remove left over excess bracketage and finally by applying: @@ expression A, SB; @@ ( -(A & MS_RDONLY) != sb_rdonly(SB) +(bool)(A & MS_RDONLY) != sb_rdonly(SB) | -(A & MS_RDONLY) == sb_rdonly(SB) +(bool)(A & MS_RDONLY) == sb_rdonly(SB) ) to make comparisons against the result of sb_rdonly() (which is a bool) work correctly. Signed-off-by: David Howells <dhowells@redhat.com>
2017-07-06VFS: Don't use save/replace_mount_options if not using generic_show_optionsDavid Howells1-4/+0
btrfs, debugfs, reiserfs and tracefs call save_mount_options() and reiserfs calls replace_mount_options(), but they then implement their own ->show_options() methods and don't touch s_options, rendering the saved options unnecessary. I'm trying to eliminate s_options to make it easier to implement a context-based mount where the mount options can be passed individually over a file descriptor. Remove the calls to save/replace_mount_options() call in these cases. Signed-off-by: David Howells <dhowells@redhat.com> cc: Chris Mason <clm@fb.com> cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> cc: Steven Rostedt <rostedt@goodmis.org> cc: linux-btrfs@vger.kernel.org cc: reiserfs-devel@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-19reiserfs: Set flags on quota files directlyJan Kara1-3/+54
Currently immutable and noatime flags on quota files are set by quota code which requires us to copy inode->i_flags to our on disk version of quota flags in GETFLAGS ioctl and when writing stat item. Move to setting / clearing these on-disk flags directly to save that copying. Signed-off-by: Jan Kara <jack@suse.cz>
2017-04-05reiserfs: Protect dquot_writeback_dquots() by s_umount semaphoreJan Kara1-0/+14
dquot_writeback_dquots() expects s_umount semaphore to be held to protect it from other concurrent quota operations. reiserfs_sync_fs() can call dquot_writeback_dquots() without holding s_umount semaphore when called from flush_old_commits(). Fix the problem by grabbing s_umount in flush_old_commits(). However we have to be careful and use only trylock since reiserfs_cancel_old_sync() can be waiting for flush_old_commits() to complete while holding s_umount semaphore. Possible postponing of sync work is not a big deal though as that is only an opportunistic flush. Fixes: 9d1ccbe70e0b14545caad12dc73adb3605447df0 Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Jan Kara <jack@suse.cz>
2017-04-05reiserfs: Make cancel_old_flush() reliableJan Kara1-6/+15
Currently canceling of delayed work that flushes old data using cancel_old_flush() does not prevent work from being requeued. Thus in theory new work can be queued after cancel_old_flush() from reiserfs_freeze() has run. This will become larger problem once flush_old_commits() can requeue the work itself. Fix the problem by recording in sbi->work_queue that flushing work is canceled and should not be requeued. Signed-off-by: Jan Kara <jack@suse.cz>
2017-02-27fs/reiserfs: atomically read inode sizeFabian Frederick1-1/+1
See i_size_read() comments in include/linux/fs.h Link: http://lkml.kernel.org/r/20170123174701.30394-1-fabf@skynet.be Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-05quota: constify struct path in quota_onAl Viro1-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-10Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: ">rename2() work from Miklos + current_time() from Deepa" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Replace current_fs_time() with current_time() fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps fs: Replace CURRENT_TIME with current_time() for inode timestamps fs: proc: Delete inode time initializations in proc_alloc_inode() vfs: Add current_time() api vfs: add note about i_op->rename changes to porting fs: rename "rename2" i_op to "rename" vfs: remove unused i_op->rename fs: make remaining filesystems use .rename2 libfs: support RENAME_NOREPLACE in simple_rename() fs: support RENAME_NOREPLACE for local filesystems ncpfs: fix unused variable warning
2016-09-27fs: Replace CURRENT_TIME_SEC with current_time() for inode timestampsDeepa Dinamani1-1/+1
CURRENT_TIME_SEC is not y2038 safe. current_time() will be transitioned to use 64 bit time along with vfs in a separate patch. There is no plan to transistion CURRENT_TIME_SEC to use y2038 safe time interfaces. current_time() will also be extended to use superblock range checking parameters when range checking is introduced. This works because alloc_super() fills in the the s_time_gran in super block to NSEC_PER_SEC. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-16reiserfs: Unlock superblock before calling reiserfs_quota_on_mount()Mike Galbraith1-1/+11
If we hold the superblock lock while calling reiserfs_quota_on_mount(), we can deadlock our own worker - mount blocks kworker/3:2, sleeps forever more. crash> ps|grep UN 715 2 3 ffff880220734d30 UN 0.0 0 0 [kworker/3:2] 9369 9341 2 ffff88021ffb7560 UN 1.3 493404 123184 Xorg 9665 9664 3 ffff880225b92ab0 UN 0.0 47368 812 udisks-daemon 10635 10403 3 ffff880222f22c70 UN 0.0 14904 936 mount crash> bt ffff880220734d30 PID: 715 TASK: ffff880220734d30 CPU: 3 COMMAND: "kworker/3:2" #0 [ffff8802244c3c20] schedule at ffffffff8144584b #1 [ffff8802244c3cc8] __rt_mutex_slowlock at ffffffff814472b3 #2 [ffff8802244c3d28] rt_mutex_slowlock at ffffffff814473f5 #3 [ffff8802244c3dc8] reiserfs_write_lock at ffffffffa05f28fd [reiserfs] #4 [ffff8802244c3de8] flush_async_commits at ffffffffa05ec91d [reiserfs] #5 [ffff8802244c3e08] process_one_work at ffffffff81073726 #6 [ffff8802244c3e68] worker_thread at ffffffff81073eba #7 [ffff8802244c3ec8] kthread at ffffffff810782e0 #8 [ffff8802244c3f48] kernel_thread_helper at ffffffff81450064 crash> rd ffff8802244c3cc8 10 ffff8802244c3cc8: ffffffff814472b3 ffff880222f23250 .rD.....P2.".... ffff8802244c3cd8: 0000000000000000 0000000000000286 ................ ffff8802244c3ce8: ffff8802244c3d30 ffff880220734d80 0=L$.....Ms .... ffff8802244c3cf8: ffff880222e8f628 0000000000000000 (.."............ ffff8802244c3d08: 0000000000000000 0000000000000002 ................ crash> struct rt_mutex ffff880222e8f628 struct rt_mutex { wait_lock = { raw_lock = { slock = 65537 } }, wait_list = { node_list = { next = 0xffff8802244c3d48, prev = 0xffff8802244c3d48 } }, owner = 0xffff880222f22c71, save_state = 0 } crash> bt 0xffff880222f22c70 PID: 10635 TASK: ffff880222f22c70 CPU: 3 COMMAND: "mount" #0 [ffff8802216a9868] schedule at ffffffff8144584b #1 [ffff8802216a9910] schedule_timeout at ffffffff81446865 #2 [ffff8802216a99a0] wait_for_common at ffffffff81445f74 #3 [ffff8802216a9a30] flush_work at ffffffff810712d3 #4 [ffff8802216a9ab0] schedule_on_each_cpu at ffffffff81074463 #5 [ffff8802216a9ae0] invalidate_bdev at ffffffff81178aba #6 [ffff8802216a9af0] vfs_load_quota_inode at ffffffff811a3632 #7 [ffff8802216a9b50] dquot_quota_on_mount at ffffffff811a375c #8 [ffff8802216a9b80] finish_unfinished at ffffffffa05dd8b0 [reiserfs] #9 [ffff8802216a9cc0] reiserfs_fill_super at ffffffffa05de825 [reiserfs] RIP: 00007f7b9303997a RSP: 00007ffff443c7a8 RFLAGS: 00010202 RAX: 00000000000000a5 RBX: ffffffff8144ef12 RCX: 00007f7b932e9ee0 RDX: 00007f7b93d9a400 RSI: 00007f7b93d9a3e0 RDI: 00007f7b93d9a3c0 RBP: 00007f7b93d9a2c0 R8: 00007f7b93d9a550 R9: 0000000000000001 R10: ffffffffc0ed040e R11: 0000000000000202 R12: 000000000000040e R13: 0000000000000000 R14: 00000000c0ed040e R15: 00007ffff443ca20 ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Mike Galbraith <mgalbraith@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>
2016-07-26Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull core block updates from Jens Axboe: - the big change is the cleanup from Mike Christie, cleaning up our uses of command types and modified flags. This is what will throw some merge conflicts - regression fix for the above for btrfs, from Vincent - following up to the above, better packing of struct request from Christoph - a 2038 fix for blktrace from Arnd - a few trivial/spelling fixes from Bart Van Assche - a front merge check fix from Damien, which could cause issues on SMR drives - Atari partition fix from Gabriel - convert cfq to highres timers, since jiffies isn't granular enough for some devices these days. From Jan and Jeff - CFQ priority boost fix idle classes, from me - cleanup series from Ming, improving our bio/bvec iteration - a direct issue fix for blk-mq from Omar - fix for plug merging not involving the IO scheduler, like we do for other types of merges. From Tahsin - expose DAX type internally and through sysfs. From Toshi and Yigal * 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits) block: Fix front merge check block: do not merge requests without consulting with io scheduler block: Fix spelling in a source code comment block: expose QUEUE_FLAG_DAX in sysfs block: add QUEUE_FLAG_DAX for devices to advertise their DAX support Btrfs: fix comparison in __btrfs_map_block() block: atari: Return early for unsupported sector size Doc: block: Fix a typo in queue-sysfs.txt cfq-iosched: Charge at least 1 jiffie instead of 1 ns cfq-iosched: Fix regression in bonnie++ rewrite performance cfq-iosched: Convert slice_resid from u64 to s64 block: Convert fifo_time from ulong to u64 blktrace: avoid using timespec block/blk-cgroup.c: Declare local symbols static block/bio-integrity.c: Add #include "blk.h" block/partition-generic.c: Remove a set-but-not-used variable block: bio: kill BIO_MAX_SIZE cfq-iosched: temporarily boost queue priority for idle classes block: drbd: avoid to use BIO_MAX_SIZE block: bio: remove BIO_MAX_SECTORS ...
2016-06-07fs: have ll_rw_block users pass in op and flags separatelyMike Christie1-1/+1
This has ll_rw_block users pass in the operation and flags separately, so ll_rw_block can setup the bio op and bi_rw flags on the bio that is submitted. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-25reiserfs: check kstrdup failureMikulas Patocka1-2/+7
Check out-of-memory failure of the kstrdup option. Note that the argument "arg" may be NULL (in that case kstrup returns NULL), so out of memory condition happened if arg was non-NULL and kstrdup returned NULL. The patch also changes the call to replace_mount_options - if we didn't pass any filesystem-specific options, we don't call replace_mount_options (thus we don't erase existing reported options). Note that to properly report options after remount, the reiserfs filesystem should implement the show_options method. Without the show_options method, options changed with remount replace existing options. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
2016-02-09quota: Add support for ->get_nextdqblk() for VFS quotaJan Kara1-0/+1
Add infrastructure for supporting get_nextdqblk() callback for VFS quotas. Translate the operation into a callback to appropriate filesystem and consequently to quota format callback. Signed-off-by: Jan Kara <jack@suse.cz>
2016-01-21reiserfs: fix dereference of ERR_PTRSudip Mukherjee1-1/+1
reiserfs_iget() returns either NULL or error code in ERR_PTR. And we were only checking for NULL, so in case of some other error we will try to dereference the ERR_PTR(-errno) thinking it to be a valid pointer. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14kmemcg: account certain kmem allocations to memcgVladimir Davydov1-1/+2
Mark those kmem allocations that are known to be easily triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to memcg. For the list, see below: - threadinfo - task_struct - task_delay_info - pid - cred - mm_struct - vm_area_struct and vm_region (nommu) - anon_vma and anon_vma_chain - signal_struct - sighand_struct - fs_struct - files_struct - fdtable and fdtable->full_fds_bits - dentry and external_name - inode for all filesystems. This is the most tedious part, because most filesystems overwrite the alloc_inode method. The list is far from complete, so feel free to add more objects. Nevertheless, it should be close to "account everything" approach and keep most workloads within bounds. Malevolent users will be able to breach the limit, but this was possible even with the former "account everything" approach (simply because it did not account everything in fact). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04fs: create and use seq_show_option for escapingKees Cook1-3/+5
Many file systems that implement the show_options hook fail to correctly escape their output which could lead to unescaped characters (e.g. new lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This could lead to confusion, spoofed entries (resulting in things like systemd issuing false d-bus "mount" notifications), and who knows what else. This looks like it would only be the root user stepping on themselves, but it's possible weird things could happen in containers or in other situations with delegated mount privileges. Here's an example using overlay with setuid fusermount trusting the contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use of "sudo" is something more sneaky: $ BASE="ovl" $ MNT="$BASE/mnt" $ LOW="$BASE/lower" $ UP="$BASE/upper" $ WORK="$BASE/work/ 0 0 none /proc fuse.pwn user_id=1000" $ mkdir -p "$LOW" "$UP" "$WORK" $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt $ cat /proc/mounts none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0 none /proc fuse.pwn user_id=1000 0 0 $ fusermount -u /proc $ cat /proc/mounts cat: /proc/mounts: No such file or directory This fixes the problem by adding new seq_show_option and seq_show_option_n helpers, and updating the vulnerable show_option handlers to use them as needed. Some, like SELinux, need to be open coded due to unusual existing escape mechanisms. [akpm@linux-foundation.org: add lost chunk, per Kees] [keescook@chromium.org: seq_show_option should be using const parameters] Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Jan Kara <jack@suse.com> Acked-by: Paul Moore <paul@paul-moore.com> Cc: J. R. Okajima <hooanon05g@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-26Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-2/+1
Merge second patchbomb from Andrew Morton: - most of the rest of MM - lots of misc things - procfs updates - printk feature work - updates to get_maintainer, MAINTAINERS, checkpatch - lib/ updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (96 commits) exit,stats: /* obey this comment */ coredump: add __printf attribute to cn_*printf functions coredump: use from_kuid/kgid when formatting corename fs/reiserfs: remove unneeded cast NILFS2: support NFSv2 export fs/befs/btree.c: remove unneeded initializations fs/minix: remove unneeded cast init/do_mounts.c: add create_dev() failure log kasan: remove duplicate definition of the macro KASAN_FREE_PAGE fs/efs: femove unneeded cast checkpatch: emit "NOTE: <types>" message only once after multiple files checkpatch: emit an error when there's a diff in a changelog checkpatch: validate MODULE_LICENSE content checkpatch: add multi-line handling for PREFER_ETHER_ADDR_COPY checkpatch: suggest using eth_zero_addr() and eth_broadcast_addr() checkpatch: fix processing of MEMSET issues checkpatch: suggest using ether_addr_equal*() checkpatch: avoid NOT_UNIFIED_DIFF errors on cover-letter.patch files checkpatch: remove local from codespell path checkpatch: add --showfile to allow input via pipe to show filenames ...
2015-06-25fs/reiserfs: remove unneeded castFiro Yang1-2/+1
kmem_cache_alloc() returns void*. Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-02writeback: separate out include/linux/backing-dev-defs.hTejun Heo1-0/+1
With the planned cgroup writeback support, backing-dev related declarations will be more widely used across block and cgroup; unfortunately, including backing-dev.h from include/linux/blkdev.h makes cyclic include dependency quite likely. This patch separates out backing-dev-defs.h which only has the essential definitions and updates blkdev.h to include it. c files which need access to more backing-dev details now include backing-dev.h directly. This takes backing-dev.h off the common include dependency chain making it a lot easier to use it across block and cgroup. v2: fs/fat build failure fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-04-26Merge branch 'for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-15VFS: normal filesystems (and lustre): d_inode() annotationsDavid Howells1-2/+2
that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-04quota: Make VFS quotas use new interface for getting quota infoJan Kara1-1/+1
Create new internal interface for getting information about quota which contains everything needed for both VFS quotas and XFS quotas. Make VFS use this and hook it up to Q_GETINFO. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2014-12-12reiserfs: destroy allocated commit workqueueJiri Slaby1-0/+3
When resirefs is trying to mount a partition, it creates a commit workqueue (sbi->commit_wq). But when mount fails later, the workqueue is not freed. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-by: auxsvr@gmail.com Reported-by: Benoît Monin <benoit.monin@gmx.fr> Cc: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org # >= 3.16 Cc: reiserfs-devel@vger.kernel.org Fixes: 797d9016ceca69879bb273218810fa0beef46aac Signed-off-by: Jan Kara <jack@suse.cz>
2014-11-10reiserfs: Convert to private i_dquot fieldJan Kara1-0/+11
CC: reiserfs-devel@vger.kernel.org CC: Jeff Mahoney <jeffm@suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2014-09-17reiserfs: Don't use MAXQUOTAS valueJan Kara1-8/+8
MAXQUOTAS value defines maximum number of quota types VFS supports. This isn't necessarily the number of types reiserfs supports and with addition of project quotas these two numbers stop matching. So make reiserfs use its private definition. CC: reiserfs-devel@vger.kernel.org CC: Jeff Mahoney <jeffm@suse.de> Signed-off-by: Jan Kara <jack@suse.cz>
2014-08-13Merge branch 'for_linus' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota, reiserfs, UDF updates from Jan Kara: "Scalability improvements for quota, a few reiserfs fixes, and couple of misc cleanups (udf, ext2)" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: reiserfs: Fix use after free in journal teardown reiserfs: fix corruption introduced by balance_leaf refactor udf: avoid redundant memcpy when writing data in ICB fs/udf: re-use hex_asc_upper_{hi,lo} macros fs/quota: kernel-doc warning fixes udf: use linux/uaccess.h fs/ext2/super.c: Drop memory allocation cast quota: remove dqptr_sem quota: simplify remove_inode_dquot_ref() quota: avoid unnecessary dqget()/dqput() calls quota: protect Q_GETFMT by dqonoff_mutex
2014-08-12reiserfs: Fix use after free in journal teardownJan Kara1-1/+5
If do_journal_release() races with do_journal_end() which requeues delayed works for transaction flushing, we can leave work items for flushing outstanding transactions queued while freeing them. That results in use after free and possible crash in run_timers_softirq(). Fix the problem by not requeueing works if superblock is being shut down (MS_ACTIVE not set) and using cancel_delayed_work_sync() in do_journal_release(). CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>