summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2018-01-24ovl: encode pure upper file handlesAmir Goldstein3-1/+106
Encode overlay file handles as struct ovl_fh containing the file handle encoding of the real upper inode. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24vfs: factor out helpers d_instantiate_anon() and d_alloc_anon()Miklos Szeredi1-31/+56
Those helpers are going to be used by overlayfs to implement NFS export decode. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: store 'has_upper' and 'opaque' as bit flagsAmir Goldstein5-20/+41
We need to make some room in struct ovl_entry to store information about redirected ancestors for NFS export, so cram two booleans as bit flags. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: copy up of disconnected dentriesAmir Goldstein3-19/+48
With NFS export, some operations on decoded file handles (e.g. open, link, setattr, xattr_set) may call copy up with a disconnected non-dir. In this case, we will copy up lower inode to index dir without linking it to upper dir. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: use d_splice_alias() in place of d_add() in lookupAmir Goldstein1-3/+1
This is required for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: do not pass overlay dentry to ovl_get_inode()Amir Goldstein3-12/+12
This is needed for using ovl_get_inode() for decoding file handles for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: factor out ovl_get_index_fh() helperAmir Goldstein2-10/+50
The helper is needed to lookup an index by file handle for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: whiteout orphan index entries on mountAmir Goldstein2-4/+40
Orphan index entries are non-dir index entries whose union nlink count dropped to zero. With index=on, orphan index entries are removed on mount. With NFS export feature enabled, orphan index entries are replaced with white out index entries to block future open by handle from opening the lower file. When dir index has a stale 'upper' xattr, we assume that the upper dir was removed and we treat the dir index as orphan entry that needs to be whited out or removed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: whiteout index when union nlink drops to zeroAmir Goldstein3-29/+48
With NFS export feature enabled, when overlay inode nlink drops to zero, instead of removing the index entry, replace it with a whiteout index entry. This is needed for NFS export in order to prevent future open by handle from opening the lower file directly. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: cleanup dir index when dir nlink drops to zeroAmir Goldstein1-3/+3
When non-dir index union nlink drops to zero the non-dir index is cleaned. Do the same for directory type index entries when union directory is removed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: index directories on copy up for NFS exportAmir Goldstein2-7/+117
With the NFS export feature enabled, all dirs are indexed on copy up. Non-dir files are copied up directly to indexdir and then hardlinked to upper dir. Directories are copied up to indexdir, then an index entry is created in indexdir with 'upper' xattr pointing to the copied up dir and then the copied up dir is moved to upper dir. Directory index is also used for consistency verification, like detecting multiple redirected dirs to the same lower dir on lookup. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: index all non-dir on copy up for NFS exportAmir Goldstein1-0/+4
With the NFS export feature enabled, all non-dir are indexed on copy up. The copy up origin inode of an indexed non-dir can be used as a unique identifier of the overlay object. The full index is also used for consistency verfication, like detecting multiple non-hardlink uppers with the same 'origin' on lookup. Directory index on copy up will be implemented by following patch. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: create ovl_need_index() helperAmir Goldstein3-9/+22
The helper determines which lower file needs to be indexed on copy up and before nlink changes. For index=on, the helper evaluates to true for lower hardlinks. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: cleanup temp index entriesAmir Goldstein1-0/+12
A previous failed attempt to create or whiteout a directory index may leave index entries named '#%x' in the index dir. Cleanup those temp entries on mount instead of failing the mount. In the future, we may drop 'work' dir and use 'index' dir instead. This change is enough for cleaning up copy up leftovers 'from the future', but it is not enough for cleaning up rmdir leftovers 'from the future' (i.e. temp dir containing whiteouts). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: verify directory index entries on mountAmir Goldstein2-32/+94
Directory index entries should have 'upper' xattr pointing to the real upper dir. Verifying that the upper dir file handle is not stale is expensive, so only verify stale directory index entries on mount if NFS export feature is enabled. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: verify whiteout index entries on mountAmir Goldstein1-8/+13
Whiteout index entries are used as an indication that an exported overlay file handle should be treated as stale (i.e. after unlink of the overlay inode). Check on mount that whiteout index entries have a name that looks like a valid file handle and cleanup invalid index entries. For whiteout index entries, do not check that they also have valid origin fh and nlink xattr, because those xattr do not exist for a whiteout index entry. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: use directory index entries for consistency verificationAmir Goldstein3-11/+58
A directory index is a directory type entry in index dir with a "trusted.overlay.upper" xattr containing an encoded ovl_fh of the merge directory upper dir inode. On lookup of non-dir files, lower file is followed by origin file handle. On lookup of dir entries, lower dir is found by name and then compared to origin file handle. We only trust dir index if we verified that lower dir matches origin file handle, otherwise index may be inconsistent and we ignore it. If we find an indexed non-upper dir or an indexed merged dir, whose index 'upper' xattr points to a different upper dir, that means that the lower directory may be also referenced by another upper dir via redirect, so we fail the lookup on inconsistency error. To be consistent with directory index entries format, the association of index dir to upper root dir, that was stored by older kernels in "trusted.overlay.origin" xattr is now stored in "trusted.overlay.upper" xattr. This also serves as an indication that overlay was mounted with a kernel that support index directory entries. For backward compatibility, if an 'origin' xattr exists on the index dir we also verify it on mount. Directory index entries are going to be used for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: unbless lower st_ino of unverified originAmir Goldstein1-4/+12
On a malformed overlay, several redirected dirs can point to the same dir on a lower layer. This presents a similar challenge as broken hardlinks, because different objects in the overlay can return the same st_ino/st_dev pair from stat(2). For broken hardlinks, we do not provide constant st_ino on copy up to avoid this inconsistency. When NFS export feature is enabled, apply the same logic to files and directories with unverified lower origin. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: verify stored origin fh matches lower dirAmir Goldstein1-0/+12
When the NFS export feature is enabled, overlayfs implicitly enables the feature "verify_lower". When the "verify_lower" feature is enabled, a directory inode found in lower layer by name or by redirect_dir is verified against the file handle of the copy up origin that is stored in the upper layer. This introduces a change of behavior for the case of lower layer modification while overlay is offline. A lower directory created or moved offline under an exisitng upper directory, will not be merged with that upper directory. The NFS export feature should not be used after copying layers, because the new lower directory inodes would fail verification and won't be merged with upper directories. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: add support for "nfs_export" configurationAmir Goldstein5-5/+85
Introduce the "nfs_export" config, module and mount options. The NFS export feature depends on the "index" feature and enables two implicit overlayfs features: "index_all" and "verify_lower". The "index_all" feature creates an index on copy up of every file and directory. The "verify_lower" feature uses the full index to detect overlay filesystems inconsistencies on lookup, like redirect from multiple upper dirs to the same lower dir. NFS export can be enabled for non-upper mount with no index. However, because lower layer redirects cannot be verified with the index, enabling NFS export support on an overlay with no upper layer requires turning off redirect follow (e.g. "redirect_dir=nofollow"). The full index may incur some overhead on mount time, especially when verifying that lower directory file handles are not stale. NFS export support, full index and consistency verification will be implemented by following patches. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: update documentation of inodes index featureAmir Goldstein1-6/+3
Document that inode index feature solves breaking hard links on copy up. Simplify Kconfig backward compatibility disclaimer. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: generalize ovl_verify_origin() and helpersAmir Goldstein4-30/+38
Remove the "origin" language from the functions that handle set, get and verify of "origin" xattr and pass the xattr name as an argument. The same helpers are going to be used for NFS export to get, get and verify the "upper" xattr for directory index entries. ovl_verify_origin() is now a helper used only to verify non upper file handle stored in "origin" xattr of upper inode. The upper root dir file handle is still stored in "origin" xattr on the index dir for backward compatibility. This is going to be changed by the patch that adds directory index entries support. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: simplify arguments to ovl_check_origin_fh()Amir Goldstein4-30/+24
Pass the fs instance with lower_layers array instead of the dentry lowerstack array to ovl_check_origin_fh(), because the dentry members of lowerstack play no role in this helper. This change simplifies the argument list of ovl_check_origin(), ovl_cleanup_index() and ovl_verify_index(). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: factor out ovl_check_origin_fh()Amir Goldstein1-50/+92
Re-factor ovl_check_origin() and ovl_get_origin(), so origin fh xattr is read from upper inode only once during lookup with multiple lower layers and only once when verifying index entry origin. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: store layer index in ovl_layerAmir Goldstein3-16/+4
Store the fs root layer index inside ovl_layer struct, so we can get the root fs layer index from merge dir lower layer instead of find it with ovl_find_layer() helper. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: force r/o mount when index dir creation failsAmir Goldstein1-3/+9
When work dir creation fails, a warning is emitted and overlay is mounted r/o. Trying to remount r/w will fail with no work dir. When index dir creation fails, the same warning is emitted and overlay is mounted r/o, but trying to remount r/w will succeed. This may cause unintentional corruption of filesystem consistency. Adjust the behavior of index dir creation failure to that of work dir creation failure and do not allow to remount r/w. User needs to state an explicitly intention to work without an index by mounting with option 'index=off' to allow r/w mount with no index dir. When mounting with option 'index=on' and no 'upperdir', index is implicitly disabled, so do not warn about no file handle support. The issue was introduced with inodes index feature in v4.13, but this patch will not apply cleanly before ovl_fill_super() re-factoring in v4.15. Fixes: 02bcd1577400 ("ovl: introduce the inodes index dir feature") Cc: <stable@vger.kernel.org> #v4.13 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: disable index when no xattr supportAmir Goldstein1-1/+2
Overlayfs falls back to index=off if lower/upper fs does not support file handles. Do the same if upper fs does not support xattr. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: fix inconsistent d_ino for legacy merge dirAmir Goldstein3-2/+37
For a merge dir that was copied up before v4.12 or that was hand crafted offline (e.g. mkdir {upper/lower}/dir), upper dir does not contain the 'trusted.overlay.origin' xattr. In that case, stat(2) on the merge dir returns the lower dir st_ino, but getdents(2) returns the upper dir d_ino. After this change, on merge dir lookup, missing origin xattr on upper dir will be fixed and 'impure' xattr will be fixed on parent of the legacy merge dir. Suggested-by: zhangyi (F) <yi.zhang@huawei.com> Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-19ovl: take mnt_want_write() for removing impure xattrAmir Goldstein1-2/+9
The optimization in ovl_cache_get_impure() that tries to remove an unneeded "impure" xattr needs to take mnt_want_write() on upper fs. Fixes: 4edb83bb1041 ("ovl: constant d_ino for non-merge dirs") Cc: <stable@vger.kernel.org> #v4.14 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-19ovl: take mnt_want_write() for work/index dir setupAmir Goldstein1-8/+17
There are several write operations on upper fs not covered by mnt_want_write(): - test set/remove OPAQUE xattr - test create O_TMPFILE - set ORIGIN xattr in ovl_verify_origin() - cleanup of index entries in ovl_indexdir_cleanup() Some of these go way back, but this patch only applies over the v4.14 re-factoring of ovl_fill_super(). Cc: <stable@vger.kernel.org> #v4.14 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-19ovl: fix another overlay: warning prefixAmir Goldstein1-1/+2
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-19ovl: take lower dir inode mutex outside upper sb_writers lockAmir Goldstein3-65/+58
The functions ovl_lower_positive() and ovl_check_empty_dir() both take inode mutex on the real lower dir under ovl_want_write() which takes the upper_mnt sb_writers lock. While this is not a clear locking order or layering violation, it creates an undesired lock dependency between two unrelated layers for no good reason. This lock dependency materializes to a false(?) positive lockdep warning when calling rmdir() on a nested overlayfs, where both nested and underlying overlayfs both use the same fs type as upper layer. rmdir() on the nested overlayfs creates the lock chain: sb_writers of upper_mnt (e.g. tmpfs) in ovl_do_remove() ovl_i_mutex_dir_key[] of lower overlay dir in ovl_lower_positive() rmdir() on the underlying overlayfs creates the lock chain in reverse order: ovl_i_mutex_dir_key[] of lower overlay dir in vfs_rmdir() sb_writers of nested upper_mnt (e.g. tmpfs) in ovl_do_remove() To rid of the unneeded locking dependency, move both ovl_lower_positive() and ovl_check_empty_dir() to before ovl_want_write() in rmdir() and rename() implementation. This change spreads the pieces of ovl_check_empty_and_clear() directly inside the rmdir()/rename() implementations so the helper is no longer needed and removed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-19ovl: fix failure to fsync lower dirAmir Goldstein1-1/+5
As a writable mount, it is not expected for overlayfs to return EINVAL/EROFS for fsync, even if dir/file is not changed. This commit fixes the case of fsync of directory, which is easier to address, because overlayfs already implements fsync file operation for directories. The problem reported by Raphael is that new PostgreSQL 10.0 with a database in overlayfs where lower layer in squashfs fails to start. The failure is due to fsync error, when PostgreSQL does fsync on all existing db directories on startup and a specific directory exists lower layer with no changes. Reported-by: Raphael Hertzog <raphael@ouaza.com> Cc: <stable@vger.kernel.org> # v3.18 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Tested-by: Raphaël Hertzog <hertzog@debian.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-19ovl: hash directory inodes for fsnotifyAmir Goldstein3-13/+31
fsnotify pins a watched directory inode in cache, but if directory dentry is released, new lookup will allocate a new dentry and a new inode. Directory events will be notified on the new inode, while fsnotify listener is watching the old pinned inode. Hash all directory inodes to reuse the pinned inode on lookup. Pure upper dirs are hashes by real upper inode, merge and lower dirs are hashed by real lower inode. The reference to lower inode was being held by the lower dentry object in the overlay dentry (oe->lowerstack[0]). Releasing the overlay dentry may drop lower inode refcount to zero. Add a refcount on behalf of the overlay inode to prevent that. As a by-product, hashing directory inodes also detects multiple redirected dirs to the same lower dir and uncovered redirected dir target on and returns -ESTALE on lookup. The reported issue dates back to initial version of overlayfs, but this patch depends on ovl_inode code that was introduced in kernel v4.13. Cc: <stable@vger.kernel.org> #v4.13 Reported-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Niklas Cassel <niklas.cassel@axis.com>
2018-01-06Merge branch 'for-linus' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: - untangle sys_close() abuses in xt_bpf - deal with register_shrinker() failures in sget() * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix "netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'" sget(): handle failures of register_shrinker() mm,vmscan: Make unregister_shrinker() no-op if register_shrinker() failed.
2018-01-05Merge tag 'for-4.15-rc7-tag' of ↵Linus Torvalds2-12/+34
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "We have two more fixes for 4.15, both aimed for stable. The leak fix is obvious, the second patch fixes a bug revealed by the refcount API, when it behaves differently than previous atomic_t and reports refs going from 0 to 1 in one case" * tag 'for-4.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes btrfs: Fix flush bio leak
2018-01-05Merge tag 'xfs-4.15-fixes-10' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds3-19/+33
Pull XFS fixes from Darrick Wong: "I have just a few fixes for bugs and resource cleanup problems this week: - Fix resource cleanup of failed quota initialization - Fix integer overflow problems wrt s_maxbytes" * tag 'xfs-4.15-fixes-10' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix s_maxbytes overflow problems xfs: quota: check result of register_shrinker() xfs: quota: fix missed destroy of qi_tree_lock
2018-01-04userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK failsAndrea Arcangeli1-2/+18
The previous fix in commit 384632e67e08 ("userfaultfd: non-cooperative: fix fork use after free") corrected the refcounting in case of UFFD_EVENT_FORK failure for the fork userfault paths. That still didn't clear the vma->vm_userfaultfd_ctx of the vmas that were set to point to the aborted new uffd ctx earlier in dup_userfaultfd. Link: http://lkml.kernel.org/r/20171223002505.593-2-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-03Merge branch 'afs-fixes' of ↵Linus Torvalds4-12/+39
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull afs/fscache fixes from David Howells: - Fix the default return of fscache_maybe_release_page() when a cache isn't in use - it prevents a filesystem from releasing pages. This can cause a system to OOM. - Fix a potential uninitialised variable in AFS. - Fix AFS unlink's handling of the nlink count. It needs to use the nlink manipulation functions so that inode structs of deleted inodes actually get scheduled for destruction. - Fix error handling in afs_write_end() so that the page gets unlocked and put if we can't fill the unwritten portion. * 'afs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix missing error handling in afs_write_end() afs: Fix unlink afs: Potential uninitialized variable in afs_extract_data() fscache: Fix the default for fscache_maybe_release_page()
2018-01-03exec: Weaken dumpability for secureexecKees Cook1-2/+7
This is a logical revert of commit e37fdb785a5f ("exec: Use secureexec for setting dumpability") This weakens dumpability back to checking only for uid/gid changes in current (which is useless), but userspace depends on dumpability not being tied to secureexec. https://bugzilla.redhat.com/show_bug.cgi?id=1528633 Reported-by: Tom Horsley <horsley1953@gmail.com> Fixes: e37fdb785a5f ("exec: Use secureexec for setting dumpability") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-02xfs: fix s_maxbytes overflow problemsDarrick J. Wong2-3/+3
Fix some integer overflow problems if offset + count happen to be large enough to cause an integer overflow. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-01-02xfs: quota: check result of register_shrinker()Aliaksei Karaliou1-16/+29
xfs_qm_init_quotainfo() does not check result of register_shrinker() which was tagged as __must_check recently, reported by sparse. Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com> [darrick: move xfs_qm_destroy_quotainos nearer xfs_qm_init_quotainos] Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-02xfs: quota: fix missed destroy of qi_tree_lockAliaksei Karaliou1-0/+1
xfs_qm_destroy_quotainfo() does not destroy quotainfo->qi_tree_lock while destroys quotainfo->qi_quotaofflock. Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-02btrfs: fix refcount_t usage when deleting btrfs_delayed_nodesChris Mason1-11/+34
refcounts have a generic implementation and an asm optimized one. The generic version has extra debugging to make sure that once a refcount goes to zero, refcount_inc won't increase it. The btrfs delayed inode code wasn't expecting this, and we're tripping over the warnings when the generic refcounts are used. We ended up with this race: Process A Process B btrfs_get_delayed_node() spin_lock(root->inode_lock) radix_tree_lookup() __btrfs_release_delayed_node() refcount_dec_and_test(&delayed_node->refs) our refcount is now zero refcount_add(2) <--- warning here, refcount unchanged spin_lock(root->inode_lock) radix_tree_delete() With the generic refcounts, we actually warn again when process B above tries to release his refcount because refcount_add() turned into a no-op. We saw this in production on older kernels without the asm optimized refcounts. The fix used here is to use refcount_inc_not_zero() to detect when the object is in the middle of being freed and return NULL. This is almost always the right answer anyway, since we usually end up pitching the delayed_node if it didn't have fresh data in it. This also changes __btrfs_release_delayed_node() to remove the extra check for zero refcounts before radix tree deletion. btrfs_get_delayed_node() was the only path that was allowing refcounts to go from zero to one. Fixes: 6de5f18e7b0da ("btrfs: fix refcount_t usage when deleting btrfs_delayed_node") CC: <stable@vger.kernel.org> # 4.12+ Signed-off-by: Chris Mason <clm@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-02btrfs: Fix flush bio leakNikolay Borisov1-1/+0
Commit e0ae99941423 ("btrfs: preallocate device flush bio") reworked the way the flush bio is allocated and used. Concretely it allocates the bio in __alloc_device and then re-uses it multiple times with a very simple endio routine that just calls complete() without consuming a reference. Allocated bios by default come with a ref count of 1, which is then consumed by the endio routine (or not, in which case they should be bio_put by the caller). The way the impleementation works now is that the flush bio has a refcount of 2 and we only ever bio_put it once, leaving it to hang indefinitely. Fix this by removing the extra bio_get in __alloc_device. Fixes: e0ae99941423 ("btrfs: preallocate device flush bio") Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-02afs: Fix missing error handling in afs_write_end()David Howells1-3/+5
afs_write_end() is missing page unlock and put if afs_fill_page() fails. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com>
2018-01-02afs: Fix unlinkDavid Howells2-8/+33
Repeating creation and deletion of a file on an afs mount will run the box out of memory, e.g.: dd if=/dev/zero of=/afs/scratch/m0 bs=$((1024*1024)) count=512 rm /afs/scratch/m0 The problem seems to be that it's not properly decrementing the nlink count so that the inode can be scrapped. Note that this doesn't fix local creation followed by remote deletion. That's harder to handle and will require a separate patch as we're not told that the file has been deleted - only that the directory has changed. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
2018-01-02afs: Potential uninitialized variable in afs_extract_data()Dan Carpenter1-1/+1
Smatch warns that: fs/afs/rxrpc.c:922 afs_extract_data() error: uninitialized symbol 'remote_abort'. Smatch is right that "remote_abort" might be uninitialized when we pass it to afs_set_call_complete(). I don't know if that function uses the uninitialized variable. Anyway, the comment for rxrpc_kernel_recv_data(), says that "*_abort should also be initialised to 0." and this patch does that. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-22Merge tag 'xfs-4.15-fixes-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds19-97/+258
Pull xfs fixes from Darrick Wong: "Here are some XFS fixes for 4.15-rc5. Apologies for the unusually large number of patches this late, but I wanted to make sure the corruption fixes were really ready to go. Changes since last update: - Fix a locking problem during xattr block conversion that could lead to the log checkpointing thread to try to write an incomplete buffer to disk, which leads to a corruption shutdown - Fix a null pointer dereference when removing delayed allocation extents - Remove post-eof speculative allocations when reflinking a block past current inode size so that we don't just leave them there and assert on inode reclaim - Relax an assert which didn't accurately reflect the way locking works and would trigger under heavy io load - Avoid infinite loop when cancelling copy on write extents after a writeback failure - Try to avoid copy on write transaction reservation overflows when remapping after a successful write - Fix various problems with the copy-on-write reservation automatic garbage collection not being cleaned up properly during a ro remount - Fix problems with rmap log items being processed in the wrong order, leading to corruption shutdowns - Fix problems with EFI recovery wherein the "remove any rmapping if present" mechanism wasn't actually doing anything, which would lead to corruption problems later when the extent is reallocated, leading to multiple rmaps for the same extent" * tag 'xfs-4.15-fixes-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: only skip rmap owner checks for unknown-owner rmap removal xfs: always honor OWN_UNKNOWN rmap removal requests xfs: queue deferred rmap ops for cow staging extent alloc/free in the right order xfs: set cowblocks tag for direct cow writes too xfs: remove leftover CoW reservations when remounting ro xfs: don't be so eager to clear the cowblocks tag on truncate xfs: track cowblocks separately in i_flags xfs: allow CoW remap transactions to use reserve blocks xfs: avoid infinite loop when cancelling CoW blocks after writeback failure xfs: relax is_reflink_inode assert in xfs_reflink_find_cow_mapping xfs: remove dest file's post-eof preallocations before reflinking xfs: move xfs_iext_insert tracepoint to report useful information xfs: account for null transactions in bunmapi xfs: hold xfs_buf locked between shortform->leaf conversion and the addition of an attribute xfs: add the ability to join a held buffer to a defer_ops
2017-12-21xfs: only skip rmap owner checks for unknown-owner rmap removalDarrick J. Wong1-24/+52
For rmap removal, refactor the rmap owner checks into a separate function, then skip the checks if we are performing an unknown-owner removal. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>