summaryrefslogtreecommitdiffstats
path: root/fs/overlayfs
AgeCommit message (Collapse)AuthorFilesLines
2022-05-30Merge tag 'ovl-update-5.19' of ↵Linus Torvalds11-324/+529
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs updates from Miklos Szeredi: - Support idmapped layers in overlayfs (Christian Brauner) - Add a fix to exportfs that is relevant to open_by_handle_at(2) as well - Introduce new lookup helpers that allow passing mnt_userns into inode_permission() * tag 'ovl-update-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: support idmapped layers ovl: handle idmappings in ovl_xattr_{g,s}et() ovl: handle idmappings in layer open helpers ovl: handle idmappings in ovl_permission() ovl: use ovl_copy_{real,upper}attr() wrappers ovl: store lower path in ovl_inode ovl: handle idmappings for layer lookup ovl: handle idmappings for layer fileattrs ovl: use ovl_path_getxattr() wrapper ovl: use ovl_lookup_upper() wrapper ovl: use ovl_do_notify_change() wrapper ovl: pass layer mnt to ovl_open_realfile() ovl: pass ofs to setattr operations ovl: handle idmappings in creation operations ovl: add ovl_upper_mnt_userns() wrapper ovl: pass ofs to creation operations ovl: use wrappers to all vfs_*xattr() calls exportfs: support idmapped mounts fs: add two trivial lookup helpers
2022-05-09VFS: add FMODE_CAN_ODIRECT file flagNeilBrown1-9/+4
Currently various places test if direct IO is possible on a file by checking for the existence of the direct_IO address space operation. This is a poor choice, as the direct_IO operation may not be used - it is only used if the generic_file_*_iter functions are called for direct IO and some filesystems - particularly NFS - don't do this. Instead, introduce a new f_mode flag: FMODE_CAN_ODIRECT and change the various places to check this (avoiding pointer dereferences). do_dentry_open() will set this flag if ->direct_IO is present, so filesystems do not need to be changed. NFS *is* changed, to set the flag explicitly and discard the direct_IO entry in the address_space_operations for files. Other filesystems which currently use noop_direct_IO could usefully be changed to set this flag instead. Link: https://lkml.kernel.org/r/164859778128.29473.15189737957277399416.stgit@noble.brown Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.de> Tested-by: David Howells <dhowells@redhat.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Hugh Dickins <hughd@google.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28ovl: support idmapped layersChristian Brauner2-5/+1
Now that overlay is able to take a layers idmapping into account allow overlay mounts to be created on top of idmapped mounts. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: handle idmappings in ovl_xattr_{g,s}et()Christian Brauner1-4/+6
When retrieving xattrs from the upper or lower layers take the relevant mount's idmapping into account. We rely on the previously introduced ovl_i_path_real() helper to retrieve the relevant path. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: handle idmappings in layer open helpersChristian Brauner2-4/+7
In earlier patches we already passed down the relevant upper or lower path to ovl_open_realfile(). Now let the open helpers actually take the idmapping of the relevant mount into account when checking permissions. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: handle idmappings in ovl_permission()Christian Brauner1-3/+6
Use the previously introduced ovl_i_path_real() helper to retrieve the relevant upper or lower path and take the mount's idmapping into account for the lower layer permission check. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: use ovl_copy_{real,upper}attr() wrappersChristian Brauner5-29/+47
When copying inode attributes from the upper or lower layer to ovl inodes we need to take the upper or lower layer's mount's idmapping into account. In a lot of places we call ovl_copyattr() only on upper inodes and in some we call it on either upper or lower inodes. Split this into two separate helpers. The first one should only be called on upper inodes and is thus called ovl_copy_upperattr(). The second one can be called on upper or lower inodes. We add ovl_copy_realattr() for this task. The new helper makes use of the previously added ovl_i_path_real() helper. This is needed to support idmapped base layers with overlay. When overlay copies the inode information from an upper or lower layer to the relevant overlay inode it will apply the idmapping of the upper or lower layer when doing so. The ovl inode ownership will thus always correctly reflect the ownership of the idmapped upper or lower layer. All idmapping helpers are nops when no idmapped base layers are used. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: store lower path in ovl_inodeAmir Goldstein5-8/+27
Create some ovl_i_* helpers to get real path from ovl inode. Instead of just stashing struct inode for the lower layer we stash struct path for the lower layer. The helpers allow to retrieve a struct path for the relevant upper or lower layer. This will be used when retrieving information based on struct inode when copying up inode attributes from upper or lower inodes to ovl inodes and when checking permissions in ovl_permission() in following patches. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: handle idmappings for layer lookupChristian Brauner3-11/+18
Make the two places where lookup helpers can be called either on lower or upper layers take the mount's idmapping into account. To this end we pass down the mount in struct ovl_lookup_data. It can later also be used to construct struct path for various other helpers. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: handle idmappings for layer fileattrsChristian Brauner1-1/+1
Take the upper mount's idmapping into account when setting fileattrs on the upper layer. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: use ovl_path_getxattr() wrapperChristian Brauner6-65/+111
Add a helper that allows to retrieve ovl xattrs from either lower or upper layers. To stop passing mnt and dentry separately everywhere use struct path which more accurately reflects the tight coupling between mount and dentry in this helper. Swich over all places to pass a path argument that can operate on either upper or lower layers. This is needed to support idmapped base layers with overlayfs. Some helpers are always called with an upper dentry, which is now utilized by these helpers to create the path. Make this usage explicit by renaming the argument to "upperdentry" and by renaming the function as well in some cases. Also add a check in ovl_do_getxattr() to catch misuse of these functions. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: use ovl_lookup_upper() wrapperChristian Brauner6-28/+37
Introduce ovl_lookup_upper() as a simple wrapper around lookup_one(). Make it clear in the helper's name that this only operates on the upper layer. The wrapper will take upper layer's idmapping into account when checking permission in lookup_one(). Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: use ovl_do_notify_change() wrapperChristian Brauner5-7/+37
Introduce ovl_do_notify_change() as a simple wrapper around notify_change() to support idmapped layers. The helper mirrors other ovl_do_*() helpers that operate on the upper layers. When changing ownership of an upper object the intended ownership needs to be mapped according to the upper layer's idmapping. This mapping is the inverse to the mapping applied when copying inode information from an upper layer to the corresponding overlay inode. So e.g., when an upper mount maps files that are stored on-disk as owned by id 1001 to 1000 this means that calling stat on this object from an idmapped mount will report the file as being owned by id 1000. Consequently in order to change ownership of an object in this filesystem so it appears as being owned by id 1000 in the upper idmapped layer it needs to store id 1001 on disk. The mnt mapping helpers take care of this. All idmapping helpers are nops when no idmapped base layers are used. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: pass layer mnt to ovl_open_realfile()Amir Goldstein3-9/+28
Ensure that ovl_open_realfile() takes the mount's idmapping into account. We add a new helper ovl_path_realdata() that can be used to easily retrieve the relevant path which we can pass down. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: pass ofs to setattr operationsChristian Brauner3-10/+13
Pass down struct ovl_fs to setattr operations so we can ultimately retrieve the relevant upper mount and take the mount's idmapping into account when creating new filesystem objects. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: handle idmappings in creation operationsChristian Brauner1-11/+11
When creating objects in the upper layer we need to pass down the upper idmapping into the respective vfs helpers in order to support idmapped base layers. The vfs helpers will take care of the rest. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: add ovl_upper_mnt_userns() wrapperChristian Brauner1-0/+5
Add a tiny wrapper to retrieve the upper mount's idmapping. Have it return the initial idmapping until we have prepared and converted all places to take the relevant idmapping into account. Then we can switch on idmapped layer support by having ovl_upper_mnt_userns() return the upper mount's idmapping. Suggested-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: pass ofs to creation operationsChristian Brauner6-97/+121
Pass down struct ovl_fs to all creation helpers so we can ultimately retrieve the relevant upper mount and take the mount's idmapping into account when creating new filesystem objects. This is needed to support idmapped base layers with overlay. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-04-28ovl: use wrappers to all vfs_*xattr() callsAmir Goldstein8-54/+75
Use helpers ovl_*xattr() to access user/trusted.overlay.* xattrs and use helpers ovl_do_*xattr() to access generic xattrs. This is a preparatory patch for using idmapped base layers with overlay. Note that a few of those places called vfs_*xattr() calls directly to reduce the amount of debug output. But as Miklos pointed out since overlayfs has been stable for quite some time the debug output isn't all that relevant anymore and the additional debug in all locations was actually quite helpful when developing this patch series. Cc: <linux-unionfs@vger.kernel.org> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-03-22fs: allocate inode by using alloc_inode_sb()Muchun Song1-1/+1
The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb(). Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4] Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Alex Shi <alexs@kernel.org> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kari Argillander <kari.argillander@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-01Merge tag 'ovl-fixes-5.17-rc3' of ↵Linus Torvalds1-3/+13
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: "Fix a regression introduced in v5.15, affecting copy up of files with 'noatime' or 'sync' attributes to a tmpfs upper layer" * tag 'ovl-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: don't fail copy up if no fileattr support on upper ovl: fix NULL pointer dereference in copy up warning
2022-01-14ovl: don't fail copy up if no fileattr support on upperMiklos Szeredi1-1/+11
Christoph Fritz is reporting that failure to copy up fileattr when upper doesn't support fileattr or xattr results in a regression. Return success in these failure cases; this reverts overlayfs to the old behavior. Add a pr_warn_once() in these cases to still let the user know about the copy up failures. Reported-by: Christoph Fritz <chf.fritz@googlemail.com> Fixes: 72db82115d2b ("ovl: copy up sync/noatime fileattr flags") Cc: <stable@vger.kernel.org> # v5.15 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-01-14ovl: fix NULL pointer dereference in copy up warningChristoph Fritz1-2/+2
This patch is fixing a NULL pointer dereference to get a recently introduced warning message working. Fixes: 5b0a414d06c3 ("ovl: fix filattr copy-up failure") Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com> Cc: <stable@vger.kernel.org> # v5.15 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-12-03fs: add is_idmapped_mnt() helperChristian Brauner1-1/+1
Multiple places open-code the same check to determine whether a given mount is idmapped. Introduce a simple helper function that can be used instead. This allows us to get rid of the fragile open-coding. We will later change the check that is used to determine whether a given mount is idmapped. Introducing a helper allows us to do this in a single place instead of doing it for multiple places. Link: https://lore.kernel.org/r/20211123114227.3124056-2-brauner@kernel.org (v1) Link: https://lore.kernel.org/r/20211130121032.3753852-2-brauner@kernel.org (v2) Link: https://lore.kernel.org/r/20211203111707.3901969-2-brauner@kernel.org Cc: Seth Forshee <sforshee@digitalocean.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> CC: linux-fsdevel@vger.kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Seth Forshee <sforshee@digitalocean.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-11-09Merge tag 'ovl-update-5.16' of ↵Linus Torvalds6-14/+46
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs updates from Miklos Szeredi: - Fix a regression introduced in the last cycle - Fix a use-after-free in the AIO path - Fix a bogus warning reported by syzbot * tag 'ovl-update-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: fix filattr copy-up failure ovl: fix warning in ovl_create_real() ovl: fix use after free in struct ovl_aio_req
2021-11-04ovl: fix filattr copy-up failureMiklos Szeredi2-6/+22
This regression can be reproduced with ntfs-3g and overlayfs: mkdir lower upper work overlay dd if=/dev/zero of=ntfs.raw bs=1M count=2 mkntfs -F ntfs.raw mount ntfs.raw lower touch lower/file.txt mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work - overlay mv overlay/file.txt overlay/file2.txt mv fails and (misleadingly) prints mv: cannot move 'overlay/file.txt' to a subdirectory of itself, 'overlay/file2.txt' The reason is that ovl_copy_fileattr() is triggered due to S_NOATIME being set on all inodes (by fuse) regardless of fileattr. ovl_copy_fileattr() tries to retrieve file attributes from lower file, but that fails because filesystem does not support this ioctl (this should fail with ENOTTY, but ntfs-3g return EINVAL instead). This failure is propagated to origial operation (in this case rename) that triggered the copy-up. The fix is to ignore ENOTTY and EINVAL errors from fileattr_get() in copy up. This also requires turning the internal ENOIOCTLCMD into ENOTTY. As a further measure to prevent unnecessary failures, only try the fileattr_get/set on upper if there are any flags to copy up. Side note: a number of filesystems set S_NOATIME (and sometimes other inode flags) irrespective of fileattr flags. This causes unnecessary calls during copy up, which might lead to a performance issue, especially if latency is high. To fix this, the kernel would need to differentiate between the two cases. E.g. introduce SB_NOATIME_UPDATE, a per-sb variant of S_NOATIME. SB_NOATIME doesn't work, because that's interpreted as "filesystem doesn't store an atime attribute" Reported-and-tested-by: Kevin Locke <kevin@kevinlocke.name> Fixes: 72db82115d2b ("ovl: copy up sync/noatime fileattr flags") Cc: <stable@vger.kernel.org> # v5.15 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-11-04ovl: fix warning in ovl_create_real()Miklos Szeredi3-6/+10
Syzbot triggered the following warning in ovl_workdir_create() -> ovl_create_real(): if (!err && WARN_ON(!newdentry->d_inode)) { The reason is that the cgroup2 filesystem returns from mkdir without instantiating the new dentry. Weird filesystems such as this will be rejected by overlayfs at a later stage during setup, but to prevent such a warning, call ovl_mkdir_real() directly from ovl_workdir_create() and reject this case early. Reported-and-tested-by: syzbot+75eab84fd0af9e8bf66b@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-29ovl: fix use after free in struct ovl_aio_reqyangerkun1-2/+14
Example for triggering use after free in a overlay on ext4 setup: aio_read ovl_read_iter vfs_iter_read ext4_file_read_iter ext4_dio_read_iter iomap_dio_rw -> -EIOCBQUEUED /* * Here IO is completed in a separate thread, * ovl_aio_cleanup_handler() frees aio_req which has iocb embedded */ file_accessed(iocb->ki_filp); /**BOOM**/ Fix by introducing a refcount in ovl_aio_req similarly to aio_kiocb. This guarantees that iocb is only freed after vfs_read/write_iter() returns on underlying fs. Fixes: 2406a307ac7d ("ovl: implement async IO routines") Signed-off-by: yangerkun <yangerkun@huawei.com> Link: https://lore.kernel.org/r/20210930032228.3199690-3-yangerkun@huawei.com/ Cc: <stable@vger.kernel.org> # v5.6 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-25fs: get rid of the res2 iocb->ki_complete argumentJens Axboe1-2/+2
The second argument was only used by the USB gadget code, yet everyone pays the overhead of passing a zero to be passed into aio, where it ends up being part of the aio res2 value. Now that everybody is passing in zero, kill off the extra argument. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-28ovl: fix IOCB_DIRECT if underlying fs doesn't support direct IOMiklos Szeredi1-1/+14
Normally the check at open time suffices, but e.g loop device does set IOCB_DIRECT after doing its own checks (which are not sufficent for overlayfs). Make sure we don't call the underlying filesystem read/write method with the IOCB_DIRECT if it's not supported. Reported-by: Huang Jianan <huangjianan@oppo.com> Fixes: 16914e6fc7e1 ("ovl: add ovl_read_iter()") Cc: <stable@vger.kernel.org> # v4.19 Tested-by: Huang Jianan <huangjianan@oppo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-09-24ovl: fix missing negative dentry check in ovl_rename()Zheng Liang1-3/+7
The following reproducer mkdir lower upper work merge touch lower/old touch lower/new mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merge rm merge/new mv merge/old merge/new & unlink upper/new may result in this race: PROCESS A: rename("merge/old", "merge/new"); overwrite=true,ovl_lower_positive(old)=true, ovl_dentry_is_whiteout(new)=true -> flags |= RENAME_EXCHANGE PROCESS B: unlink("upper/new"); PROCESS A: lookup newdentry in new_upperdir call vfs_rename() with negative newdentry and RENAME_EXCHANGE Fix by adding the missing check for negative newdentry. Signed-off-by: Zheng Liang <zhengliang6@huawei.com> Fixes: e9be9d5e76e3 ("overlay filesystem") Cc: <stable@vger.kernel.org> # v3.18 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-18ovl: enable RCU'd ->get_acl()Miklos Szeredi1-3/+4
Overlayfs does not cache ACL's (to avoid double caching). Instead it just calls the underlying filesystem's i_op->get_acl(), which will return the cached value, if possible. In rcu path walk, however, get_cached_acl_rcu() is employed to get the value from the cache, which will fail on overlayfs resulting in dropping out of rcu walk mode. This can result in a big performance hit in certain situations. Fix by calling ->get_acl() with rcu=true in case of ACL_DONT_CACHE (which indicates pass-through) Reported-by: garyhuang <zjh.20052005@163.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-18vfs: add rcu argument to ->get_acl() callbackMiklos Szeredi2-2/+5
Add a rcu argument to the ->get_acl() callback to allow get_cached_acl_rcu() to call the ->get_acl() method in the next patch. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup()chenying1-2/+4
If function ovl_instantiate() returns an error, ovl_cleanup will be called and try to remove newdentry from wdir, but the newdentry has been moved to udir at this time. This will causes BUG_ON(victim->d_parent->d_inode != dir) in fs/namei.c:may_delete. Signed-off-by: chenying <chenying.kernel@bytedance.com> Fixes: 01b39dcc9568 ("ovl: use inode_insert5() to hash a newly created inode") Link: https://lore.kernel.org/linux-unionfs/e6496a94-a161-dc04-c38a-d2544633acb4@bytedance.com/ Cc: <stable@vger.kernel.org> # v4.18 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: use kvalloc in xattr copy-upMiklos Szeredi1-4/+5
Extended attributes are usually small, but could be up to 64k in size, so use the most efficient method for doing the allocation. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: update ctime when changing fileattrChengguang Xu1-0/+3
Currently we keep size, mode and times of overlay inode as the same as upper inode, so should update ctime when changing file attribution as well. Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: skip checking lower file's i_writecount on truncateChengguang Xu1-6/+0
It is possible that a directory tree is shared between multiple overlay instances as a lower layer. In this case when one instance executes a file residing on the lower layer, the other instance denies a truncate(2) call on this file. This only happens for truncate(2) and not for open(2) with the O_TRUNC flag. Fix this interference and inconsistency by removing the preliminary i_writecount check before copy-up. This means that unlike on normal filesystems truncate(argv[0]) will now succeed. If this ever causes a regression in a real world use case this needs to be revisited. One way to fix this properly would be to keep a correct i_writecount in the overlay inode, but that is difficult due to memory mapping code only dealing with the real file/inode. Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: relax lookup error on mismatch origin ftypeAmir Goldstein1-1/+1
We get occasional reports of lookup errors due to mismatched origin ftype from users that re-format a lower squashfs image. Commit 13c6ad0f45fd ("ovl: document lower modification caveats") tries to discourage the practice of re-formating lower layers and describes the expected behavior as undefined. Commit b0e0f69731cd ("ovl: restrict lower null uuid for "xino=auto"") limits the configurations in which origin file handles are followed. In addition to these measures, change the behavior in case of detecting a mismatch origin ftype in lookup to issue a warning, not follow origin, but not fail the lookup operation either. That should make overall more users happy without any big consequences. Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxgPq9E9xxwU2CDyHy-_yCZZeymg+3n+-6AqkGGE1YtwvQ@mail.gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: do not set overlay.opaque for new directoriesVyacheslav Yurkov1-1/+3
Enable optimizations only if user opted-in for any of extended features. If optimization is enabled, it breaks existing use case when a lower layer directory appears after directory was created on a merged layer. If overlay.opaque is applied, new files on lower layer are not visible. Consider the following scenario: - /lower and /upper are mounted to /merged - directory /merged/new-dir is created with a file test1 - overlay is unmounted - directory /lower/new-dir is created with a file test2 - overlay is mounted again If opaque is applied by default, file test2 is not going to be visible without explicitly clearing the overlay.opaque attribute Signed-off-by: Vyacheslav Yurkov <Vyacheslav.Yurkov@bruker.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: add ovl_allow_offline_changes() helperVyacheslav Yurkov2-3/+13
Allows to check whether any of extended features are enabled Signed-off-by: Vyacheslav Yurkov <Vyacheslav.Yurkov@bruker.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: disable decoding null uuid with redirect_dirVyacheslav Yurkov1-1/+1
Currently decoding origin with lower null uuid is not allowed unless user opted-in to one of the new features that require following the lower inode of non-dir upper (index, xino, metacopy). Now we add redirect_dir too to that feature list. Signed-off-by: Vyacheslav Yurkov <Vyacheslav.Yurkov@bruker.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: consistent behavior for immutable/append-only inodesAmir Goldstein4-7/+158
When a lower file has immutable/append-only fileattr flags, the behavior of overlayfs post copy up is inconsistent. Immediattely after copy up, ovl inode still has the S_IMMUTABLE/S_APPEND inode flags copied from lower inode, so vfs code still treats the ovl inode as immutable/append-only. After ovl inode evict or mount cycle, the ovl inode does not have these inode flags anymore. We cannot copy up the immutable and append-only fileattr flags, because immutable/append-only inodes cannot be linked and because overlayfs will not be able to set overlay.* xattr on the upper inodes. Instead, if any of the fileattr flags of interest exist on the lower inode, we store them in overlay.protattr xattr on the upper inode and we read the flags from xattr on lookup and on fileattr_get(). This gives consistent behavior post copy up regardless of inode eviction from cache. When user sets new fileattr flags, we update or remove the overlay.protattr xattr. Storing immutable/append-only fileattr flags in an xattr instead of upper fileattr also solves other non-standard behavior issues - overlayfs can now copy up children of "ovl-immutable" directories and lower aliases of "ovl-immutable" hardlinks. Reported-by: Chengguang Xu <cgxu519@mykernel.net> Link: https://lore.kernel.org/linux-unionfs/20201226104618.239739-1-cgxu519@mykernel.net/ Link: https://lore.kernel.org/linux-unionfs/20210210190334.1212210-5-amir73il@gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: copy up sync/noatime fileattr flagsAmir Goldstein3-21/+89
When a lower file has sync/noatime fileattr flags, the behavior of overlayfs post copy up is inconsistent. Immediately after copy up, ovl inode still has the S_SYNC/S_NOATIME inode flags copied from lower inode, so vfs code still treats the ovl inode as sync/noatime. After ovl inode evict or mount cycle, the ovl inode does not have these inode flags anymore. To fix this inconsistency, try to copy the fileattr flags on copy up if the upper fs supports the fileattr_set() method. This gives consistent behavior post copy up regardless of inode eviction from cache. We cannot copy up the immutable/append-only inode flags in a similar manner, because immutable/append-only inodes cannot be linked and because overlayfs will not be able to set overlay.* xattr on the upper inodes. Those flags will be addressed by a followup patch. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17ovl: pass ovl_fs to ovl_check_setxattr()Amir Goldstein5-15/+16
Instead of passing the overlay dentry. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-10ovl: fix uninitialized pointer read in ovl_lookup_real_one()Miklos Szeredi1-1/+1
One error path can result in release_dentry_name_snapshot() being called before "name" was initialized by take_dentry_name_snapshot(). Fix by moving the release_dentry_name_snapshot() to immediately after the only use. Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-10ovl: fix deadlock in splice writeMiklos Szeredi1-1/+46
There's possibility of an ABBA deadlock in case of a splice write to an overlayfs file and a concurrent splice write to a corresponding real file. The call chain for splice to an overlay file: -> do_splice [takes sb_writers on overlay file] -> do_splice_from -> iter_file_splice_write [takes pipe->mutex] -> vfs_iter_write ... -> ovl_write_iter [takes sb_writers on real file] And the call chain for splice to a real file: -> do_splice [takes sb_writers on real file] -> do_splice_from -> iter_file_splice_write [takes pipe->mutex] Syzbot successfully bisected this to commit 82a763e61e2b ("ovl: simplify file splice"). Fix by reverting the write part of the above commit and by adding missing bits from ovl_write_iter() into ovl_splice_write(). Fixes: 82a763e61e2b ("ovl: simplify file splice") Reported-and-tested-by: syzbot+579885d1a9a833336209@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-10ovl: skip stale entries in merge dir cache iterationAmir Goldstein1-0/+5
On the first getdents call, ovl_iterate() populates the readdir cache with a list of entries, but for upper entries with origin lower inode, p->ino remains zero. Following getdents calls traverse the readdir cache list and call ovl_cache_update_ino() for entries with zero p->ino to lookup the entry in the overlay and return d_ino that is consistent with st_ino. If the upper file was unlinked between the first getdents call and the getdents call that lists the file entry, ovl_cache_update_ino() will not find the entry and fall back to setting d_ino to the upper real st_ino, which is inconsistent with how this object was presented to users. Instead of listing a stale entry with inconsistent d_ino, simply skip the stale entry, which is better for users. xfstest overlay/077 is failing without this patch. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/fstests/CAOQ4uxgR_cLnC_vdU5=seP3fwqVkuZM_-WfD6maFTMbMYq=a9w@mail.gmail.com/ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-30Merge tag 'ovl-update-5.13' of ↵Linus Torvalds8-67/+124
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs update from Miklos Szeredi: - Fix a regression introduced in 5.2 that resulted in valid overlayfs mounts being rejected with ELOOP (Too many levels of symbolic links) - Fix bugs found by various tools - Miscellaneous improvements and cleanups * tag 'ovl-update-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: add debug print to ovl_do_getxattr() ovl: invalidate readdir cache on changes to dir with origin ovl: allow upperdir inside lowerdir ovl: show "userxattr" in the mount data ovl: trivial typo fixes in the file inode.c ovl: fix misspellings using codespell tool ovl: do not copy attr several times ovl: remove ovl_map_dev_ino() return value ovl: fix error for ovl_fill_super() ovl: fix missing revert_creds() on error path ovl: fix leaked dentry ovl: restrict lower null uuid for "xino=auto" ovl: check that upperdir path is not on a read-only mount ovl: plumb through flush method
2021-04-27Merge branch 'miklos.fileattr' of ↵Linus Torvalds5-116/+82
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fileattr conversion updates from Miklos Szeredi via Al Viro: "This splits the handling of FS_IOC_[GS]ETFLAGS from ->ioctl() into a separate method. The interface is reasonably uniform across the filesystems that support it and gives nice boilerplate removal" * 'miklos.fileattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (23 commits) ovl: remove unneeded ioctls fuse: convert to fileattr fuse: add internal open/release helpers fuse: unsigned open flags fuse: move ioctl to separate source file vfs: remove unused ioctl helpers ubifs: convert to fileattr reiserfs: convert to fileattr ocfs2: convert to fileattr nilfs2: convert to fileattr jfs: convert to fileattr hfsplus: convert to fileattr efivars: convert to fileattr xfs: convert to fileattr orangefs: convert to fileattr gfs2: convert to fileattr f2fs: convert to fileattr ext4: convert to fileattr ext2: convert to fileattr btrfs: convert to fileattr ...
2021-04-27Merge branch 'work.inode-type-fixes' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs inode type handling updates from Al Viro: "We should never change the type bits of ->i_mode or the method tables (->i_op and ->i_fop) of a live inode. Unfortunately, not all filesystems took care to prevent that" * 'work.inode-type-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: spufs: fix bogosity in S_ISGID handling 9p: missing chunk of "fs/9p: Don't update file type when updating file attributes" openpromfs: don't do unlock_new_inode() until the new inode is set up hostfs_mknod(): don't bother with init_special_inode() cifs: have cifs_fattr_to_inode() refuse to change type on live inode cifs: have ->mkdir() handle race with another client sanely do_cifs_create(): don't set ->i_mode of something we had not created gfs2: be careful with inode refresh ocfs2_inode_lock_update(): make sure we don't change the type bits of i_mode orangefs_inode_is_stale(): i_mode type bits do *not* form a bitmap... vboxsf: don't allow to change the inode type afs: Fix updating of i_mode due to 3rd party change ceph: don't allow type or device number to change on non-I_NEW inodes ceph: fix up error handling with snapdirs new helper: inode_wrong_type()