summaryrefslogtreecommitdiffstats
path: root/fs/overlayfs/namei.c
AgeCommit message (Collapse)AuthorFilesLines
2018-10-26ovl: fix error handling in ovl_verify_set_fh()Amir Goldstein1-1/+3
We hit a BUG on kfree of an ERR_PTR()... Reported-by: syzbot+ff03fe05c717b82502d0@syzkaller.appspotmail.com Fixes: 8b88a2e64036 ("ovl: verify upper root dir matches lower root dir") Cc: <stable@vger.kernel.org> # v4.13 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-10-04ovl: fix access beyond unterminated stringsAmir Goldstein1-1/+1
KASAN detected slab-out-of-bounds access in printk from overlayfs, because string format used %*s instead of %.*s. > BUG: KASAN: slab-out-of-bounds in string+0x298/0x2d0 lib/vsprintf.c:604 > Read of size 1 at addr ffff8801c36c66ba by task syz-executor2/27811 > > CPU: 0 PID: 27811 Comm: syz-executor2 Not tainted 4.19.0-rc5+ #36 ... > printk+0xa7/0xcf kernel/printk/printk.c:1996 > ovl_lookup_index.cold.15+0xe8/0x1f8 fs/overlayfs/namei.c:689 Reported-by: syzbot+376cea2b0ef340db3dd4@syzkaller.appspotmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 359f392ca53e ("ovl: lookup index entry for copy up origin") Cc: <stable@vger.kernel.org> # v4.13
2018-07-20ovl: Check redirect on index as wellVivek Goyal1-37/+13
Right now we seem to check redirect only if upperdentry is found. But it is possible that there is no upperdentry but later we found an index. We need to check redirect on index as well and set it in ovl_inode->redirect. Otherwise link code can assume that dentry does not have redirect and place a new one which breaks things. In my testing overlay/033 test started failing in xfstests. Following are the details. For example do following. $ mkdir lower upper work merged - Make lower dir with 4 links. $ echo "foo" > lower/l0.txt $ ln lower/l0.txt lower/l1.txt $ ln lower/l0.txt lower/l2.txt $ ln lower/l0.txt lower/l3.txt - Mount with index on and metacopy on. $ mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work,\ index=on,metacopy=on none merged - Link lower $ ln merged/l0.txt merged/l4.txt (This will metadata copy up of l0.txt and put an absolute redirect /l0.txt) $ echo 2 > /proc/sys/vm/drop/caches $ ls merged/l1.txt (Now l1.txt will be looked up. There is no upper dentry but there is lower dentry and index will be found. We don't check for redirect on index, hence ovl_inode->redirect will be NULL.) - Link Upper $ ln merged/l4.txt merged/l5.txt (Lookup of l4.txt will use inode from l1.txt lookup which is still in cache. It has ovl_inode->redirect NULL, hence link will put a new redirect and replace /l0.txt with /l4.txt - Drop caches. echo 2 > /proc/sys/vm/drop_caches - List l1.txt and it returns -ESTALE $ ls merged/l0.txt (It returns stale because, we found a metacopy of l0.txt in upper and it has redirect l4.txt but there is no file named l4.txt in lower layer. So lower data copy is not found and -ESTALE is returned.) So problem here is that we did not process redirect on index. Check redirect on index as well and then problem is fixed. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-20ovl: Check redirects for metacopy filesVivek Goyal1-2/+2
Right now we rely on path based lookup for data origin of metacopy upper. This will work only if upper has not been renamed. We solved this problem already for merged directories using redirect. Use same logic for metacopy files. This patch just goes on to check redirects for metacopy files. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-20ovl: Move some dir related ovl_lookup_single() code in else blockVivek Goyal1-10/+11
Move some directory related code in else block. This is pure code reorganization and no functionality change. Next patch enables redirect processing on metacopy files and needs this change. By keeping non-functional changes in a separate patch, next patch looks much smaller and cleaner. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-20ovl: Store lower data inode in ovl_inodeVivek Goyal1-0/+2
Right now ovl_inode stores inode pointer for lower inode. This helps with quickly getting lower inode given overlay inode (ovl_inode_lower()). Now with metadata only copy-up, we can have metacopy inode in middle layer as well and inode containing data can be different from ->lower. I need to be able to open the real file in ovl_open_realfile() and for that I need to quickly find the lower data inode. Hence store lower data inode also in ovl_inode. Also provide an helper ovl_inode_lowerdata() to access this field. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-20ovl: Modify ovl_lookup() and friends to lookup metacopy dentryVivek Goyal1-17/+95
This patch modifies ovl_lookup() and friends to lookup metacopy dentries. It also allows for presence of metacopy dentries in lower layer. During lookup, check for presence of OVL_XATTR_METACOPY and if not present, set OVL_UPPERDATA bit in flags. We don't support metacopy feature with nfs_export. So in nfs_export code, we set OVL_UPPERDATA flag set unconditionally if upper inode exists. Do not follow metacopy origin if we find a metacopy only inode and metacopy feature is not enabled for that mount. Like redirect, this can have security implications where an attacker could hand craft upper and try to gain access to file on lower which it should not have to begin with. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-20ovl: Initialize ovl_inode->redirect in ovl_get_inode()Vivek Goyal1-7/+1
ovl_inode->redirect is an inode property and should be initialized in ovl_get_inode() only when we are adding a new inode to cache. If inode is already in cache, it is already initialized and we should not be touching ovl_inode->redirect field. As of now this is not a problem as redirects are used only for directories which don't share inode. But soon I want to use redirects for regular files also and there it can become an issue. Hence, move ->redirect initialization in ovl_get_inode(). Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-06-12treewide: kzalloc() -> kcalloc()Kees Cook1-1/+1
The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) | - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-05-31ovl: Pass argument to ovl_get_inode() in a structureVivek Goyal1-2/+8
ovl_get_inode() right now has 5 parameters. Soon this patch series will add 2 more and suddenly argument list starts looking too long. Hence pass arguments to ovl_get_inode() in a structure and it looks little cleaner. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: consistent i_ino for non-samefs with xinoAmir Goldstein1-3/+1
When overlay layers are not all on the same fs, but all inode numbers of underlying fs do not use the high 'xino' bits, overlay st_ino values are constant and persistent. In that case, set i_ino value to the same value as st_ino for nfsd readdirplus validator. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: add WARN_ON() for non-dir redirect casesMiklos Szeredi1-0/+11
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: cleanup setting OVL_INDEXVivek Goyal1-2/+0
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: set d->is_dir and d->opaque for last path elementVivek Goyal1-2/+6
Certain properties in ovl_lookup_data should be set only for the last element of the path. IOW, if we are calling ovl_lookup_single() for an absolute redirect, then d->is_dir and d->opaque do not make much sense for intermediate path elements. Instead set them only if dentry being lookup is last path element. As of now we do not seem to be making use of d->opaque if it is set for a path/dentry in lower. But just define the semantics so that future code can make use of this assumption. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: Do not check for redirect if this is last layerVivek Goyal1-1/+4
If we are looking in last layer, then there should not be any need to process redirect. redirect information is used only for lookup in next lower layer and there is no more lower layer to look into. So no need to process redirects. IOW, ignore redirects on lowest layer. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: do not try to reconnect a disconnected origin dentryAmir Goldstein1-7/+9
On lookup of non directory, we try to decode the origin file handle stored in upper inode. The origin file handle is supposed to be decoded to a disconnected non-dir dentry, which is fine, because we only need the lower inode of a copy up origin. However, if the origin file handle somehow turns out to be a directory we pay the expensive cost of reconnecting the directory dentry, only to get a mismatch file type and drop the dentry. Optimize this case by explicitly opting out of reconnecting the dentry. Opting-out of reconnect is done by passing a NULL acceptable callback to exportfs_decode_fh(). While the case described above is a strange corner case that does not really need to be optimized, the API added for this optimization will be used by a following patch to optimize a more common case of decoding an overlayfs file handle. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: disambiguate ovl_encode_fh()Amir Goldstein1-5/+5
Rename ovl_encode_fh() to ovl_encode_real_fh() to differentiate from the exportfs function ovl_encode_inode_fh() and change the latter to ovl_encode_fh() to match the exportfs method name. Rename ovl_decode_fh() to ovl_decode_real_fh() for consistency. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: fix lookup with middle layer opaque dir and absolute path redirectsAmir Goldstein1-0/+9
As of now if we encounter an opaque dir while looking for a dentry, we set d->last=true. This means that there is no need to look further in any of the lower layers. This works fine as long as there are no redirets or relative redircts. But what if there is an absolute redirect on the children dentry of opaque directory. We still need to continue to look into next lower layer. This patch fixes it. Here is an example to demonstrate the issue. Say you have following setup. upper: /redirect (redirect=/a/b/c) lower1: /a/[b]/c ([b] is opaque) (c has absolute redirect=/a/b/d/) lower0: /a/b/d/foo Now "redirect" dir should merge with lower1:/a/b/c/ and lower0:/a/b/d. Note, despite the fact lower1:/a/[b] is opaque, we need to continue to look into lower0 because children c has an absolute redirect. Following is a reproducer. Watch me make foo disappear: $ mkdir lower middle upper work work2 merged $ mkdir lower/origin $ touch lower/origin/foo $ mount -t overlay none merged/ \ -olowerdir=lower,upperdir=middle,workdir=work2 $ mkdir merged/pure $ mv merged/origin merged/pure/redirect $ umount merged $ mount -t overlay none merged/ \ -olowerdir=middle:lower,upperdir=upper,workdir=work $ mv merged/pure/redirect merged/redirect Now you see foo inside a twice redirected merged dir: $ ls merged/redirect foo $ umount merged $ mount -t overlay none merged/ \ -olowerdir=middle:lower,upperdir=upper,workdir=work After mount cycle you don't see foo inside the same dir: $ ls merged/redirect During middle layer lookup, the opaqueness of middle/pure is left in the lookup state and then middle/pure/redirect is wrongly treated as opaque. Fixes: 02b69b284cd7 ("ovl: lookup redirects") Cc: <stable@vger.kernel.org> #v4.10 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-04-12ovl: Set d->last properly during lookupVivek Goyal1-2/+6
d->last signifies that this is the last layer we are looking into and there is no more. And that means this allows for some optimzation opportunities during lookup. For example, in ovl_lookup_single() we don't have to check for opaque xattr of a directory is this is the last layer we are looking into (d->last = true). But knowing for sure whether we are looking into last layer can be very tricky. If redirects are not enabled, then we can look at poe->numlower and figure out if the lookup we are about to is last layer or not. But if redircts are enabled then it is possible poe->numlower suggests that we are looking in last layer, but there is an absolute redirect present in found element and that redirects us to a layer in root and that means lookup will continue in lower layers further. For example, consider following. /upperdir/pure (opaque=y) /upperdir/pure/foo (opaque=y,redirect=/bar) /lowerdir/bar In this case pure is "pure upper". When we look for "foo", that time poe->numlower=0. But that alone does not mean that we will not search for a merge candidate in /lowerdir. Absolute redirect changes that. IOW, d->last should not be set just based on poe->numlower if redirects are enabled. That can lead to setting d->last while it should not have and that means we will not check for opaque xattr while we should have. So do this. - If redirects are not enabled, then continue to rely on poe->numlower information to determine if it is last layer or not. - If redirects are enabled, then set d->last = true only if this is the last layer in root ovl_entry (roe). Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 02b69b284cd7 ("ovl: lookup redirects") Cc: <stable@vger.kernel.org> #v4.10
2018-02-26ovl: redirect_dir=nofollow should not follow redirect for opaque lowerVivek Goyal1-3/+3
redirect_dir=nofollow should not follow a redirect. But in a specific configuration it can still follow it. For example try this. $ mkdir -p lower0 lower1/foo upper work merged $ touch lower1/foo/lower-file.txt $ setfattr -n "trusted.overlay.opaque" -v "y" lower1/foo $ mount -t overlay -o lowerdir=lower1:lower0,workdir=work,upperdir=upper,redirect_dir=on none merged $ cd merged $ mv foo foo-renamed $ umount merged # mount again. This time with redirect_dir=nofollow $ mount -t overlay -o lowerdir=lower1:lower0,workdir=work,upperdir=upper,redirect_dir=nofollow none merged $ ls merged/foo-renamed/ # This lists lower-file.txt, while it should not have. Basically, we are doing redirect check after we check for d.stop. And if this is not last lower, and we find an opaque lower, d.stop will be set. ovl_lookup_single() if (!d->last && ovl_is_opaquedir(this)) { d->stop = d->opaque = true; goto out; } To fix this, first check redirect is allowed. And after that check if d.stop has been set or not. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Fixes: 438c84c2f0c7 ("ovl: don't follow redirects if redirect_dir=off") Cc: <stable@vger.kernel.org> #v4.15 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-02-05ovl: fix regression in fsnotify of overlay merge dirAmir Goldstein1-0/+2
A re-factoring patch in NFS export series has passed the wrong argument to ovl_get_inode() causing a regression in the very recent fix to fsnotify of overlay merge dir. The regression has caused merge directory inodes to be hashed by upper instead of lower real inode, when NFS export and directory indexing is disabled. That caused an inotify watch to become obsolete after directory copy up and drop caches. LTP test inotify07 was improved to catch this regression. The regression also caused multiple redirect dirs to same origin not to be detected on lookup with NFS export disabled. An xfstest was added to cover this case. Fixes: 0aceb53e73be ("ovl: do not pass overlay dentry to ovl_get_inode()") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: lookup indexed ancestor of lower dirAmir Goldstein1-6/+14
ovl_lookup_real() in lower layer walks back lower parents to find the topmost indexed parent. If an indexed ancestor is found before reaching lower layer root, ovl_lookup_real() is called recursively with upper layer to walk back from indexed upper to the topmost connected/hashed upper parent (or up to root). ovl_lookup_real() in upper layer then walks forward to connect the topmost upper overlay dir dentry and ovl_lookup_real() in lower layer continues to walk forward to connect the decoded lower overlay dir dentry. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: decode indexed dir file handlesAmir Goldstein1-1/+1
Decoding an indexed dir file handle is done by looking up the file handle in index dir by name and then decoding the upper dir from the index origin file handle. The decoded upper path is used to lookup an overlay dentry of the same path. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: decode lower non-dir file handlesAmir Goldstein1-4/+3
Decoding a lower non-dir file handle is done by decoding the lower dentry from underlying lower fs, finding or allocating an overlay inode that is hashed by the real lower inode and instantiating an overlay dentry with that inode. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: decode pure upper file handlesAmir Goldstein1-2/+2
Decoding an upper file handle is done by decoding the upper dentry from underlying upper fs, finding or allocating an overlay inode that is hashed by the real upper inode and instantiating an overlay dentry with that inode. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: store 'has_upper' and 'opaque' as bit flagsAmir Goldstein1-3/+4
We need to make some room in struct ovl_entry to store information about redirected ancestors for NFS export, so cram two booleans as bit flags. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: use d_splice_alias() in place of d_add() in lookupAmir Goldstein1-3/+1
This is required for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: do not pass overlay dentry to ovl_get_inode()Amir Goldstein1-1/+2
This is needed for using ovl_get_inode() for decoding file handles for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: factor out ovl_get_index_fh() helperAmir Goldstein1-10/+49
The helper is needed to lookup an index by file handle for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: whiteout orphan index entries on mountAmir Goldstein1-2/+17
Orphan index entries are non-dir index entries whose union nlink count dropped to zero. With index=on, orphan index entries are removed on mount. With NFS export feature enabled, orphan index entries are replaced with white out index entries to block future open by handle from opening the lower file. When dir index has a stale 'upper' xattr, we assume that the upper dir was removed and we treat the dir index as orphan entry that needs to be whited out or removed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: cleanup temp index entriesAmir Goldstein1-0/+12
A previous failed attempt to create or whiteout a directory index may leave index entries named '#%x' in the index dir. Cleanup those temp entries on mount instead of failing the mount. In the future, we may drop 'work' dir and use 'index' dir instead. This change is enough for cleaning up copy up leftovers 'from the future', but it is not enough for cleaning up rmdir leftovers 'from the future' (i.e. temp dir containing whiteouts). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: verify directory index entries on mountAmir Goldstein1-32/+93
Directory index entries should have 'upper' xattr pointing to the real upper dir. Verifying that the upper dir file handle is not stale is expensive, so only verify stale directory index entries on mount if NFS export feature is enabled. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: verify whiteout index entries on mountAmir Goldstein1-8/+13
Whiteout index entries are used as an indication that an exported overlay file handle should be treated as stale (i.e. after unlink of the overlay inode). Check on mount that whiteout index entries have a name that looks like a valid file handle and cleanup invalid index entries. For whiteout index entries, do not check that they also have valid origin fh and nlink xattr, because those xattr do not exist for a whiteout index entry. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: use directory index entries for consistency verificationAmir Goldstein1-7/+35
A directory index is a directory type entry in index dir with a "trusted.overlay.upper" xattr containing an encoded ovl_fh of the merge directory upper dir inode. On lookup of non-dir files, lower file is followed by origin file handle. On lookup of dir entries, lower dir is found by name and then compared to origin file handle. We only trust dir index if we verified that lower dir matches origin file handle, otherwise index may be inconsistent and we ignore it. If we find an indexed non-upper dir or an indexed merged dir, whose index 'upper' xattr points to a different upper dir, that means that the lower directory may be also referenced by another upper dir via redirect, so we fail the lookup on inconsistency error. To be consistent with directory index entries format, the association of index dir to upper root dir, that was stored by older kernels in "trusted.overlay.origin" xattr is now stored in "trusted.overlay.upper" xattr. This also serves as an indication that overlay was mounted with a kernel that support index directory entries. For backward compatibility, if an 'origin' xattr exists on the index dir we also verify it on mount. Directory index entries are going to be used for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: verify stored origin fh matches lower dirAmir Goldstein1-0/+12
When the NFS export feature is enabled, overlayfs implicitly enables the feature "verify_lower". When the "verify_lower" feature is enabled, a directory inode found in lower layer by name or by redirect_dir is verified against the file handle of the copy up origin that is stored in the upper layer. This introduces a change of behavior for the case of lower layer modification while overlay is offline. A lower directory created or moved offline under an exisitng upper directory, will not be merged with that upper directory. The NFS export feature should not be used after copying layers, because the new lower directory inodes would fail verification and won't be merged with upper directories. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: generalize ovl_verify_origin() and helpersAmir Goldstein1-20/+22
Remove the "origin" language from the functions that handle set, get and verify of "origin" xattr and pass the xattr name as an argument. The same helpers are going to be used for NFS export to get, get and verify the "upper" xattr for directory index entries. ovl_verify_origin() is now a helper used only to verify non upper file handle stored in "origin" xattr of upper inode. The upper root dir file handle is still stored in "origin" xattr on the index dir for backward compatibility. This is going to be changed by the patch that adds directory index entries support. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: simplify arguments to ovl_check_origin_fh()Amir Goldstein1-15/+13
Pass the fs instance with lower_layers array instead of the dentry lowerstack array to ovl_check_origin_fh(), because the dentry members of lowerstack play no role in this helper. This change simplifies the argument list of ovl_check_origin(), ovl_cleanup_index() and ovl_verify_index(). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: factor out ovl_check_origin_fh()Amir Goldstein1-50/+92
Re-factor ovl_check_origin() and ovl_get_origin(), so origin fh xattr is read from upper inode only once during lookup with multiple lower layers and only once when verifying index entry origin. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: store layer index in ovl_layerAmir Goldstein1-16/+1
Store the fs root layer index inside ovl_layer struct, so we can get the root fs layer index from merge dir lower layer instead of find it with ovl_find_layer() helper. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: fix inconsistent d_ino for legacy merge dirAmir Goldstein1-0/+33
For a merge dir that was copied up before v4.12 or that was hand crafted offline (e.g. mkdir {upper/lower}/dir), upper dir does not contain the 'trusted.overlay.origin' xattr. In that case, stat(2) on the merge dir returns the lower dir st_ino, but getdents(2) returns the upper dir d_ino. After this change, on merge dir lookup, missing origin xattr on upper dir will be fixed and 'impure' xattr will be fixed on parent of the legacy merge dir. Suggested-by: zhangyi (F) <yi.zhang@huawei.com> Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-19ovl: fix another overlay: warning prefixAmir Goldstein1-1/+2
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-19ovl: take lower dir inode mutex outside upper sb_writers lockAmir Goldstein1-0/+3
The functions ovl_lower_positive() and ovl_check_empty_dir() both take inode mutex on the real lower dir under ovl_want_write() which takes the upper_mnt sb_writers lock. While this is not a clear locking order or layering violation, it creates an undesired lock dependency between two unrelated layers for no good reason. This lock dependency materializes to a false(?) positive lockdep warning when calling rmdir() on a nested overlayfs, where both nested and underlying overlayfs both use the same fs type as upper layer. rmdir() on the nested overlayfs creates the lock chain: sb_writers of upper_mnt (e.g. tmpfs) in ovl_do_remove() ovl_i_mutex_dir_key[] of lower overlay dir in ovl_lower_positive() rmdir() on the underlying overlayfs creates the lock chain in reverse order: ovl_i_mutex_dir_key[] of lower overlay dir in vfs_rmdir() sb_writers of nested upper_mnt (e.g. tmpfs) in ovl_do_remove() To rid of the unneeded locking dependency, move both ovl_lower_positive() and ovl_check_empty_dir() to before ovl_want_write() in rmdir() and rename() implementation. This change spreads the pieces of ovl_check_empty_and_clear() directly inside the rmdir()/rename() implementations so the helper is no longer needed and removed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-12-11ovl: Pass ovl_get_nlink() parameters in right orderVivek Goyal1-1/+1
Right now we seem to be passing index as "lowerdentry" and origin.dentry as "upperdentry". IIUC, we should pass these parameters in reversed order and this looks like a bug. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Amir Goldstein <amir73il@gmail.com> Fixes: caf70cb2ba5d ("ovl: cleanup orphan index entries") Cc: <stable@vger.kernel.org> #v4.13 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-12-11ovl: don't follow redirects if redirect_dir=offMiklos Szeredi1-0/+16
Overlayfs is following redirects even when redirects are disabled. If this is unintentional (probably the majority of cases) then this can be a problem. E.g. upper layer comes from untrusted USB drive, and attacker crafts a redirect to enable read access to otherwise unreadable directories. If "redirect_dir=off", then turn off following as well as creation of redirects. If "redirect_dir=follow", then turn on following, but turn off creation of redirects (which is what "redirect_dir=off" does now). This is a backward incompatible change, so make it dependent on a config option. Reported-by: David Howells <dhowells@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10ovl: remove unneeded arg from ovl_verify_origin()Amir Goldstein1-2/+2
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10ovl: Put upperdentry if ovl_check_origin() failsVivek Goyal1-1/+1
If ovl_check_origin() fails, we should put upperdentry. We have a reference on it by now. So goto out_put_upper instead of out. Fixes: a9d019573e88 ("ovl: lookup non-dir copy-up-origin by file handle") Cc: <stable@vger.kernel.org> #4.12 Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09ovl: re-structure overlay lower layers in-memoryChandan Rajendra1-21/+31
Define new structures to represent overlay instance lower layers and overlay merge dir lower layers to make room for storing more per layer information in-memory. Instead of keeping the fs instance lower layers in an array of struct vfsmount, keep them in an array of new struct ovl_layer, that has a pointer to struct vfsmount. Instead of keeping the dentry lower layers in an array of struct path, keep them in an array of new struct ovl_path, that has a pointer to struct dentry and to struct ovl_layer. Add a small helper to find the fs layer id that correspopnds to a lower struct ovl_path and use it in ovl_lookup(). [amir: split re-structure from anonymous bdev patch] Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09ovl: move include of ovl_entry.h into overlayfs.hAmir Goldstein1-1/+0
Most overlayfs c files already explicitly include ovl_entry.h to use overlay entry struct definitions and upcoming changes are going to require even more c files to include this header. All overlayfs c files include overlayfs.h and overlayfs.h itself refers to some structs defined in ovl_entry.h, so it seems more logic to include ovl_entry.h from overlayfs.h than from c files. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-10-24ovl: do not cleanup unsupported index entriesAmir Goldstein1-4/+3
With index=on, ovl_indexdir_cleanup() tries to cleanup invalid index entries (e.g. bad index name). This behavior could result in cleaning of entries created by newer kernels and is therefore undesirable. Instead, abort mount if such entries are encountered. We still cleanup 'stale' entries and 'orphan' entries, both those cases can be a result of offline changes to lower and upper dirs. When encoutering an index entry of type directory or whiteout, kernel was supposed to fallback to read-only mount, but the fill_super() operation returns EROFS in this case instead of returning success with read-only mount flag, so mount fails when encoutering directory or whiteout index entries. Bless this behavior by returning -EINVAL on directory and whiteout index entries as we do for all unsupported index entries. Fixes: 61b674710cd9 ("ovl: do not cleanup directory and whiteout index..") Cc: <stable@vger.kernel.org> # v4.13 Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2017-10-24ovl: handle ENOENT on index lookupAmir Goldstein1-0/+4
Treat ENOENT from index entry lookup the same way as treating a returned negative dentry. Apparently, either could be returned if file is not found, depending on the underlying file system. Fixes: 359f392ca53e ("ovl: lookup index entry for copy up origin") Cc: <stable@vger.kernel.org> # v4.13 Signed-off-by: Amir Goldstein <amir73il@gmail.com>