path: root/fs
Age | Commit message | Author | Files | Lines
2018-11-11Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds5-31/+51
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "A large number of ext4 bug fixes, mostly buffer and memory leaks on error return cleanup paths" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: missing !bh check in ext4_xattr_inode_write() ext4: fix buffer leak in __ext4_read_dirblock() on error path ext4: fix buffer leak in ext4_expand_extra_isize_ea() on error path ext4: fix buffer leak in ext4_xattr_move_to_block() on error path ext4: release bs.bh before re-using in ext4_xattr_block_find() ext4: fix buffer leak in ext4_xattr_get_block() on error path ext4: fix possible leak of s_journal_flag_rwsem in error path ext4: fix possible leak of sbi->s_group_desc_leak in error path ext4: remove unneeded brelse call in ext4_xattr_inode_update_ref() ext4: avoid possible double brelse() in add_new_gdb() on error path ext4: avoid buffer leak in ext4_orphan_add() after prior errors ext4: avoid buffer leak on shutdown in ext4_mark_iloc_dirty() ext4: fix possible inode leak in the retry loop of ext4_resize_fs() ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizing ext4: add missing brelse() update_backups()'s error path ext4: add missing brelse() add_new_gdb_meta_bg()'s error path ext4: add missing brelse() in set_flexbg_block_bitmap()'s error path ext4: avoid potential extra brelse in setup_new_flex_group_blocks()
2018-11-10Merge branch 'for-linus' of ↵Linus Torvalds1-5/+17
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace fixes from Eric Biederman: "I believe all of these are simple obviously correct bug fixes. These fall into two groups: - Fixing the implementation of MNT_LOCKED which prevents lesser privileged users from seeing under mounts created by more privileged users. - Fixing the extended uid and group mapping in user namespaces. As well as ensuring the code looks correct I have spot tested these changes as well and in my testing the fixes are working" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: mount: Prevent MNT_DETACH from disconnecting locked mounts mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts mount: Retest MNT_LOCKED in do_umount userns: also map extents in the reverse map to kernel IDs
2018-11-09Merge tag 'ceph-for-4.20-rc2' of https://github.com/ceph/ceph-clientLinus Torvalds3-12/+14
Pull Ceph fixes from Ilya Dryomov: "Two CephFS fixes (copy_file_range and quota) and a small feature bit cleanup" * tag 'ceph-for-4.20-rc2' of https://github.com/ceph/ceph-client: libceph: assume argonaut on the server side ceph: quota: fix null pointer dereference in quota check ceph: add destination file data sync before doing any remote copy
2018-11-09ext4: missing !bh check in ext4_xattr_inode_write()Vasily Averin1-0/+6
According to Ted Ts'o ext4_getblk() called in ext4_xattr_inode_write() should not return bh = NULL. The only time that bh could be NULL, then, would be in the case of something really going wrong; a programming error elsewhere (perhaps a wild pointer dereference) or I/O error causing on-disk file system corruption (although that would be highly unlikely given that we had *just* allocated the blocks and so the metadata blocks in question probably would still be in the cache). Fixes: e50e5129f384 ("ext4: xattr-in-inode support") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.13
2018-11-08libceph: assume argonaut on the server sideIlya Dryomov1-9/+3
No one is running pre-argonaut. In addition one of the argonaut features (NOSRCADDR) has been required since day one (and a half, 2.6.34 vs 2.6.35) of the kernel client. Allow for the possibility of reusing these feature bits later. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2018-11-08ceph: quota: fix null pointer dereference in quota checkLuis Henriques1-1/+2
This patch fixes a possible null pointer dereference in check_quota_exceeded, detected by the static checker smatch, with the following warning:    fs/ceph/quota.c:240 check_quota_exceeded()     error: we previously assumed 'realm' could be null (see line 188) Fixes: b7a2921765cf ("ceph: quota: support for ceph.quota.max_files") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-11-08ceph: add destination file data sync before doing any remote copyLuis Henriques1-2/+9
If we try to copy into a file that was just written, any data that is remote copied will be overwritten by our buffered writes once they are flushed.  When this happens, the call to invalidate_inode_pages2_range will also return a -EBUSY error. This patch fixes this by also sync'ing the destination file before starting any copy. Fixes: 503f82a9932d ("ceph: support copy_file_range file operation") Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-11-08mount: Prevent MNT_DETACH from disconnecting locked mountsEric W. Biederman1-1/+1
Timothy Baldwin <timbaldwin@fastmail.co.uk> wrote:
> As per mount_namespaces(7) unprivileged users should not be able to look under mount points:
>
>   Mounts that come as a single unit from more privileged mount are locked
>   together and may not be separated in a less privileged mount namespace.
>
> However they can:
>
> 1. Create a mount namespace.
> 2. In the mount namespace open a file descriptor to the parent of a mount point.
> 3. Destroy the mount namespace.
> 4. Use the file descriptor to look under the mount point.
>
> I have reproduced this with Linux 4.16.18 and Linux 4.18-rc8.
>
> The setup:
>
> $ sudo sysctl kernel.unprivileged_userns_clone=1
> kernel.unprivileged_userns_clone = 1
> $ mkdir -p A/B/Secret
> $ sudo mount -t tmpfs hide A/B
>
> "Secret" is indeed hidden as expected:
>
> $ ls -lR A
> A:
> total 0
> drwxrwxrwt 2 root root 40 Feb 12 21:08 B
>
> A/B:
> total 0
>
> The attack revealing "Secret":
>
> $ unshare -Umr sh -c "exec unshare -m ls -lR /proc/self/fd/4/ 4<A"
> /proc/self/fd/4/:
> total 0
> drwxr-xr-x 3 root root 60 Feb 12 21:08 B
>
> /proc/self/fd/4/B:
> total 0
> drwxr-xr-x 2 root root 40 Feb 12 21:08 Secret
>
> /proc/self/fd/4/B/Secret:
> total 0

I tracked this down to put_mnt_ns passing UMOUNT_SYNC and disconnecting all of the mounts in a mount namespace. Fix this by factoring drop_mounts out of drop_collected_mounts and passing 0 instead of UMOUNT_SYNC.

There are two possible behavior differences that result from this.
- No longer setting UMOUNT_SYNC will no longer set MNT_SYNC_UMOUNT on the vfsmounts being unmounted. This affects the lazy rcu walk by kicking the walk out of rcu mode and forcing it to be a non-lazy walk.
- No longer disconnecting locked mounts will keep some mounts around longer as they stay because they are locked to other mounts.

There are only two users of drop_collected_mounts: audit_tree.c and put_mnt_ns.

In audit_tree.c the mounts are private and there are no rcu lazy walks, only calls to iterate_mounts. So the changes should have no effect except for a small timing effect as the connected mounts are disconnected.

In put_mnt_ns there may be references from processes outside the mount namespace to the mounts. So the mounts remaining connected will be the bug fix that is needed. That rcu walks are allowed to continue appears not to be a problem, especially as the rcu walk change was about an implementation detail, not about semantics.

Cc: stable@vger.kernel.org
Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users")
Reported-by: Timothy Baldwin <timbaldwin@fastmail.co.uk>
Tested-by: Timothy Baldwin <timbaldwin@fastmail.co.uk>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-11-08mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mountsEric W. Biederman1-2/+8
Jonathan Calmels from NVIDIA reported that he's able to bypass the mount visibility security check in place in the Linux kernel by using a combination of the unbindable property along with the private mount propagation option to allow an unprivileged user to see a path which was purposefully hidden by the root user.

Reproducer:
  # Hide a path to all users using a tmpfs
  root@castiana:~# mount -t tmpfs tmpfs /sys/devices/
  root@castiana:~#

  # As an unprivileged user, unshare user namespace and mount namespace
  stgraber@castiana:~$ unshare -U -m -r

  # Confirm the path is still not accessible
  root@castiana:~# ls /sys/devices/

  # Make /sys recursively unbindable and private
  root@castiana:~# mount --make-runbindable /sys
  root@castiana:~# mount --make-private /sys

  # Recursively bind-mount the rest of /sys over to /mnt
  root@castiana:~# mount --rbind /sys/ /mnt

  # Access our hidden /sys/devices as an unprivileged user
  root@castiana:~# ls /mnt/devices/
  breakpoint cpu cstate_core cstate_pkg i915 intel_pt isa kprobe LNXSYSTM:00 msr pci0000:00 platform pnp0 power software system tracepoint uncore_arb uncore_cbox_0 uncore_cbox_1 uprobe virtual

Solve this by teaching copy_tree to fail if a mount turns out to be both unbindable and locked.

Cc: stable@vger.kernel.org
Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users")
Reported-by: Jonathan Calmels <jcalmels@nvidia.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-11-08mount: Retest MNT_LOCKED in do_umountEric W. Biederman1-2/+8
It was recently pointed out that the one instance of testing MNT_LOCKED outside of the namespace_sem is in ksys_umount. Fix that by adding a test inside of do_umount with namespace_sem and the mount_lock held. As it helps to fail fast, the existing test is maintained with an additional comment pointing out that it may be racy because the locks are not held. Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
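The "test early without the locks, then test again with them held" pattern described above can be sketched in userspace C. This is only an illustration under stated assumptions: struct fake_mount, its fields, fake_umount() and the -EINVAL return value are hypothetical stand-ins, and a pthread mutex stands in for namespace_sem and mount_lock. It is not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mount; 'locked' plays the role of MNT_LOCKED. */
struct fake_mount {
	pthread_mutex_t lock;
	bool locked;
	bool mounted;
};

/* Returns 0 on success, -EINVAL (an assumed error code) if the mount is locked. */
static int fake_umount(struct fake_mount *m)
{
	int ret = 0;

	assert(m != NULL);

	/* Early test without the lock: possibly racy, kept only to fail fast. */
	if (m->locked)
		return -EINVAL;

	pthread_mutex_lock(&m->lock);
	/* Authoritative test, repeated now that the lock is held. */
	if (m->locked)
		ret = -EINVAL;
	else
		m->mounted = false;
	pthread_mutex_unlock(&m->lock);

	return ret;
}
```

The early test is an optimisation only; correctness rests on the second test, because the flag may change between the first read and the point where the lock is taken.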
2018-11-07ext4: fix buffer leak in __ext4_read_dirblock() on error pathVasily Averin1-0/+1
Fixes: dc6982ff4db1 ("ext4: refactor code to read directory blocks ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.9
2018-11-07ext4: fix buffer leak in ext4_expand_extra_isize_ea() on error pathVasily Averin1-2/+5
Fixes: de05ca852679 ("ext4: move call to ext4_error() into ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.17
2018-11-07ext4: fix buffer leak in ext4_xattr_move_to_block() on error pathVasily Averin1-0/+2
Fixes: 3f2571c1f91f ("ext4: factor out xattr moving") Fixes: 6dd4ee7cab7e ("ext4: Expand extra_inodes space per ...") Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 2.6.23
2018-11-07ext4: release bs.bh before re-using in ext4_xattr_block_find()Vasily Averin1-0/+2
bs.bh was taken in a previous ext4_xattr_block_find() call; it should be released before being re-used. Fixes: 7e01c8e5420b ("ext3/4: fix uninitialized bs in ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 2.6.26
2018-11-07ext4: fix buffer leak in ext4_xattr_get_block() on error pathVasily Averin1-1/+3
Fixes: dec214d00e0d ("ext4: xattr inode deduplication") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.13
2018-11-07ext4: fix possible leak of s_journal_flag_rwsem in error pathVasily Averin1-0/+1
Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.7
2018-11-07ext4: fix possible leak of sbi->s_group_desc_leak in error pathTheodore Ts'o1-8/+8
Fixes: bfe0a5f47ada ("ext4: add more mount time checks of the superblock") Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.18
2018-11-06ext4: remove unneeded brelse call in ext4_xattr_inode_update_ref()Vasily Averin1-5/+1
Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-11-06ext4: avoid possible double brelse() in add_new_gdb() on error pathTheodore Ts'o1-0/+1
Fixes: b40971426a83 ("ext4: add error checking to calls to ...") Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 2.6.38
2018-11-06ext4: avoid buffer leak in ext4_orphan_add() after prior errorsVasily Averin1-1/+3
Fixes: d745a8c20c1f ("ext4: reduce contention on s_orphan_lock") Fixes: 6e3617e579e0 ("ext4: Handle non empty on-disk orphan link") Cc: Dmitry Monakhov <dmonakhov@gmail.com> Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 2.6.34
2018-11-06ext4: avoid buffer leak on shutdown in ext4_mark_iloc_dirty()Vasily Averin1-2/+3
ext4_mark_iloc_dirty() callers expect that it releases iloc->bh even if it returns an error. Fixes: 0db1ff222d40 ("ext4: add shutdown bit and check for it") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.11
2018-11-06ext4: fix possible inode leak in the retry loop of ext4_resize_fs()Vasily Averin1-0/+4
Fixes: 1c6bd7173d66 ("ext4: convert file system to meta_bg if needed ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.7
2018-11-06ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizingVasily Averin1-1/+1
Fixes: 117fff10d7f1 ("ext4: grow the s_flex_groups array as needed ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.7
2018-11-06xfs: fix overflow in xfs_attr3_leaf_verifyDave Chinner1-2/+9
generic/070 on 64k block size filesystems is failing with a verifier corruption on writeback of an attribute leaf block:

[ 94.973083] XFS (pmem0): Metadata corruption detected at xfs_attr3_leaf_verify+0x246/0x260, xfs_attr3_leaf block 0x811480
[ 94.975623] XFS (pmem0): Unmount and run xfs_repair
[ 94.976720] XFS (pmem0): First 128 bytes of corrupted metadata buffer:
[ 94.978270] 000000004b2e7b45: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00 ........;.......
[ 94.980268] 000000006b1db90b: 00 00 00 00 00 81 14 80 00 00 00 00 00 00 00 00 ................
[ 94.982251] 00000000433f2407: 22 7b 5c 82 2d 5c 47 4c bb 31 1c 37 fa a9 ce d6 "{\.-\GL.1.7....
[ 94.984157] 0000000010dc7dfb: 00 00 00 00 00 81 04 8a 00 0a 18 e8 dd 94 01 00 ................
[ 94.986215] 00000000d5a19229: 00 a0 dc f4 fe 98 01 68 f0 d8 07 e0 00 00 00 00 .......h........
[ 94.988171] 00000000521df36c: 0c 2d 32 e2 fe 20 01 00 0c 2d 58 65 fe 0c 01 00 .-2.. ...-Xe....
[ 94.990162] 000000008477ae06: 0c 2d 5b 66 fe 8c 01 00 0c 2d 71 35 fe 7c 01 00 .-[f.....-q5.|..
[ 94.992139] 00000000a4a6bca6: 0c 2d 72 37 fc d4 01 00 0c 2d d8 b8 f0 90 01 00 .-r7.....-......
[ 94.994789] XFS (pmem0): xfs_do_force_shutdown(0x8) called from line 1453 of file fs/xfs/xfs_buf.c. Return address = ffffffff815365f3

This is failing this check:

                end = ichdr.freemap[i].base + ichdr.freemap[i].size;
                if (end < ichdr.freemap[i].base)
>>>>>                   return __this_address;
                if (end > mp->m_attr_geo->blksize)
                        return __this_address;

And from the buffer output above, the freemap array is:

        freemap[0].base = 0x00a0
        freemap[0].size = 0xdcf4        end = 0xdd94
        freemap[1].base = 0xfe98
        freemap[1].size = 0x0168        end = 0x10000
        freemap[2].base = 0xf0d8
        freemap[2].size = 0x07e0        end = 0xf8b8

These all look valid - the block size is 0x10000 and so from the last check in the above verifier fragment we know that the end of freemap[1] is valid. The problem is that end is declared as:

        uint16_t end;

And (uint16_t)0x10000 = 0.

So we have a verifier bug here, not a corruption. Fix the verifier to use uint32_t types for the check and hence avoid the overflow.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=201577
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
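The wrap-around described in this message can be reproduced in a few lines of userspace C. This is a minimal sketch, not the XFS verifier: struct freemap_entry and both function names are hypothetical, and only the arithmetic (16-bit versus 32-bit end) follows the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one attribute-leaf freemap entry. */
struct freemap_entry {
	uint16_t base;
	uint16_t size;
};

/* Buggy variant: a 16-bit 'end' wraps to 0 when base + size == 0x10000. */
static bool freemap_ok_u16(struct freemap_entry e, uint32_t blksize)
{
	uint16_t end = (uint16_t)(e.base + e.size);

	assert(blksize <= 0x10000u);	/* this sketch assumes blocks of at most 64k */
	if (end < e.base)
		return false;		/* wrongly taken for a region ending at 0x10000 */
	return end <= blksize;
}

/* Widened variant: 32 bits can hold the sum of any two 16-bit values. */
static bool freemap_ok_u32(struct freemap_entry e, uint32_t blksize)
{
	uint32_t end = (uint32_t)e.base + e.size;

	assert(blksize <= 0x10000u);
	if (end < e.base)
		return false;
	return end <= blksize;
}
```

With freemap[1] from the dump above (base 0xfe98, size 0x0168) and a block size of 0x10000, the 16-bit variant rejects the entry while the 32-bit variant accepts it.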
2018-11-06xfs: print buffer offsets when dumping corrupt buffersDarrick J. Wong1-1/+1
Use DUMP_PREFIX_OFFSET when printing hex dumps of corrupt buffers because modern Linux now prints a 32-bit hash of our 64-bit pointer when using DUMP_PREFIX_ADDRESS:

00000000b4bb4297: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00 ........;.......
00000005ec77e26: 00 00 00 00 02 d0 5a 00 00 00 00 00 00 00 00 00 ......Z.........
000000015938018: 21 98 e8 b4 fd de 4c 07 bc ea 3c e5 ae b4 7c 48 !.....L...<...|H

This is totally worthless for a sequential dump since we probably only care about tracking the buffer offsets and afaik there's no way to recover the actual pointer from the hashed value.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
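To illustrate why an offset prefix is the more useful choice for a sequential dump, here is a small userspace sketch that formats one dump row with the buffer offset in front. dump_line() is a hypothetical helper written for this example; it does not reproduce the kernel's hex dump routines or their exact output format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format up to 16 bytes of 'buf', starting at offset 'off', into 'out' as
 * "OFFSET: xx xx ...". Returns the number of characters written.
 */
static int dump_line(char *out, size_t outsz, const unsigned char *buf,
		     size_t len, size_t off)
{
	int n;

	assert(out != NULL && outsz > 0);

	/* The prefix is the position inside the buffer, not a memory address. */
	n = snprintf(out, outsz, "%08zx:", off);
	for (size_t i = off; i < len && i < off + 16; i++) {
		if (n < 0 || (size_t)n >= outsz)
			break;
		n += snprintf(out + n, outsz - (size_t)n, " %02x", buf[i]);
	}
	return n;
}
```

Each row then starts with a value that increases by 16, so a reader can locate a field by its offset within the block without needing the original pointer.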
2018-11-06xfs: Fix error code in 'xfs_ioc_getbmap()'Christophe JAILLET1-1/+1
In this function, once 'buf' has been allocated, we unconditionally return 0. However, 'error' is set to some error codes in several error handling paths. Before commit 232b51948b99 ("xfs: simplify the xfs_getbmap interface") this was not an issue because all error paths were returning directly, but now that some cleanup at the end may be needed, we must propagate the error code. Fixes: 232b51948b99 ("xfs: simplify the xfs_getbmap interface") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
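The shape of this bug, a shared cleanup label that ends in an unconditional "return 0", can be sketched as follows. The names getbmap_like() and fill_records() are hypothetical and the body is a userspace illustration only; it shows the corrected pattern in which the cleanup path returns the recorded error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical worker that can be told to fail. */
static int fill_records(int *buf, size_t n, int fail)
{
	if (fail)
		return -EIO;
	for (size_t i = 0; i < n; i++)
		buf[i] = (int)i;
	return 0;
}

/* Allocate, do the work, and always free; the error code must survive cleanup. */
static int getbmap_like(size_t n, int fail)
{
	int error;
	int *buf;

	assert(n > 0);
	buf = calloc(n, sizeof(*buf));
	if (!buf)
		return -ENOMEM;

	error = fill_records(buf, n, fail);
	if (error)
		goto out_free;

	/* ... results would be copied out to the caller here ... */
	error = 0;
out_free:
	free(buf);
	return error;	/* returning a literal 0 here would hide the failure */
}
```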
2018-11-04Merge tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifsLinus Torvalds21-292/+1982
Pull UBIFS updates from Richard Weinberger: - Full filesystem authentication feature, UBIFS is now able to have the whole filesystem structure authenticated plus user data encrypted and authenticated. - Minor cleanups * tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs: (26 commits) ubifs: Remove unneeded semicolon Documentation: ubifs: Add authentication whitepaper ubifs: Enable authentication support ubifs: Do not update inode size in-place in authenticated mode ubifs: Add hashes and HMACs to default filesystem ubifs: authentication: Authenticate super block node ubifs: Create hash for default LPT ubfis: authentication: Authenticate master node ubifs: authentication: Authenticate LPT ubifs: Authenticate replayed journal ubifs: Add auth nodes to garbage collector journal head ubifs: Add authentication nodes to journal ubifs: authentication: Add hashes to index nodes ubifs: Add hashes to the tree node cache ubifs: Create functions to embed a HMAC in a node ubifs: Add helper functions for authentication support ubifs: Add separate functions to init/crc a node ubifs: Format changes for authentication support ubifs: Store read superblock node ubifs: Drop write_node ...
2018-11-04Merge tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-1/+1
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Bugfix: - Fix build issues on architectures that don't provide 64-bit cmpxchg Cleanups: - Fix a spelling mistake" * tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: fix spelling mistake, EACCESS -> EACCES SUNRPC: Use atomic(64)_t for seq_send(64)
2018-11-03ext4: add missing brelse() update_backups()'s error pathVasily Averin1-1/+3
Fixes: ac27a0ec112a ("ext4: initial copy of files from ext3") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 2.6.19
2018-11-03ext4: add missing brelse() add_new_gdb_meta_bg()'s error pathVasily Averin1-2/+1
Fixes: 01f795f9e0d6 ("ext4: add online resizing support for meta_bg ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.7
2018-11-03ext4: add missing brelse() in set_flexbg_block_bitmap()'s error pathVasily Averin1-2/+4
Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...") Cc: stable@kernel.org # 3.3 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-11-03ext4: avoid potential extra brelse in setup_new_flex_group_blocks() Vasily Averin1-6/+2
Currently bh is set to NULL only during the first iteration of the for loop, and the pointer is not cleared once it is no longer in use. Rollback after an error can therefore lead to an extra brelse(bh) call, which decrements the bh reference counter and later triggers an unexpected warning in __brelse(). This patch moves the brelse() calls into the loop body so that the rollback path no longer needs to call brelse(). Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 3.3+
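The idea of releasing the buffer inside the loop body, so that the error path has nothing left to release, can be sketched with a toy reference counter. struct fake_bh, fake_getblk(), fake_brelse() and setup_blocks() are hypothetical stand-ins for this example and not the ext4 code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy buffer head with a reference count. */
struct fake_bh {
	int refcount;
};

static struct fake_bh *fake_getblk(struct fake_bh *b)
{
	b->refcount++;
	return b;
}

static void fake_brelse(struct fake_bh *b)
{
	if (!b)
		return;
	assert(b->refcount > 0);	/* a double release would trip here */
	b->refcount--;
}

/* Returns 0 on success, -1 if the (simulated) step 'fail_at' fails. */
static int setup_blocks(struct fake_bh *blocks, int n, int fail_at)
{
	for (int i = 0; i < n; i++) {
		struct fake_bh *bh = fake_getblk(&blocks[i]);
		int err = (i == fail_at) ? -1 : 0;

		/* Release in the loop body on both the success and error path ... */
		fake_brelse(bh);
		if (err)
			return err;	/* ... so rollback never touches a stale bh */
	}
	return 0;
}
```

Because every acquire is paired with a release in the same iteration, no pointer outlives the iteration and the rollback code cannot release it a second time.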
2018-11-03Merge tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds12-99/+530
Pull cifs fixes and updates from Steve French: "Three small fixes (one Kerberos related, one for stable, and another fixes an oops in xfstest 377), two helpful debugging improvements, three patches for cifs directio and some minor cleanup" * tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix signed/unsigned mismatch on aio_read patch cifs: don't dereference smb_file_target before null check CIFS: Add direct I/O functions to file_operations CIFS: Add support for direct I/O write CIFS: Add support for direct I/O read smb3: missing defines and structs for reparse point handling smb3: allow more detailed protocol info on open files for debugging smb3: on kerberos mount if server doesn't specify auth type use krb5 smb3: add trace point for tree connection cifs: fix spelling mistake, EACCESS -> EACCES cifs: fix return value for cifs_listxattr
2018-11-03bfs: add sanity check at bfs_fill_super()Tetsuo Handa1-3/+6
syzbot is reporting too large memory allocation at bfs_fill_super() [1]. Since the file system image is corrupted such that bfs_sb->s_start == 0, bfs_fill_super() is trying to allocate 8MB of contiguous memory. Fix this by adding a sanity check on bfs_sb->s_start, __GFP_NOWARN and printf(). [1] https://syzkaller.appspot.com/bug?id=16a87c236b951351374a84c8a32f40edbc034e96 Link: http://lkml.kernel.org/r/1525862104-3407-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+71c6b5d68e91149fc8a4@syzkaller.appspotmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tigran Aivazian <aivazian.tigran@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
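A sanity check of this kind might look like the sketch below. Everything in it is an assumption made for illustration: the constants (512-byte blocks, 64-byte inodes, root inode 2, an upper limit of 512), the way the last inode number is derived from s_start, and the function name are hypothetical and are not taken from the patch. The point is only that an on-disk value is range-checked before it is used to size an allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* All values below are assumptions for this sketch, not the real bfs limits. */
#define SKETCH_BLOCK_BYTES  512u
#define SKETCH_INODE_BYTES  64u
#define SKETCH_ROOT_INO     2u
#define SKETCH_MAX_LAST_INO 512u

/*
 * Derive the last inode number from an untrusted on-disk s_start value.
 * Returns 0 and stores the result, or -EINVAL if the value is implausible.
 */
static int last_inode_from_start(uint32_t s_start, uint32_t *lasti)
{
	uint32_t v;

	assert(lasti != NULL);

	/* Without this test, s_start == 0 would wrap the unsigned subtraction. */
	if (s_start < SKETCH_BLOCK_BYTES)
		return -EINVAL;

	v = (s_start - SKETCH_BLOCK_BYTES) / SKETCH_INODE_BYTES + SKETCH_ROOT_INO - 1;
	if (v > SKETCH_MAX_LAST_INO)
		return -EINVAL;

	*lasti = v;
	return 0;
}
```

Under these assumed constants, s_start == 0 would otherwise yield roughly 67 million inodes, and a one-bit-per-inode bitmap of that size is about 8MB, which is in line with the allocation size mentioned in the report.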
2018-11-03ocfs2: fix clusters leak in ocfs2_defrag_extent()Larry Chen1-0/+17
ocfs2_defrag_extent() might leak allocated clusters. When the file system has insufficient space, the number of claimed clusters might be less than the caller wants. If that happens, the original code might directly commit the transaction without returning clusters. This patch is based on code in ocfs2_add_clusters_in_btree(). [akpm@linux-foundation.org: include localalloc.h, reduce scope of data_ac] Link: http://lkml.kernel.org/r/20180904041621.16874-3-lchen@suse.com Signed-off-by: Larry Chen <lchen@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <ge.changwei@h3c.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-03ocfs2: dlmglue: clean up timestamp handlingArnd Bergmann1-17/+9
The handling of timestamps outside of the 1970..2038 range in the dlm glue is rather inconsistent: on 32-bit architectures, this has always wrapped around to negative timestamps in the 1902..1969 range, while on 64-bit kernels all timestamps are interpreted as positive 34 bit numbers in the 1970..2514 year range. Now that the VFS code handles 64-bit timestamps on all architectures, we can make the behavior more consistent here, and return the same result that we had on 64-bit already, making the file system y2038 safe in the process. Outside of dlmglue, it already uses 64-bit on-disk timestamps anyway, so that part is fine. For consistency, I'm changing ocfs2_pack_timespec() to clamp anything outside of the supported range to the minimum and maximum values. This avoids a possible ambiguity of values before 1970 in particular, which used to be interpreted as times at the end of the 2514 range previously. Link: http://lkml.kernel.org/r/20180619155826.4106487-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <ge.changwei@h3c.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
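The clamping behaviour described above can be sketched in userspace C. The 34-bit seconds field comes from the commit message; the layout used here (seconds in the upper 34 bits of a 64-bit word, nanoseconds in the remaining 30 bits) and all macro and function names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SEC_BITS   34
#define SKETCH_SEC_SHIFT  (64 - SKETCH_SEC_BITS)		/* 30 (assumed layout) */
#define SKETCH_NSEC_MASK  ((1ULL << SKETCH_SEC_SHIFT) - 1)
#define SKETCH_SEC_MAX    ((1LL << SKETCH_SEC_BITS) - 1)	/* 0x3ffffffff */

/* Pack seconds and nanoseconds, clamping seconds to [0, SKETCH_SEC_MAX]. */
static uint64_t pack_timespec(int64_t sec, uint32_t nsec)
{
	assert(nsec < 1000000000u);

	if (sec < 0)
		sec = 0;		/* before 1970: clamp to the minimum */
	if (sec > SKETCH_SEC_MAX)
		sec = SKETCH_SEC_MAX;	/* past the 34-bit range: clamp to the maximum */

	return ((uint64_t)sec << SKETCH_SEC_SHIFT) | (nsec & SKETCH_NSEC_MASK);
}

/* Unpack always yields a non-negative 34-bit seconds value. */
static void unpack_timespec(uint64_t packed, int64_t *sec, uint32_t *nsec)
{
	assert(sec != NULL && nsec != NULL);
	*sec = (int64_t)(packed >> SKETCH_SEC_SHIFT);
	*nsec = (uint32_t)(packed & SKETCH_NSEC_MASK);
}
```

Clamping on the pack side means a pre-1970 value can no longer alias a time near the end of the 34-bit range when it is unpacked, which is the ambiguity the message refers to.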
2018-11-03ocfs2: don't put and assigning null to bh allocated outsideChangwei Ge1-18/+59
ocfs2_read_blocks() and ocfs2_read_blocks_sync() are both used to read several blocks from disk. Currently, the input argument *bhs* may or may not be NULL, depending on the caller. If the function fails to read blocks from disk, the corresponding bh is put and set to NULL. This is not appropriate for a non-NULL input bh, because the caller doesn't even know that its bhs have been put and re-assigned. If the buffer head is managed by the caller, ocfs2_read_blocks() and ocfs2_read_blocks_sync() should not set it to NULL. Doing so causes the caller to access illegal memory and crash. Link: http://lkml.kernel.org/r/HK2PR06MB045285E0F4FBB561F9F2F9B3D5680@HK2PR06MB0452.apcprd06.prod.outlook.com Signed-off-by: Changwei Ge <ge.changwei@h3c.com> Reviewed-by: Guozhonghua <guozhonghua@h3c.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <ge.changwei@h3c.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-03ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entryChangwei Ge1-2/+1
Somehow, file system metadata was corrupted, which causes ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el(). According to the original design intention, if this happens we should skip the problematic block and continue to retrieve dir entries. But there is an obvious misuse of brelse in the related code. After ocfs2_check_dir_entry() fails, the current code just moves to the next position and uses the problematic buffer head again and again, during which the problematic buffer head is released multiple times. I suppose this is a serious, long-lived issue in ocfs2. It may also destabilize other file systems in use on the same host. So we should also consider backporting this patch into linux -stable. Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.com Signed-off-by: Changwei Ge <ge.changwei@h3c.com> Suggested-by: Changkuo Shi <shi.changkuo@h3c.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-03ocfs2: don't use iocb when EIOCBQUEUED returnsChangwei Ge1-2/+2
When -EIOCBQUEUED is returned, it means that aio_complete() will be called from dio_complete(), which runs asynchronously with respect to write_iter. Generally, I/O is much slower than executing instructions, but we still can't take the risk of accessing a freed iocb. And we do face a BUG crash issue. Using the crash tool, iocb is obviously freed already.

crash> struct -x kiocb ffff881a350f5900
struct kiocb {
  ki_filp = 0xffff881a350f5a80,
  ki_pos = 0x0,
  ki_complete = 0x0,
  private = 0x0,
  ki_flags = 0x0
}

And the backtrace shows:

ocfs2_file_write_iter+0xcaa/0xd00 [ocfs2]
aio_run_iocb+0x229/0x2f0
do_io_submit+0x291/0x540
SyS_io_submit+0x10/0x20
system_call_fastpath+0x16/0x75

Link: http://lkml.kernel.org/r/1523361653-14439-1-git-send-email-ge.changwei@h3c.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-03ocfs2: without quota support, avoid calling quota recoveryGuozhonghua1-17/+34
During recovery of a dead node by another node, quota recovery work will be queued. We should avoid calling quota recovery when quota is not supported, so check the quota flags. Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA401071AC9FB@H3CMLB12-EX.srv.huawei-3com.com Signed-off-by: guozhonghua <guozhonghua@h3c.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <ge.changwei@h3c.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-03ocfs2: remove ocfs2_is_o2cb_active()Gang He3-10/+1
Remove ocfs2_is_o2cb_active(). We have similar functions to identify which cluster stack is being used via osb->osb_cluster_stack. Secondly, the current implementation of ocfs2_is_o2cb_active() is not totally safe. Based on the design of stackglue, we need to get ocfs2_stack_lock before using ocfs2_stack related data structures, and that active_stack pointer can be NULL in the case of mount failure. Link: http://lkml.kernel.org/r/1495441079-11708-1-git-send-email-ghe@suse.com Signed-off-by: Gang He <ghe@suse.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Reviewed-by: Eric Ren <zren@suse.com> Acked-by: Changwei Ge <ge.changwei@h3c.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-02cifs: fix signed/unsigned mismatch on aio_read patchSteve French1-6/+11
The patch "CIFS: Add support for direct I/O read" had a signed/unsigned mismatch (ssize_t vs. size_t) in the return from one function. Similar trivial change in aio_write Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr>
2018-11-02cifs: don't dereference smb_file_target before null checkColin Ian King1-2/+5
There is a null check on dst_file->private_data which suggests it can be potentially null. However, before this check, pointer smb_file_target is derived from dst_file->private_data and dereferenced in the call to tlink_tcon, hence there is a potential null pointer dereference. Fix this by assigning smb_file_target and target_tcon after the null pointer sanity checks. Detected by CoverityScan, CID#1475302 ("Dereference before null check") Fixes: 04b38d601239 ("vfs: pull btrfs clone API to vfs layer") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steve French <stfrench@microsoft.com>
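The ordering fix described above, validate private_data first and only then derive pointers from it, can be sketched as follows. The struct and function names and the error codes are hypothetical stand-ins chosen for this example; they are not the cifs definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-file state and its connection object. */
struct fake_tcon {
	int id;
};

struct fake_file_info {
	struct fake_tcon *tcon;
};

struct fake_file {
	struct fake_file_info *private_data;
};

/* Returns the target connection id, or a negative errno-style value. */
static int copy_check(const struct fake_file *src, const struct fake_file *dst)
{
	const struct fake_file_info *target;

	assert(src != NULL && dst != NULL);

	/* Validate first ... */
	if (!src->private_data || !dst->private_data)
		return -EBADF;

	/* ... and only then derive and dereference pointers from private_data. */
	target = dst->private_data;
	if (!target->tcon)
		return -EINVAL;
	return target->tcon->id;
}
```

Moving the assignments below the checks costs nothing at runtime and removes the window in which a NULL private_data would be dereferenced.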
2018-11-02 | CIFS: Add direct I/O functions to file_operations | Long Li | 1 file | -6/+4
With direct read/write functions implemented, add them to file_operations. Direct I/O is used under two conditions: 1. When mounting with "cache=none", CIFS uses direct I/O for all user file data transfer. 2. When opening a file with O_DIRECT, CIFS uses direct I/O for all data transfer on this file. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-11-02 | CIFS: Add support for direct I/O write | Long Li | 2 files | -41/+164
With direct I/O write, user supplied buffers are pinned to the memory and data are transferred directly from user buffers to the transport layer. Change in v3: add support for kernel AIO Change in v4: Refactor common write code to __cifs_writev for direct and non-direct I/O. Retry on direct I/O failure. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-11-02 | CIFS: Add support for direct I/O read | Long Li | 3 files | -39/+192
With direct I/O read, we transfer the data directly from transport layer to the user data buffer. Change in v3: add support for kernel AIO Change in v4: Refactor common read code to __cifs_readv for direct and non-direct I/O. Retry on direct I/O failure. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-11-02 | smb3: missing defines and structs for reparse point handling | Steve French | 2 files | -0/+38
We were missing some structs from MS-FSCC relating to reparse point handling. Add them to protocol defines in smb2pdu.h Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2018-11-02 | smb3: allow more detailed protocol info on open files for debugging | Steve French | 4 files | -0/+65
In order to debug complex problems it is often helpful to have detailed information on the client and server view of the open file information. Add the ability for root to view the list of smb3 open files and dump the persistent handle and other info so that it can be more easily correlated with server logs.
Sample output from "cat /proc/fs/cifs/open_files"
    # Version:1
    # Format:
    # <tree id> <persistent fid> <flags> <count> <pid> <uid> <filename> <mid>
    0x5 0x800000378 0x8000 1 7704 0 some-file 0x14
    0xcb903c0c 0x84412e67 0x8000 1 7754 1001 rofile 0x1a6d
    0xcb903c0c 0x9526b767 0x8000 1 7720 1000 file 0x1a5b
    0xcb903c0c 0x9ce41a21 0x8000 1 7715 0 smallfile 0xd67
Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
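A record in the format shown in the sample output could be parsed along these lines. This is a user-space sketch under stated assumptions: `struct open_file_entry` and `parse_open_file_line` are hypothetical names, the field widths are guesses, and filenames are assumed to contain no whitespace. Only the eight-column layout comes from the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record type mirroring the "# Format:" header columns. */
struct open_file_entry {
    unsigned long long tree_id;
    unsigned long long persistent_fid;
    unsigned long long flags;
    unsigned int count;
    int pid;
    unsigned int uid;
    char filename[256];     /* assumes no whitespace in the name */
    unsigned long long mid;
};

/* Returns 0 on success, -1 for comment lines or malformed records. */
static int parse_open_file_line(const char *line, struct open_file_entry *e)
{
    int n;

    assert(line != NULL && e != NULL);

    if (line[0] == '#')     /* "# Version:", "# Format:" and column header */
        return -1;

    /* %llx accepts an optional "0x" prefix, as used in the sample output. */
    n = sscanf(line, "%llx %llx %llx %u %d %u %255s %llx",
               &e->tree_id, &e->persistent_fid, &e->flags,
               &e->count, &e->pid, &e->uid, e->filename, &e->mid);
    return n == 8 ? 0 : -1;
}
```

A caller would typically read the file line by line with `fgets` and skip every line for which the function returns -1.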
2018-11-02 | smb3: on kerberos mount if server doesn't specify auth type use krb5 | Steve French | 1 file | -2/+4
Some servers (e.g. Azure) do not include a spnego blob in the SMB3 negotiate protocol response, so on kerberos mounts ("sec=krb5") we can fail, as we expected the server to list its supported auth types (OIDs in the spnego blob in the negprot response). Change this so that on krb5 mounts we default to trying krb5 if the server doesn't list its supported protocol mechanisms. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: Stable <stable@vger.kernel.org>
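The fallback rule described above can be modelled as a small decision function. This is a sketch of the logic only, not the kernel's implementation: `struct server_mechs` and `krb5_allowed` are hypothetical names, and the model covers only the "sec=krb5" case that the commit message discusses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what the negotiate response told the client. */
struct server_mechs {
    bool sent_spnego_blob;  /* did the response carry a SPNEGO blob at all? */
    bool offers_krb5;       /* meaningful only if a blob was sent */
};

/* For a "sec=krb5" mount: try Kerberos if the server lists it, or if the
 * server listed nothing at all (the new default described above). */
static bool krb5_allowed(const struct server_mechs *srv)
{
    assert(srv != NULL);

    if (!srv->sent_spnego_blob)
        return true;        /* no mechanism list: default to trying krb5 */
    return srv->offers_krb5;
}
```

A server that sends a blob without Kerberos in it is still refused in this model; only the missing-list case changes.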
2018-11-02 | smb3: add trace point for tree connection | Steve French | 2 files | -1/+44
In debugging certain scenarios, especially reconnect cases, it can be helpful to have a dynamic trace point for the result of tree connect. See sample output below from a reconnect event. The new event is 'smb3_tcon'
    TASK-PID CPU# |||| TIMESTAMP FUNCTION
    | | | |||| | |
    cifsd-6071 [001] .... 2659.897923: smb3_reconnect: server=localhost current_mid=0xa
    kworker/1:1-71 [001] .... 2666.026342: smb3_cmd_done: sid=0x0 tid=0x0 cmd=0 mid=0
    kworker/1:1-71 [001] .... 2666.026576: smb3_cmd_err: sid=0xc49e1787 tid=0x0 cmd=1 mid=1 status=0xc0000016 rc=-5
    kworker/1:1-71 [001] .... 2666.031677: smb3_cmd_done: sid=0xc49e1787 tid=0x0 cmd=1 mid=2
    kworker/1:1-71 [001] .... 2666.031921: smb3_cmd_done: sid=0xc49e1787 tid=0x6e78f05f cmd=3 mid=3
    kworker/1:1-71 [001] .... 2666.031923: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\test rc=0
    kworker/1:1-71 [001] .... 2666.032097: smb3_cmd_done: sid=0xc49e1787 tid=0x6e78f05f cmd=11 mid=4
    kworker/1:1-71 [001] .... 2666.032265: smb3_cmd_done: sid=0xc49e1787 tid=0x7912332f cmd=3 mid=5
    kworker/1:1-71 [001] .... 2666.032266: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\IPC$ rc=0
    kworker/1:1-71 [001] .... 2666.032386: smb3_cmd_done: sid=0xc49e1787 tid=0x7912332f cmd=11 mid=6
Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>