linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2022-03-22	fs: allocate inode by using alloc_inode_sb()	Muchun Song	1	-1/+1
	The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb(). Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4] Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Alex Shi <alexs@kernel.org> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kari Argillander <kari.argillander@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-24	fsnotify: fix fsnotify hooks in pseudo filesystems	Amir Goldstein	1	-2/+2
	Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify will have access to a positive dentry. This allowed a race where opening the deleted file via cached dentry is now possible after receiving the IN_DELETE event. To fix the regression in pseudo filesystems, convert d_delete() calls to d_drop() (see commit 46c46f8df9aa ("devpts_pty_kill(): don't bother with d_delete()") and move the fsnotify hook after d_drop(). Add a missing fsnotify_unlink() hook in nfsdfs that was found during the audit of fsnotify hooks in pseudo filesystems. Note that the fsnotify hooks in simple_recursive_removal() follow d_invalidate(), so they require no change. Link: https://lore.kernel.org/r/20220120215305.282577-2-amir73il@gmail.com Reported-by: Ivan Delalande <colona@arista.com> Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/ Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-08-09	SUNRPC: Convert rpc_client refcount to use refcount_t	Trond Myklebust	1	-1/+1
	There are now tools in the refcount library that allow us to convert the client shutdown code. Reported-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-02-01	SUNRPC: Fix fall-through warnings for Clang	Gustavo A. R. Silva	1	-0/+1
	In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple warnings by explicitly adding multiple break statements instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-17	net: datagram: fix some kernel-doc markups	Mauro Carvalho Chehab	1	-1/+2
	Some identifiers have different names between their prototypes and the kernel-doc markup. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-17	sunrpc: destroy rpc_inode_cachep after unregister_filesystem	Dan Aloni	1	-1/+1
	Better to unregister the file system before destroying the kmem_cache cache of the inodes, so that the inodes are freed before we are trying to destroy it. Otherwise, kmem_cache yells that some objects are live. Signed-off-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-06-26	sunrpc: fixed rollback in rpc_gssd_dummy_populate()	Vasily Averin	1	-0/+1
	__rpc_depopulate(gssd_dentry) was lost on error path cc: stable@vger.kernel.org Fixes: commit 4b9a445e3eeb ("sunrpc: create a new dummy pipe for gssd to hold open") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-12-04	kernel/notifier.c: remove blocking_notifier_chain_cond_register()	Xiaoming Ni	1	-1/+1
	blocking_notifier_chain_cond_register() does not consider system_booting state, which is the only difference between this function and blocking_notifier_cain_register(). This can be a bug and is a piece of duplicate code. Delete blocking_notifier_chain_cond_register() Link: http://lkml.kernel.org/r/1568861888-34045-4-git-send-email-nixiaoming@huawei.com Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Sam Protsenko <semen.protsenko@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vasily Averin <vvs@virtuozzo.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-05	new helper: get_tree_keyed()	Al Viro	1	-2/+1
	For vfs_get_keyed_super users. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-19	Merge branch 'work.mount0' of ↵	Linus Torvalds	1	-8/+26
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount updates from Al Viro: "The first part of mount updates. Convert filesystems to use the new mount API" * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) mnt_init(): call shmem_init() unconditionally constify ksys_mount() string arguments don't bother with registering rootfs init_rootfs(): don't bother with init_ramfs_fs() vfs: Convert smackfs to use the new mount API vfs: Convert selinuxfs to use the new mount API vfs: Convert securityfs to use the new mount API vfs: Convert apparmorfs to use the new mount API vfs: Convert openpromfs to use the new mount API vfs: Convert xenfs to use the new mount API vfs: Convert gadgetfs to use the new mount API vfs: Convert oprofilefs to use the new mount API vfs: Convert ibmasmfs to use the new mount API vfs: Convert qib_fs/ipathfs to use the new mount API vfs: Convert efivarfs to use the new mount API vfs: Convert configfs to use the new mount API vfs: Convert binfmt_misc to use the new mount API convenience helper: get_tree_single() convenience helper get_tree_nodev() vfs: Kill sget_userns() ...
2019-06-20	rpc_pipefs: call fsnotify_{unlink,rmdir}() hooks	Amir Goldstein	1	-0/+4
	This will allow generating fsnotify delete events after the fsnotify_nameremove() hook is removed from d_delete(). Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-05-25	vfs: Convert rpc_pipefs to use the new mount API	David Howells	1	-8/+26
	Convert the rpc_pipefs filesystem to the new internal mount API as the old one will be obsoleted and removed. This allows greater flexibility in communication of mount parameters between userspace, the VFS and the filesystem. See Documentation/filesystems/mount_api.txt for more information. Signed-off-by: David Howells <dhowells@redhat.com> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: Anna Schumaker <anna.schumaker@netapp.com> cc: "J. Bruce Fields" <bfields@fieldses.org> cc: Jeff Layton <jlayton@kernel.org> cc: linux-nfs@vger.kernel.org
2019-05-21	treewide: Add SPDX license identifier for missed files	Thomas Gleixner	1	-0/+1
	Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-01	rpcpipe: switch to ->free_inode()	Al Viro	1	-9/+2
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-02	sunrpc: convert to DEFINE_SHOW_ATTRIBUTE	Yangtao Li	1	-16/+3
	Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-06-04	Merge branch 'work.misc' of ↵	Linus Torvalds	1	-16/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Misc bits and pieces not fitting into anything more specific" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: delete unnecessary assignment in vfs_listxattr Documentation: filesystems: update filesystem locking documentation vfs: namei: use path_equal() in follow_dotdot() fs.h: fix outdated comment about file flags __inode_security_revalidate() never gets NULL opt_dentry make xattr_getsecurity() static vfat: simplify checks in vfat_lookup() get rid of dead code in d_find_alias() it's SB_BORN, not MS_BORN... msdos_rmdir(): kill BS comment remove rpc_rmdir() fs: avoid fdput() after failed fdget() in vfs_dedupe_file_range()
2018-04-16	remove rpc_rmdir()	Al Viro	1	-16/+0
	no users since 2014... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-04-15	rpc_pipefs: fix double-dput()	Al Viro	1	-0/+1
	if we ever hit rpc_gssd_dummy_depopulate() dentry passed to it has refcount equal to 1. __rpc_rmpipe() drops it and dput() done after that hits an already freed dentry. Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-03-26	net: Use octal not symbolic permissions	Joe Perches	1	-21/+21
	Prefer the direct use of octal for permissions. Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace and some typing. Miscellanea: o Whitespace neatening around these conversions. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-11	vfs: do bulk POLL* -> EPOLL* replacement	Linus Torvalds	1	-3/+3
	This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script: for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do L=`git grep -l -w POLL$V \| grep -v '^t' \| grep -v /um/ \| grep -v '^sa' \| grep -v '/poll.h$'\|grep -v '^D'` for f in $L; do sed -i "-es/^$[^\"]$$\<POLL$V\>$/\\1E\\2/" $f; done done with de-mangling cleanups yet to come. NOTE! On almost all architectures, the EPOLL constants have the same values as the POLL* constants do. But they keyword here is "almost". For various bad reasons they aren't the same, and epoll() doesn't actually work quite correctly in some cases due to this on Sparc et al. The next patch from Al will sort out the final differences, and we should be all done. Scripted-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-27	net: annotate ->poll() instances	Al Viro	1	-2/+2
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-11-17	sunrpc: remove net pointer from messages	Vasily Averin	1	-4/+4
	Publishing of net pointer is not safe, use net->ns.inum as net ID [ 171.391947] RPC: created new rpcb local clients (rpcb_local_clnt: ..., rpcb_local_clnt4: ...) for net f00001e7 [ 171.767188] NFSD: starting 90-second grace period (net f00001e7) Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-27	fs: Replace CURRENT_TIME with current_time() for inode timestamps	Deepa Dinamani	1	-1/+1
	CURRENT_TIME macro is not appropriate for filesystems as it doesn't use the right granularity for filesystem timestamps. Use current_time() instead. CURRENT_TIME is also not y2038 safe. This is also in preparation for the patch that transitions vfs timestamps to use 64 bit time and hence make them y2038 safe. As part of the effort current_time() will be extended to do range checks. Hence, it is necessary for all file system timestamps to use current_time(). Also, current_time() will be transitioned along with vfs to be y2038 safe. Note that whenever a single call to current_time() is used to change timestamps in different inodes, it is because they share the same time granularity. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Felipe Balbi <balbi@kernel.org> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-23	vfs: Pass data, ns, and ns->userns to mount_ns	Eric W. Biederman	1	-4/+4
	Today what is normally called data (the mount options) is not passed to fill_super through mount_ns. Pass the mount options and the namespace separately to mount_ns so that filesystems such as proc that have mount options, can use mount_ns. Pass the user namespace to mount_ns so that the standard permission check that verifies the mounter has permissions over the namespace can be performed in mount_ns instead of in each filesystems .mount method. Thus removing the duplication between mqueuefs and proc in terms of permission checks. The extra permission check does not currently affect the rpc_pipefs filesystem and the nfsd filesystem as those filesystems do not currently allow unprivileged mounts. Without unpvileged mounts it is guaranteed that the caller has already passed capable(CAP_SYS_ADMIN) which guarantees extra permission check will pass. Update rpc_pipefs and the nfsd filesystem to ensure that the network namespace reference is always taken in fill_super and always put in kill_sb so that the logic is simpler and so that errors originating inside of fill_super do not cause a network namespace leak. Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-04-04	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros	Kirill A. Shutemov	1	-2/+2
	PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced long time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-22	wrappers for ->i_mutex access	Al Viro	1	-30/+30
	parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-14	kmemcg: account certain kmem allocations to memcg	Vladimir Davydov	1	-1/+1
	Mark those kmem allocations that are known to be easily triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to memcg. For the list, see below: - threadinfo - task_struct - task_delay_info - pid - cred - mm_struct - vm_area_struct and vm_region (nommu) - anon_vma and anon_vma_chain - signal_struct - sighand_struct - fs_struct - files_struct - fdtable and fdtable->full_fds_bits - dentry and external_name - inode for all filesystems. This is the most tedious part, because most filesystems overwrite the alloc_inode method. The list is far from complete, so feel free to add more objects. Nevertheless, it should be close to "account everything" approach and keep most workloads within bounds. Malevolent users will be able to breach the limit, but this was possible even with the former "account everything" approach (simply because it did not account everything in fact). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15	VFS: net/: d_inode() annotations	David Howells	1	-16/+16
	socket inodes and sunrpc filesystems - inodes owned by that code Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-07-12	rpc_pipe: Drop memory allocation cast	Himangi Saraogi	1	-1/+1
	Drop cast on the result of kmalloc and similar functions. The semantic patch that makes this change is as follows: // <smpl> @@ type T; @@ - (T *) ($kmalloc\\|kzalloc\\|kcalloc\\|kmem_cache_alloc\\|kmem_cache_zalloc\\| kmem_cache_alloc_node\\|kmalloc_node\\|kzalloc_node$(...)) // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2013-12-10	rpc_pipe: fix cleanup of dummy gssd directory when notification fails	Jeff Layton	1	-1/+13
	Currently, it could leak dentry references in some cases. Make sure we clean up properly. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-12-06	sunrpc: add an "info" file for the dummy gssd pipe	Jeff Layton	1	-1/+49
	rpc.gssd expects to see an "info" file in each clntXX dir. Since adding the dummy gssd pipe, users that run rpc.gssd see a lot of these messages spamming the logs: rpc.gssd[508]: ERROR: can't open /var/lib/nfs/rpc_pipefs/gssd/clntXX/info: No such file or directory rpc.gssd[508]: ERROR: failed to read service info Add a dummy gssd/clntXX/info file to help silence these messages. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-12-06	rpc_pipe: remove the clntXX dir if creating the pipe fails	Jeff Layton	1	-0/+2
	In the event that we create the gssd/clntXX dir, but the pipe creation subsequently fails, then we should remove the clntXX dir before returning. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-12-06	sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check	Jeff Layton	1	-4/+10
	Now that we have a more reliable method to tell if gssd is running, we can replace the sn->gssd_running flag with a function that will query to see if it's up and running. There's also no need to attempt an upcall that we know will fail, so just return -EACCES if gssd isn't running. Finally, fix the warn_gss() message not to claim that that the upcall timed out since we don't necesarily perform one now when gssd isn't running, and remove the extraneous newline from the message. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-12-06	sunrpc: create a new dummy pipe for gssd to hold open	Jeff Layton	1	-3/+90
	rpc.gssd will naturally hold open any pipe named /clnt/gssd that shows up under rpc_pipefs. That behavior gives us a reliable mechanism to tell whether it's actually running or not. Create a new toplevel "gssd" directory in rpc_pipefs when it's mounted. Under that directory create another directory called "clntXX", and then within that a pipe called "gssd". We'll never send an upcall along that pipe, and any downcall written to it will just return -EINVAL. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-15	consolidate simple ->d_delete() instances	Al Viro	1	-10/+1
	Rename simple_delete_dentry() to always_delete_dentry() and export it. Export simple_dentry_operations, while we are at it, and get rid of their duplicates Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-10-24	sunrpc: switch to %pd	Al Viro	1	-6/+6
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-01	SUNRPC: Add a helper to allow sharing of rpc_pipefs directory objects	Trond Myklebust	1	-0/+35
	Add support for looking up existing objects and creating new ones if there is no match. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-01	SUNRPC: Remove the rpc_client->cl_dentry	Trond Myklebust	1	-6/+7
	It is now redundant. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-30	SUNRPC: Add a framework to clean up management of rpc_pipefs directories	Trond Myklebust	1	-2/+132
	The current system requires everyone to set up notifiers, manage directory locking, etc. What we really want to do is have the rpc_client create its directory, and then create all the entries. This patch will allow the RPCSEC_GSS and NFS code to register all the objects that they want to have appear in the directory, and then have the sunrpc code call them back to actually create/destroy their pipefs dentries when the rpc_client creates/destroys the parent. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-30	SUNRPC: Deprecate rpc_client->cl_protname	Trond Myklebust	1	-1/+1
	It just duplicates the cl_program->name, and is not used in any fast paths where the extra dereference will cause a hit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-23	rpc_pipe: convert back to simple_dir_inode_operations	Jeff Layton	1	-18/+1
	Now that Al has fixed simple_lookup to account for the case where sb->s_d_op is set, there's no need to keep our own special lookup op. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-14	Merge branch 'for-linus' of ↵	Linus Torvalds	1	-27/+13
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs stuff from Al Viro: "O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including making simple_lookup() usable for filesystems with non-NULL s_d_op, which allows us to get rid of quite a bit of ugliness" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sunrpc: now we can just set ->s_d_op cgroup: we can use simple_lookup() now efivarfs: we can use simple_lookup() now make simple_lookup() usable for filesystems that set ->s_d_op configfs: don't open-code d_alloc_name() __rpc_lookup_create_exclusive: pass string instead of qstr rpc_create_*_dir: don't bother with qstr llist: llist_add() can use llist_add_batch() llist: fix/simplify llist_add() and llist_add_batch() fput: turn "list_head delayed_fput_list" into llist_head fs/file_table.c:fput(): add comment Safer ABI for O_TMPFILE
2013-07-14	sunrpc: now we can just set ->s_d_op	Al Viro	1	-3/+2
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14	__rpc_lookup_create_exclusive: pass string instead of qstr	Al Viro	1	-25/+9
	... and use d_hash_and_lookup() instead of open-coding it, for fsck sake... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14	rpc_create_*_dir: don't bother with qstr	Al Viro	1	-6/+8
	just pass the name Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-09	rpc_pipe: rpc_dir_inode_operations can be static	Fengguang Wu	1	-1/+1
	Hi Jeff, FYI, there are new sparse warnings show up in tree: git://git.linux-nfs.org/projects/trondmy/linux-nfs.git nfs-for-next head: 296afe1f58d55fd56ed85daaafafcfee39f59ece commit: 76fa66657900071016f2bae61de28f059f3f2abf [2/5] rpc_pipe: set dentry operations at d_alloc time >> net/sunrpc/rpc_pipe.c:496:31: sparse: symbol 'rpc_dir_inode_operations' was not declared. Should it be static? Please consider folding the attached diff :-) Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-09	rpc_pipe: set dentry operations at d_alloc time	Jeff Layton	1	-5/+20
	Currently the way these get set is a little convoluted. If the dentry is allocated via lookup from userland, then it gets set by simple_lookup. If it gets allocated when the kernel is populating the directory, then it gets set via __rpc_lookup_create_exclusive, which has to check whether they might already be set. Between both of these, this ensures that all dentries have their d_op pointer set. Instead of doing that, just have them set at d_alloc time by pointing sb->s_d_op at them. With that change, we no longer want the lookup op to set them, so we must move to using our own lookup routine. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-28	SUNRPC: fix races on PipeFS UMOUNT notifications	Stanislav Kinsbursky	1	-1/+1
	CPU#0 CPU#1 ----------------------------- ----------------------------- rpc_kill_sb sn->pipefs_sb = NULL rpc_release_client (UMOUNT_EVENT) rpc_free_auth rpc_pipefs_event rpc_get_client_for_event !atomic_inc_not_zero(cl_count) <skip the client> atomic_inc(cl_count) rpc_free_client rpc_clnt_remove_pipedir <skip client dir removing> To fix this, this patch does the following: 1) Calls RPC_PIPEFS_UMOUNT notification with sn->pipefs_sb_lock being held. 2) Removes SUNRPC client from the list AFTER pipes destroying. 3) Doesn't hold RPC client on notification: if client in the list, then it can't be destroyed while sn->pipefs_sb_lock in hold by notification caller. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-28	SUNRPC: fix races on PipeFS MOUNT notifications	Stanislav Kinsbursky	1	-0/+3
	Below are races, when RPC client can be created without PiepFS dentries CPU#0 CPU#1 ----------------------------- ----------------------------- rpc_new_client rpc_fill_super rpc_setup_pipedir mutex_lock(&sn->pipefs_sb_lock) rpc_get_sb_net == NULL (no per-net PipeFS superblock) sn->pipefs_sb = sb; notifier_call_chain(MOUNT) (client is not in the list) rpc_register_client (client without pipes dentries) To fix this patch: 1) makes PipeFS mount notification call with pipefs_sb_lock being held. 2) releases pipefs_sb_lock on new SUNRPC client creation only after registration. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-18	rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set	Jeff Layton	1	-1/+2
	We had a report of a reproducible WARNING: [ 1360.039358] ------------[ cut here ]------------ [ 1360.043978] WARNING: at fs/dcache.c:1355 d_set_d_op+0x8d/0xc0() [ 1360.049880] Hardware name: HP Z200 Workstation [ 1360.054308] Modules linked in: nfsv4 nfs dns_resolver fscache nfsd auth_rpcgss nfs_acl lockd sunrpc sg acpi_cpufreq mperf coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_intel snd_hda_codec hp_wmi crc32c_intel snd_hwdep e1000e snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd sparse_keymap rfkill soundcore serio_raw ptp iTCO_wdt pps_core pcspkr iTCO_vendor_support mei microcode lpc_ich mfd_core wmi xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif radeon i2c_algo_bit drm_kms_helper ttm ahci libahci drm i2c_core libata dm_mirror dm_region_hash dm_log dm_mod [last unloaded: auth_rpcgss] [ 1360.107406] Pid: 8814, comm: mount.nfs4 Tainted: G I -------------- 3.9.0-0.55.el7.x86_64 #1 [ 1360.116771] Call Trace: [ 1360.119219] [<ffffffff810610c0>] warn_slowpath_common+0x70/0xa0 [ 1360.125208] [<ffffffff810611aa>] warn_slowpath_null+0x1a/0x20 [ 1360.131025] [<ffffffff811af46d>] d_set_d_op+0x8d/0xc0 [ 1360.136159] [<ffffffffa05a7d6f>] __rpc_lookup_create_exclusive+0x4f/0x80 [sunrpc] [ 1360.143710] [<ffffffffa05a8cc6>] rpc_mkpipe_dentry+0x86/0x170 [sunrpc] [ 1360.150311] [<ffffffffa062a7b6>] nfs_idmap_new+0x96/0x130 [nfsv4] [ 1360.156475] [<ffffffffa062e7cd>] nfs4_init_client+0xad/0x2d0 [nfsv4] [ 1360.162902] [<ffffffff812f02df>] ? idr_get_empty_slot+0x16f/0x3c0 [ 1360.169062] [<ffffffff812f0582>] ? idr_mark_full+0x52/0x60 [ 1360.174615] [<ffffffff812f0699>] ? idr_alloc+0x79/0xe0 [ 1360.179826] [<ffffffffa0598081>] ? __rpc_init_priority_wait_queue+0x81/0xc0 [sunrpc] [ 1360.187635] [<ffffffffa05980f3>] ? rpc_init_wait_queue+0x13/0x20 [sunrpc] [ 1360.194493] [<ffffffffa05d05da>] nfs_get_client+0x27a/0x350 [nfs] [ 1360.200666] [<ffffffffa062e438>] nfs4_set_client.isra.8+0x78/0x100 [nfsv4] [ 1360.207624] [<ffffffffa062f2f3>] nfs4_create_server+0xf3/0x3a0 [nfsv4] [ 1360.214222] [<ffffffffa06284be>] nfs4_remote_mount+0x2e/0x60 [nfsv4] [ 1360.220644] [<ffffffff8119ea79>] mount_fs+0x39/0x1b0 [ 1360.225691] [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20 [ 1360.231348] [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0 [ 1360.236822] [<ffffffffa0628396>] nfs_do_root_mount+0x86/0xc0 [nfsv4] [ 1360.243246] [<ffffffffa06287b4>] nfs4_try_mount+0x44/0xc0 [nfsv4] [ 1360.249410] [<ffffffffa05d1457>] ? get_nfs_version+0x27/0x80 [nfs] [ 1360.255659] [<ffffffffa05db985>] nfs_fs_mount+0x5c5/0xd10 [nfs] [ 1360.261650] [<ffffffffa05dc550>] ? nfs_clone_super+0x140/0x140 [nfs] [ 1360.268074] [<ffffffffa05da8e0>] ? param_set_portnr+0x60/0x60 [nfs] [ 1360.274406] [<ffffffff8119ea79>] mount_fs+0x39/0x1b0 [ 1360.279443] [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20 [ 1360.285088] [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0 [ 1360.290556] [<ffffffff811b9f5d>] do_mount+0x1fd/0xa00 [ 1360.295677] [<ffffffff81137dee>] ? __get_free_pages+0xe/0x50 [ 1360.301405] [<ffffffff811b9be6>] ? copy_mount_options+0x36/0x170 [ 1360.307479] [<ffffffff811ba7e3>] sys_mount+0x83/0xc0 [ 1360.312515] [<ffffffff8160ad59>] system_call_fastpath+0x16/0x1b [ 1360.318503] ---[ end trace 8fa1f4cbc36094a7 ]--- The problem is that we're ending up in __rpc_lookup_create_exclusive with a negative dentry that already has d_op set. A little debugging has shown that when we hit this, the d_ops are already set to simple_dentry_operations. I believe that what's happening is that during a mount, idmapd is racing in and doing a lookup of /var/lib/nfs/rpc_pipefs/nfs/clnt???/idmap. Before that dentry reference is released, the kernel races in to create that file and finds the new negative dentry, which already has the d_op set. This patch just avoids setting the d_op if it's already set. simple_dentry_operations and rpc_dentry_operations are functionally equivalent so it shouldn't matter which one it's set to. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>