linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message	Author	Files	Lines
2011-07-20	vfs: increase shrinker batch size	Dave Chinner	1	-0/+1
	Now that the per-sb shrinker is responsible for shrinking 2 or more caches, increase the batch size to keep economies of scale for shrinking each cache. Increase the shrinker batch size to 1024 objects. To allow for a large increase in batch size, add a conditional reschedule to prune_icache_sb() so that we don't hold the LRU spin lock for too long. This mirrors the behaviour of the __shrink_dcache_sb(), and allows us to increase the batch size without needing to worry about problems caused by long lock hold times. To ensure that filesystems using the per-sb shrinker callouts don't cause problems, document that the object freeing method must reschedule appropriately inside loops. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
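The batching idea in the message above can be sketched in userspace C. This is only an illustrative model, not the kernel's prune_icache_sb(): the struct, the function names and the "yield every 128 objects" policy are assumptions made for the example (the kernel decides whether to reschedule by other means). Only the batch size of 1024 comes from the commit message.

```c
#include <assert.h>

#define DEMO_BATCH         1024  /* batch size named in the commit message */
#define DEMO_RESCHED_EVERY 128   /* assumption: yield point every 128 objects */

/* Hypothetical LRU model: only counts are tracked, no real objects. */
struct demo_lru {
    long nr_items;    /* objects currently on the LRU */
    long nr_resched;  /* how many times the loop offered to reschedule */
};

/* Stand-in for "drop the lock and reschedule if needed". */
static void demo_cond_resched(struct demo_lru *lru)
{
    lru->nr_resched++;
}

/* Free up to nr_to_scan objects, offering to reschedule inside the loop. */
long demo_prune(struct demo_lru *lru, long nr_to_scan)
{
    long freed = 0;

    assert(nr_to_scan >= 0);
    while (nr_to_scan-- > 0 && lru->nr_items > 0) {
        lru->nr_items--;
        freed++;
        if (freed % DEMO_RESCHED_EVERY == 0)
            demo_cond_resched(lru);
    }
    return freed;
}
```

Calling demo_prune(&lru, DEMO_BATCH) on an LRU of 3000 objects frees one full batch while passing through several yield points, which is the property the larger batch size relies on.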
2011-07-20	superblock: add filesystem shrinker operations	Dave Chinner	1	-12/+33
	Now we have a per-superblock shrinker implementation, we can add a filesystem specific callout to it to allow filesystem internal caches to be shrunk by the superblock shrinker. Rather than perpetuate the multipurpose shrinker callback API (i.e. nr_to_scan == 0 meaning "tell me how many objects are freeable in the cache"), two operations will be added. The first will return the number of objects that are freeable, the second is the actual shrinker call. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
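The difference between the multipurpose callback and the two separate operations can be sketched as follows. This is a userspace model under stated assumptions: every name here (demo_cache, demo_shrink_multipurpose, demo_count_objects, demo_free_objects, demo_shrink_ops) is hypothetical and chosen for illustration, not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache: only the number of freeable objects is modelled. */
struct demo_cache {
    long nr_freeable;
};

/* Old style: one callback, where nr_to_scan == 0 means "just count". */
long demo_shrink_multipurpose(struct demo_cache *c, long nr_to_scan)
{
    assert(c != NULL && nr_to_scan >= 0);
    if (nr_to_scan == 0)
        return c->nr_freeable;        /* query mode */
    if (nr_to_scan > c->nr_freeable)
        nr_to_scan = c->nr_freeable;
    c->nr_freeable -= nr_to_scan;     /* shrink mode */
    return c->nr_freeable;
}

/* New style, operation 1: report how many objects are freeable. */
long demo_count_objects(struct demo_cache *c)
{
    return c->nr_freeable;
}

/* New style, operation 2: free up to nr objects. */
void demo_free_objects(struct demo_cache *c, long nr)
{
    assert(nr >= 0);
    if (nr > c->nr_freeable)
        nr = c->nr_freeable;
    c->nr_freeable -= nr;
}

/* The two operations grouped the way an ops table might group them. */
struct demo_shrink_ops {
    long (*count)(struct demo_cache *c);
    void (*free)(struct demo_cache *c, long nr);
};
```

With two operations, a caller never has to encode "count only" as a magic argument value, and each function has a single meaning.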
2011-07-20	inode: remove iprune_sem	Dave Chinner	1	-21/+0
	Now that we have per-sb shrinkers with a lifecycle that is a subset of the superblock lifecycle and can reliably detect a filesystem being unmounted, there is no longer any race condition for the iprune_sem to protect against. Hence we can remove it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	superblock: introduce per-sb cache shrinker infrastructure	Dave Chinner	3	-218/+71
	With context based shrinkers, we can implement a per-superblock shrinker that shrinks the caches attached to the superblock. We currently have global shrinkers for the inode and dentry caches that split up into per-superblock operations via a coarse proportioning method that does not batch very well. The global shrinkers also have a dependency - dentries pin inodes - so we have to be very careful about how we register the global shrinkers so that the implicit call order is always correct. With a per-sb shrinker callout, we can encode this dependency directly into the per-sb shrinker, hence avoiding the need for strictly ordering shrinker registrations. We also have no need for any proportioning code, as the shrinker subsystem already provides this functionality across all shrinkers. Allowing the shrinker to operate on a single superblock at a time means that we do fewer superblock list traversals and less locking, and reclaim should batch more effectively. This should result in less CPU overhead for reclaim and potentially faster reclaim of items from each filesystem. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	xfs: add size update tracepoint to IO completion	Dave Chinner	2	-4/+9
	For improving insight into IO completion behaviour. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-20	xfs: convert AIL cursors to use struct list_head	Dave Chinner	2	-55/+28
	The list of active AIL cursors uses a roll-your-own linked list with special casing for the AIL push cursor. Simplify this code by replacing the list with standard struct list_head lists, and use a separate list_head to track the active cursors. This allows us to treat the AIL push cursor as a generic cursor rather than as a special case, further simplifying the code. Further, fix the duplicate push cursor initialisation that the special case handling was hiding, and clean up all the comments around the active cursor list handling. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-20	xfs: remove confusing ail cursor wrapper	Dave Chinner	1	-31/+19
	xfs_trans_ail_cursor_set() doesn't set the cursor to the current log item, it sets it to the next item. There is already a function for doing this - xfs_trans_ail_cursor_next() - and the _set function is simply a two line wrapper. Remove it and open code the setting of the cursor in the two locations that call it to remove the confusion. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-20	xfs: use a cursor for bulk AIL insertion	Dave Chinner	3	-28/+118
	Delayed logging can insert tens of thousands of log items into the AIL at the same LSN. When the committing of log commit records occurs, we can get insertions occurring at an LSN that is not at the end of the AIL. If there are thousands of items in the AIL on the tail LSN, each insertion has to walk the AIL to find the correct place to insert the new item into the AIL. This can consume large amounts of CPU time and block other operations from occurring while the traversals are in progress. To avoid this repeated walk, use an AIL cursor to record where we should be inserting the new items into the AIL without having to repeat the walk. The cursor infrastructure already provides this functionality for push walks, so this is a simple extension of existing code. While this will not avoid the initial walk, it will avoid repeating it tens of thousands of times during a single checkpoint commit. This version includes logic improvements from Christoph Hellwig. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
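The cost being avoided can be illustrated with a small userspace model of an LSN-ordered list. This is a simplified sketch, not the XFS implementation: the structures, the function names and the step counter are all hypothetical, and the sketch assumes the caller only reuses a cursor for items that share one LSN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical log item on a doubly linked, LSN-ordered list. */
struct demo_item {
    long lsn;
    struct demo_item *prev, *next;
};

struct demo_ail {
    struct demo_item head;  /* sentinel; head.next has the lowest LSN */
    long steps;             /* items examined while searching */
};

/* Remembers the last insertion point so later inserts skip the walk. */
struct demo_cursor {
    struct demo_item *pos;  /* insert after this item; NULL = not set yet */
};

void demo_ail_init(struct demo_ail *ail)
{
    ail->head.lsn = -1;
    ail->head.prev = ail->head.next = &ail->head;
    ail->steps = 0;
}

static void demo_insert_after(struct demo_item *pos, struct demo_item *item)
{
    item->prev = pos;
    item->next = pos->next;
    pos->next->prev = item;
    pos->next = item;
}

/* Walk backwards from the tail to the last item with lsn <= the target. */
static struct demo_item *demo_find_pos(struct demo_ail *ail, long lsn)
{
    struct demo_item *p = ail->head.prev;

    while (p != &ail->head && p->lsn > lsn) {
        ail->steps++;
        p = p->prev;
    }
    return p;
}

/* Insert item; with a cursor, only the first insert pays for the walk. */
void demo_ail_insert(struct demo_ail *ail, struct demo_cursor *cur,
                     struct demo_item *item)
{
    struct demo_item *pos;

    assert(ail != NULL && item != NULL);
    if (cur != NULL && cur->pos != NULL)
        pos = cur->pos;                       /* reuse remembered spot */
    else
        pos = demo_find_pos(ail, item->lsn);  /* full walk from the tail */
    demo_insert_after(pos, item);
    if (cur != NULL)
        cur->pos = item;                      /* next item goes right after */
}
```

With 100 items already sitting at a higher LSN, five cursor-less insertions at a lower LSN each repeat the 100-item walk, while five insertions sharing one cursor walk the list only once.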
2011-07-20	xfs: failure mapping nfs fh to inode should return ESTALE	J. Bruce Fields	1	-2/+2
	On xfs exports, nfsd is incorrectly returning ENOENT instead of ESTALE on attempts to use a filehandle of a deleted file (spotted with pynfs test PUTFH3). The ENOENT was coming from xfs_iget. (It's tempting to wonder whether we should just map all xfs_iget errors to ESTALE, but I don't believe so--xfs_iget can also return ENOMEM at least, which we wouldn't want mapped to ESTALE.) While we're at it, the other return of ENOENT in xfs_nfs_get_inode() also looks wrong. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
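The selective mapping argued for above (translate ENOENT, leave errors such as ENOMEM alone) can be sketched as a tiny helper. The function name is hypothetical and the sketch uses positive errno values for simplicity; it is not the code from xfs_nfs_get_inode().

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: translate the result of a filehandle-to-inode
 * lookup. Only "no such entry" becomes "stale handle"; everything else
 * (including 0 for success and ENOMEM) is passed through unchanged.
 */
int demo_map_fh_lookup_error(int err)
{
    assert(err >= 0);  /* sketch uses positive errno values */
    return err == ENOENT ? ESTALE : err;
}
```

Mapping every error to ESTALE would be simpler, but it would hide transient failures such as ENOMEM behind an error that tells the client its handle is permanently invalid.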
2011-07-20	xfs: Remove the second parameter to xfs_sb_count()	Chandra Seetharaman	3	-12/+7
	Remove the second parameter to xfs_sb_count() since all callers of the function set them. Also, fix the header comment regarding it being called periodically. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-20	superblock: move pin_sb_for_writeback() to fs/super.c	Dave Chinner	3	-27/+35
	The per-sb shrinker has the same requirement as the writeback threads of ensuring that the superblock is usable and pinned for the time it takes to run the work. Both need to take a passive reference to the sb, take a read lock on the s_umount lock and then only continue if an unmount is not in progress. pin_sb_for_writeback() does this exactly, so move it to fs/super.c, rename it to grab_super_passive() and export it via fs/internal.h for all the VFS code to be able to use. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
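The three-step sequence described above (passive reference, read lock, unmount check) can be modelled in single-threaded userspace C. This is a sketch under assumptions: the struct fields and function names are hypothetical, the lock is a plain counter rather than a real rwsem, and the use of a non-blocking "trylock" is an assumption of the example rather than something the message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical superblock model; a real one uses proper locking. */
struct demo_sb {
    int  passive_refs;  /* keeps the structure itself alive */
    int  readers;       /* holders of the umount lock in read mode */
    bool writer;        /* true while unmount holds the lock exclusively */
    bool has_root;      /* false once unmount has torn the fs down */
};

/* Non-blocking read lock: fails instead of waiting for the writer. */
static bool demo_read_trylock(struct demo_sb *sb)
{
    if (sb->writer)
        return false;
    sb->readers++;
    return true;
}

/* Returns true with a reference and read lock held, false otherwise. */
bool demo_grab_passive(struct demo_sb *sb)
{
    sb->passive_refs++;              /* step 1: passive reference */
    if (demo_read_trylock(sb)) {     /* step 2: read lock */
        if (sb->has_root)            /* step 3: not being unmounted */
            return true;
        sb->readers--;
    }
    sb->passive_refs--;              /* undo step 1 on any failure */
    return false;
}

/* Release what a successful demo_grab_passive() acquired. */
void demo_drop_passive(struct demo_sb *sb)
{
    assert(sb->readers > 0 && sb->passive_refs > 0);
    sb->readers--;
    sb->passive_refs--;
}
```

On failure the helper leaves the counts exactly as it found them, so callers such as a shrinker can simply skip that superblock.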
2011-07-20	inode: move to per-sb LRU locks	Dave Chinner	2	-14/+14
	With the inode LRUs moving to per-sb structures, there is no longer a need for a global inode_lru_lock. The locking can be made more fine-grained by moving to a per-sb LRU lock, isolating the LRU operations of different filesystems completely from each other. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	inode: Make unused inode LRU per superblock	Dave Chinner	2	-11/+81
	The inode unused list is currently a global LRU. This does not match the other global filesystem cache - the dentry cache - which uses per-superblock LRU lists. Hence we have related filesystem object types using different LRU reclamation schemes. To enable a per-superblock filesystem cache shrinker, both of these caches need to have per-sb unused object LRU lists. Hence this patch converts the global inode LRU to per-sb LRUs. The patch only does rudimentary per-sb proportioning in the shrinker infrastructure, as this gets removed when the per-sb shrinker callouts are introduced later on. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	inode: convert inode_stat.nr_unused to per-cpu counters	Dave Chinner	1	-5/+11
	Before we split up the inode_lru_lock, the unused inode counter needs to be made independent of the global inode_lru_lock. Convert it to per-cpu counters to do this. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
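Why a per-cpu counter removes the dependency on a shared lock can be sketched with a plain array of slots. This is a userspace illustration only: the fixed CPU count, the type and the function names are hypothetical, and the caller passes the CPU index explicitly instead of the counter discovering it.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4  /* assumption: fixed, small CPU count */

/* Hypothetical per-cpu counter: one slot per CPU, no shared lock. */
struct demo_percpu_counter {
    long slot[DEMO_NR_CPUS];
};

/* Update only the local CPU's slot; other CPUs are never touched. */
void demo_counter_add(struct demo_percpu_counter *c, int cpu, long delta)
{
    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    c->slot[cpu] += delta;
}

/*
 * Read the total by summing every slot. Individual slots may be
 * negative (an object counted on one CPU and uncounted on another),
 * so the sum is clamped at zero before it is reported.
 */
long demo_counter_sum(const struct demo_percpu_counter *c)
{
    long sum = 0;

    for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
        sum += c->slot[cpu];
    return sum < 0 ? 0 : sum;
}
```

Updates become cheap and lock-free at the price of a slower, approximate read, which suits a statistic that is written far more often than it is read.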
2011-07-20	make d_splice_alias(ERR_PTR(err), dentry) = ERR_PTR(err)	Al Viro	15	-94/+39
	... and simplify the living hell out of callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
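How error pass-through simplifies callers can be sketched with userspace analogues of the error-pointer helpers. Everything prefixed demo_ is hypothetical and written for this example; the splice function only models the "error in, same error out" rule from the subject line, not the real dentry handling.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Userspace analogues of error-pointer helpers (names hypothetical). */
static inline void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_inode  { int ino; };
struct demo_dentry { struct demo_inode *inode; };

/*
 * Model of the rule: an error pointer passed in comes straight back
 * out; otherwise the inode (possibly NULL) is attached to the dentry.
 */
struct demo_dentry *demo_splice(struct demo_inode *inode,
                                struct demo_dentry *dentry)
{
    assert(dentry != NULL);
    if (demo_is_err(inode))
        return demo_err_ptr(demo_ptr_err(inode));
    dentry->inode = inode;
    return NULL;
}

enum demo_outcome { DEMO_FOUND, DEMO_ABSENT, DEMO_FAILED };

static struct demo_inode demo_the_inode = { 42 };

/* Hypothetical inode fetch that can succeed, find nothing, or fail. */
static struct demo_inode *demo_fetch(enum demo_outcome o)
{
    switch (o) {
    case DEMO_FOUND:  return &demo_the_inode;
    case DEMO_ABSENT: return NULL;
    default:          return demo_err_ptr(-EIO);
    }
}

/* The caller needs no IS_ERR branch of its own any more. */
struct demo_dentry *demo_lookup(struct demo_dentry *dentry,
                                enum demo_outcome o)
{
    return demo_splice(demo_fetch(o), dentry);
}
```

The lookup function collapses to a single line because all three outcomes (found, absent, failed) are handled inside the splice helper.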
2011-07-20	deuglify squashfs_lookup()	Al Viro	1	-4/+1
	d_splice_alias(NULL, dentry) is equivalent to d_add(dentry, NULL), NULL so no need for that if (inode) ... in there (or ERR_PTR(0), for that matter) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	nfsd4_list_rec_dir(): don't bother with reopening rec_file	Al Viro	1	-31/+21
	just rewind it to the beginning before vfs_readdir() and be done with that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	kill useless checks for sb->s_op == NULL	Al Viro	2	-2/+1
	never is... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	btrfs: kill magical embedded struct superblock	Al Viro	4	-22/+29
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	get rid of pointless checks for dentry->sb == NULL	Al Viro	1	-1/+0
	it never is... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	Make ->d_sb assign-once and always non-NULL	Al Viro	3	-39/+47
	New helper (non-exported, fs/internal.h-only): __d_alloc(sb, name). Allocates dentry, sets its ->d_sb to given superblock and sets ->d_op accordingly. Old d_alloc(NULL, name) callers are converted to that (all of them know what superblock they want). d_alloc() itself is left only for parent != NULL case; uses __d_alloc(), inserts result into the list of parent's children. Note that now ->d_sb is assign-once and never NULL and ->d_parent is never NULL either. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	unexport kern_path_parent()	Al Viro	1	-1/+0
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	switch vfs_path_lookup() to struct path	Al Viro	3	-21/+20
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	kill lookup_create()	Al Viro	1	-36/+18
	folded into the only caller (kern_path_create()) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	new helpers: kern_path_create/user_path_create	Al Viro	2	-116/+87
	combination of kern_path_parent() and lookup_create(). Does not expose struct nameidata to caller. Syscalls converted to that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	kill LOOKUP_CONTINUE	Al Viro	1	-8/+3
	LOOKUP_PARENT is equivalent to it now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	nfs: LOOKUP_{OPEN,CREATE,EXCL} is set only on the last step	Al Viro	1	-4/+2
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	cifs_lookup(): LOOKUP_OPEN is set only on the last component	Al Viro	1	-1/+1
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	ceph: LOOKUP_OPEN is set only when it's the last component	Al Viro	1	-1/+0
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	jfs_ci_revalidate() is safe from RCU mode	Al Viro	1	-2/+0
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	LOOKUP_CREATE and LOOKUP_RENAME_TARGET can be set only on the last step	Al Viro	3	-12/+6
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	no need to check for LOOKUP_OPEN in ->create() instances	Al Viro	5	-10/+10
	... it will be set in nd->flags for all cases with non-NULL nd (i.e. when called from do_last()). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	don't pass nameidata to vfs_create() from ecryptfs_create()	Al Viro	1	-28/+5
	Instead of playing with removal of LOOKUP_OPEN, mangling (and restoring) nd->path, just pass NULL to vfs_create(). The whole point of what's being done there is to suppress any attempts to open file by underlying fs, which is what nd == NULL indicates. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	don't transliterate lower bits of ->intent.open.flags to FMODE_...	Al Viro	6	-30/+23
	->create() instances are much happier that way... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	Don't pass nameidata when calling vfs_create() from mknod()	Al Viro	1	-1/+1
	All instances can cope with that now (and ceph one actually starts working properly). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	fix mknod() on nfs4 (hopefully)	Al Viro	1	-12/+12
	a) check the right flags in ->create() (LOOKUP_OPEN, not LOOKUP_CREATE) b) default (!LOOKUP_OPEN) open_flags is O_CREAT|O_EXCL|FMODE_READ, not 0 c) lookup_instantiate_filp() should be done only with LOOKUP_OPEN; otherwise we need to issue CLOSE, lest we leak stateid on server. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	nameidata_to_nfs_open_context() doesn't need nameidata, actually...	Al Viro	1	-6/+7
	just open flags; switched to passing just those and renamed to create_nfs_open_context() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	nfs_open_context doesn't need struct path either	Al Viro	7	-42/+40
	just dentry, please... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	nfs4_opendata doesn't need struct path either	Al Viro	1	-23/+22
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	nfs4_closedata doesn't need to mess with struct path	Al Viro	3	-22/+21
	instead of path_get()/path_put(), we can just use nfs_sb_{,de}active() to pin the superblock down. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	cifs: fix the type of cifs_demultiplex_thread()	Al Viro	1	-2/+3
	... and get rid of a bogus typecast, while we are at it; it's not just that we want a function returning int and not void, but cast to pointer to function taking void * and returning void would be (void (*)(void *)) and not (void *)(void *), TYVM... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
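The point about function types can be sketched in standalone C: when a thread-style entry point is declared with the signature its caller expects (a function taking void * and returning int), no cast is needed at all. The typedef and function names below are hypothetical and only illustrate the type matching; for reference, a pointer to a function taking void * and returning void is spelled void (*)(void *) in C.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer to a function taking void * and returning int. */
typedef int (*demo_thread_fn)(void *);

/* Entry point declared with the matching type, so no cast is needed. */
int demo_worker(void *arg)
{
    int *counter = arg;

    assert(counter != NULL);
    (*counter)++;
    return 0;
}

/* Stand-in for an API that starts a thread: it just calls fn(arg). */
int demo_run(demo_thread_fn fn, void *arg)
{
    return fn(arg);
}
```

Passing demo_worker to demo_run compiles without any cast, and a mismatch in the return type would now be diagnosed by the compiler instead of being hidden.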
2011-07-20	ecryptfs_inode_permission() doesn't need to bail out on RCU	Al Viro	1	-2/+0
	... now that inode_permission() can take MAY_NOT_BLOCK and handle it properly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	merge do_revalidate() into its only caller	Al Viro	1	-24/+18
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	no reason to keep exec_permission() separate now	Al Viro	1	-41/+4
	cache footprint alone makes it a bad idea... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	massage generic_permission() to treat directories on a separate path	Al Viro	1	-4/+13
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	->permission() sanitizing: don't pass flags to exec_permission()	Al Viro	1	-10/+7
	pass mask instead; kill security_inode_exec_permission() since we can use security_inode_permission() instead. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	->permission() sanitizing: don't pass flags to ->permission()	Al Viro	27	-50/+50
	not used by the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	->permission() sanitizing: don't pass flags to generic_permission()	Al Viro	15	-18/+17
	redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of them removes that bit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
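The reason the separate flags argument is redundant can be sketched with a permission check that reads the "must not block" request from the mask itself. This is a userspace illustration under assumptions: the DEMO_ constants, the function name, the simplified three-bit mode and the choice of -ECHILD as the "cannot decide without blocking" result are all choices made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mask bits; the non-blocking request travels in the mask. */
#define DEMO_MAY_EXEC      0x01
#define DEMO_MAY_WRITE     0x02
#define DEMO_MAY_READ      0x04
#define DEMO_MAY_NOT_BLOCK 0x80

/*
 * mode_bits: simplified rwx bits (4 = read, 2 = write, 1 = exec).
 * mask: requested access, optionally ORed with DEMO_MAY_NOT_BLOCK.
 * check_would_block: models a check that would have to sleep.
 */
int demo_permission(unsigned int mode_bits, int mask, bool check_would_block)
{
    int want = mask & (DEMO_MAY_READ | DEMO_MAY_WRITE | DEMO_MAY_EXEC);

    assert((mode_bits & ~7u) == 0);
    /* No separate flags argument: the mask already says "don't block". */
    if ((mask & DEMO_MAY_NOT_BLOCK) && check_would_block)
        return -ECHILD;
    return ((int)mode_bits & want) == want ? 0 : -EACCES;
}
```

Because the bit rides along in the mask, every function in the call chain sees it without an extra parameter being threaded through each signature.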
2011-07-20	->permission() sanitizing: don't pass flags to ->check_acl()	Al Viro	23	-27/+27
	not used in the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20	->permission() sanitizing: pass MAY_NOT_BLOCK to ->check_acl()	Al Viro	13	-15/+14
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>