summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2013-02-27idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.cTejun Heo1-1/+1
MAX_IDR_MASK is another weirdness in the idr interface. As idr covers whole positive integer range, it's defined as 0x7fffffff or INT_MAX. Its usage in idr_find(), idr_replace() and idr_remove() is bizarre. They basically mask off the sign bit and operate on the rest, so if the caller, by accident, passes in a negative number, the sign bit will be masked off and the remaining part will be used as if that was the input, which is worse than crashing. The constant is visible in idr.h and there are several users in the kernel. * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter() Basically used to test if adap->nr is a negative number which isn't -1 and returns -EINVAL if so. idr_alloc() already has negative @start checking (w/ WARN_ON_ONCE), so this can go away. * drivers/infiniband/core/cm.c:cm_alloc_id() drivers/infiniband/hw/mlx4/cm.c:id_map_alloc() Used to wrap cyclic @start. Can be replaced with max(next, 0). Note that this type of cyclic allocation using idr is buggy. These are prone to spurious -ENOSPC failure after the first wraparound. * fs/super.c:get_anon_bdev() The ID allocated from ida is masked off before being tested whether it's inside valid range. ida allocated ID can never be a negative number and the masking is unnecessary. Update idr_*() functions to fail with -EINVAL when negative @id is specified and update other MAX_IDR_MASK users as described above. This leaves MAX_IDR_MASK without any user, remove it and relocate other MAX_IDR_* constants to lib/idr.c. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jean Delvare <khali@linux-fr.org> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com> Cc: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Wolfram Sang <wolfram@the-dreams.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27nfs4client: convert to idr_alloc()Tejun Heo1-7/+6
Convert to the much saner new idr interface. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27ocfs2: convert to idr_alloc()Tejun Heo1-19/+13
Convert to the much saner new idr interface. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27inotify: convert to idr_alloc()Tejun Heo1-13/+11
Convert to the much saner new idr interface. Note that the adhoc cyclic id allocation is buggy. If wraparound happens, the previous code with idr_get_new_above() may segfault and the converted code will trigger WARN and return -EINVAL. Even if it's fixed to wrap to zero, the code will be prone to unnecessary -ENOSPC failures after the first wraparound. We probably need to implement proper cyclic support in idr. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: John McCutchan <john@johnmccutchan.com> Cc: Robert Love <rlove@rlove.org> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27dlm: convert to idr_alloc()Tejun Heo2-26/+19
Convert to the much saner new idr interface. Error return values from recover_idr_add() mix -1 and -errno. The conversion doesn't change that but it looks iffy. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27inotify: don't use idr_remove_all()Tejun Heo1-1/+0
idr_destroy() can destroy idr by itself and idr_remove_all() is being deprecated. Drop its usage. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: John McCutchan <john@johnmccutchan.com> Cc: Robert Love <rlove@rlove.org> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27nfs: idr_destroy() no longer needs idr_remove_all()Tejun Heo1-1/+0
idr_destroy() can destroy idr by itself and idr_remove_all() is being deprecated. Drop reference to idr_remove_all(). Note that the code wasn't completely correct before because idr_remove() on all entries doesn't necessarily release all idr_layers which could lead to memory leak. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27dlm: don't use idr_remove_all()Tejun Heo2-2/+1
idr_destroy() can destroy idr by itself and idr_remove_all() is being deprecated. The conversion isn't completely trivial for recover_idr_clear() as it's the only place in kernel which makes legitimate use of idr_remove_all() w/o idr_destroy(). Replace it with idr_remove() call inside idr_for_each_entry() loop. It goes on top so that it matches the operation order in recover_idr_del(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christine Caulfield <ccaulfie@redhat.com> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27dlm: use idr_for_each_entry() in recover_idr_clear() error pathTejun Heo1-13/+10
Convert recover_idr_clear() to use idr_for_each_entry() instead of idr_for_each(). It's somewhat less efficient this way but it shouldn't matter in an error path. This is to help with deprecation of idr_remove_all(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christine Caulfield <ccaulfie@redhat.com> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27fs/seq_file.c:seq_lseek(): fix switch statement indentingAndrew Morton1-20/+20
[akpm@linux-foundation.org: checkpatch fixes] Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27seq-file: use SEEK_ macros instead of hardcoded numbersCyrill Gorcunov1-2/+2
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27fs/proc/vmcore.c: put if tests in the top of the while loop to reduce ↵Zhang Yanfei1-13/+7
duplication In read_vmcore() two `if' tests are duplicated. Change the position of them could reduce the duplication. This change does not affect the behaviour of the function. [akpm@linux-foundation.org: avoid `if (foo = bar)' thing, use min_t()] [akpm@linux-foundation.org: s/max_t/min_t/] Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27fs/proc: clean up printksAndrew Morton7-46/+39
- use pr_foo() throughout - remove a couple of duplicated KERN_WARNINGs, via WARN(KERN_WARNING "...") - nuke a few warnings which I've never seen happen, ever. Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27coredump: remove redundant defines for dumpable statesKees Cook3-7/+8
The existing SUID_DUMP_* defines duplicate the newer SUID_DUMPABLE_* defines introduced in 54b501992dd2 ("coredump: warn about unsafe suid_dumpable / core_pattern combo"). Remove the new ones, and use the prior values instead. Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Chen Gang <gang.chen@asianux.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alan Cox <alan@linux.intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: James Morris <james.l.morris@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27fat: mark fs as dirty on mount and clean on umountOleksij Rempel2-0/+68
There is no documented methods to mark FAT as dirty. Unofficially MS started to use reserved Byte in boot sector for this purpose, at least since Win 2000. With Win 7 user is warned if fs is dirty and asked to clean it. Different versions of Win, handle it in different ways, but always have same meaning: - Win 2000 and XP, set it on write operations and remove it after operation was finnished - Win 7, set dirty flag on first write and remove it on umount. We will do it as follows: - set dirty flag on mount. If fs was initially dirty, warn user, remember it and do not do any changes to boot sector. - clean it on umount. If fs was initially dirty, leave it dirty. - do not do any thing if fs mounted read-only. - TODO: leave fs dirty if we found some error after mount. Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27fat: add extended fileds to struct fat_boot_sectorOleksij Rempel1-4/+4
Later we will need "state" field to check if volume was cleanly unmounted. Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27hfsplus: fix issue with unzeroed unused b-tree nodesVyacheslav Dubeyko1-0/+2
The fsck_hfs (under MacOS X) complains about unzeroed unused b-tree nodes after deletion of folders' tree under Linux. SYMPTOMS: Running Disk Utiltiy's "Verify Disk" on "test" gives the following: Verifying volume “Test” Checking file systemChecking Journaled HFS Plus volume. Checking extents overflow file. Checking catalog file. Unused node is not erased (node = 3111) Checking multi-linked files. Checking catalog hierarchy. Checking extended attributes file. Checking volume bitmap. Checking volume information. The volume Test was found corrupt and needs to be repaired. Error: This disk needs to be repaired. Click Repair Disk. REPRODUCING PATH: 1. Prepare HFS+ (non-case sensitive) partition (for example, 5GB) under MacOS X. 2. Copy linux kernel source tree (for example, 3.7-rc6 version) on this partition under MacOS X. 3. Then switch to Linux and mount this prepared partition. 4. Execute `sudo rm -r` under prepared directory with linux kernel source tree. 5. Unmount and boot back into OS X. 6. Open up Disk Utility and verify partition. REPRODUCIBILITY: 100% FIX: It is added code of node clearing in hfs_bnode_put() method for the case when node has flag HFS_BNODE_DELETED. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Kyle Laracey <kalaracey@gmail.com> Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27hfsplus: add support of manipulation by attributes fileVyacheslav Dubeyko13-183/+287
Add support of manipulation by attributes file. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27hfsplus: rework functionality of getting, setting and deleting of extended ↵Vyacheslav Dubeyko5-0/+999
attributes Rework functionality of getting, setting and deleting of extended attributes. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27hfsplus: add functionality of manipulating by records in attributes treeVyacheslav Dubeyko1-0/+399
Add functionality of manipulating by records in attributes tree. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27hfsplus: add on-disk layout declarations related to attributes treeVyacheslav Dubeyko1-2/+66
Add all necessary on-disk layout declarations related to attributes file. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27ocfs2: ac->ac_allow_chain_relink=0 won't disable group relinkXiaowei.Hu2-5/+4
ocfs2_block_group_alloc_discontig() disables chain relink by setting ac->ac_allow_chain_relink = 0 because it grabs clusters from multiple cluster groups. It doesn't keep the credits for all chain relink,but ocfs2_claim_suballoc_bits overrides this in this call trace: ocfs2_block_group_claim_bits()->ocfs2_claim_clusters()-> __ocfs2_claim_clusters()->ocfs2_claim_suballoc_bits() ocfs2_claim_suballoc_bits set ac->ac_allow_chain_relink = 1; then call ocfs2_search_chain() one time and disable it again, and then we run out of credits. Fix is to allow relink by default and disable it in ocfs2_block_group_alloc_discontig. Without this patch, End-users will run into a crash due to run out of credits, backtrace like this: RIP: 0010:[<ffffffffa0808b14>] [<ffffffffa0808b14>] jbd2_journal_dirty_metadata+0x164/0x170 [jbd2] RSP: 0018:ffff8801b919b5b8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88022139ddc0 RCX: ffff880159f652d0 RDX: ffff880178aa3000 RSI: ffff880159f652d0 RDI: ffff880087f09bf8 RBP: ffff8801b919b5e8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000001e00 R11: 00000000000150b0 R12: ffff880159f652d0 R13: ffff8801a0cae908 R14: ffff880087f09bf8 R15: ffff88018d177800 FS: 00007fc9b0b6b6e0(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000000040819c CR3: 0000000184017000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process dd (pid: 9945, threadinfo ffff8801b919a000, task ffff880149a264c0) Call Trace: ocfs2_journal_dirty+0x2f/0x70 [ocfs2] ocfs2_relink_block_group+0x111/0x480 [ocfs2] ocfs2_search_chain+0x455/0x9a0 [ocfs2] ... Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27ocfs2: fix ocfs2_init_security_and_acl() to initialize acl correctlyJeff Liu1-1/+1
We need to re-initialize the security for a new reflinked inode with its parent dirs if it isn't specified to be preserved for ocfs2_reflink(). However, the code logic is broken at ocfs2_init_security_and_acl() although ocfs2_init_security_get() succeed. As a result, ocfs2_acl_init() does not involked and therefore the default ACL of parent dir was missing on the new inode. Note this was introduced by 9d8f13ba3 ("security: new security_inode_init_security API adds function callback") To reproduce: set default ACL for the parent dir(ocfs2 in this case): $ setfacl -m default:user:jeff:rwx ../ocfs2/ $ getfacl ../ocfs2/ # file: ../ocfs2/ # owner: jeff # group: jeff user::rwx group::r-x other::r-x default:user::rwx default:user:jeff:rwx default:group::r-x default:mask::rwx default:other::r-x $ touch a $ getfacl a # file: a # owner: jeff # group: jeff user::rw- group::rw- other::r-- Before patching, create reflink file b from a, the user default ACL entry(user:jeff:rwx)was missing: $ ./ocfs2_reflink a b $ getfacl b # file: b # owner: jeff # group: jeff user::rw- group::rw- other::r-- In this case, the end user can also observed an error message at syslog: (ocfs2_reflink,3229,2):ocfs2_init_security_and_acl:7193 ERROR: status = 0 After applying this patch, create reflink file c from a: $ ./ocfs2_reflink a c $ getfacl c # file: c # owner: jeff # group: jeff user::rw- user:jeff:rwx #effective:rw- group::r-x #effective:r-- mask::rw- other::r-- Test program: /* Usage: reflink <source> <dest> */ #include <stdio.h> #include <stdint.h> #include <stdbool.h> #include <string.h> #include <errno.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <sys/ioctl.h> static int reflink_file(char const *src_name, char const *dst_name, bool preserve_attrs) { int fd; #ifndef REFLINK_ATTR_NONE # define REFLINK_ATTR_NONE 0 #endif #ifndef REFLINK_ATTR_PRESERVE # define REFLINK_ATTR_PRESERVE 1 #endif #ifndef OCFS2_IOC_REFLINK struct reflink_arguments { uint64_t old_path; uint64_t new_path; uint64_t preserve; }; # define OCFS2_IOC_REFLINK _IOW ('o', 4, struct reflink_arguments) #endif struct reflink_arguments args = { .old_path = (unsigned long) src_name, .new_path = (unsigned long) dst_name, .preserve = preserve_attrs ? REFLINK_ATTR_PRESERVE : REFLINK_ATTR_NONE, }; fd = open(src_name, O_RDONLY); if (fd < 0) { fprintf(stderr, "Failed to open %s: %s\n", src_name, strerror(errno)); return -1; } if (ioctl(fd, OCFS2_IOC_REFLINK, &args) < 0) { fprintf(stderr, "Failed to reflink %s to %s: %s\n", src_name, dst_name, strerror(errno)); return -1; } } int main(int argc, char *argv[]) { if (argc != 3) { fprintf(stdout, "Usage: %s source dest\n", argv[0]); return 1; } return reflink_file(argv[1], argv[2], 0); } Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Tao Ma <boyu.mt@taobao.com> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-26Merge branch 'for-linus' of ↵Linus Torvalds199-775/+793
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
2013-02-26Merge tag 'ext4_for_linus' of ↵Linus Torvalds31-1622/+2105
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Theodore Ts'o: "The one new feature added in this patch series is the ability to use the "punch hole" functionality for inodes that are not using extent maps. In the bug fix category, we fixed some races in the AIO and fstrim code, and some potential NULL pointer dereferences and memory leaks in error handling code paths. In the optimization category, we fixed a performance regression in the jbd2 layer introduced by commit d9b01934d56a ("jbd: fix fsync() tid wraparound bug", introduced in v3.0) which shows up in the AIM7 benchmark. We also further optimized jbd2 by minimize the amount of time that transaction handles are held active. This patch series also features some additional enhancement of the extent status tree, which is now used to cache extent information in a more efficient/compact form than what we use on-disk." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (65 commits) ext4: fix free clusters calculation in bigalloc filesystem ext4: no need to remove extent if len is 0 in ext4_es_remove_extent() ext4: fix xattr block allocation/release with bigalloc ext4: reclaim extents from extent status tree ext4: adjust some functions for reclaiming extents from extent status tree ext4: remove single extent cache ext4: lookup block mapping in extent status tree ext4: track all extent status in extent status tree ext4: let ext4_ext_map_blocks return EXT4_MAP_UNWRITTEN flag ext4: rename and improbe ext4_es_find_extent() ext4: add physical block and status member into extent status tree ext4: refine extent status tree ext4: use ERR_PTR() abstraction for ext4_append() ext4: refactor code to read directory blocks into ext4_read_dirblock() ext4: add debugging context for warning in ext4_da_update_reserve_space() ext4: use KERN_WARNING for warning messages jbd2: use module parameters instead of debugfs for jbd_debug ext4: use module parameters instead of debugfs for mballoc_debug ext4: start handle at the last possible moment when creating inodes ext4: fix the number of credits needed for acl ops with inline data ...
2013-02-26Merge branch 'for_linus' of ↵Linus Torvalds16-78/+177
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2, ext3, udf updates from Jan Kara: "Several UDF fixes, a support for UDF extent cache, and couple of ext2 and ext3 cleanups and minor fixes" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: Ext2: remove the static function release_blocks to optimize the kernel Ext2: mark inode dirty after the function dquot_free_block_nodirty is called Ext2: remove the overhead check about sb in the function ext2_new_blocks udf: Remove unused s_extLength from udf_bitmap udf: Make s_block_bitmap standard array udf: Fix bitmap overflow on large filesystems with small block size udf: add extent cache support in case of file reading udf: Write LVID to disk after opening / closing Ext3: return ENOMEM rather than EIO if sb_getblk fails Ext2: return ENOMEM rather than EIO if sb_getblk fails Ext3: use unlikely to improve the efficiency of the kernel Ext2: use unlikely to improve the efficiency of the kernel Ext3: add necessary check in case IO error happens Ext2: free memory allocated and forget buffer head when io error happens ext3: Fix memory leak when quota options are specified multiple times ext3, ext4, ocfs2: remove unused macro NAMEI_RA_INDEX
2013-02-26Merge tag 'upstream-3.9-rc1' of git://git.infradead.org/linux-ubifsLinus Torvalds5-15/+27
Pull ubifs updates from Artem Bityutskiy: "It's been quite silent and we have only a couple of bug-fixes for the orphans handling code plus one cosmetic change." * tag 'upstream-3.9-rc1' of git://git.infradead.org/linux-ubifs: UBIFS: fix double free of ubifs_orphan objects UBIFS: fix use of freed ubifs_orphan objects UBIFS: rename random32() to prandom_u32()
2013-02-26Merge tag 'f2fs-for-3.9' of ↵Linus Torvalds13-261/+292
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs update from Jaegeuk Kim: "[Major bug fixes] o Store device file information correctly o Fix -EIO handling with respect to power-off-recovery o Allocate blocks with global locks o Fix wrong calculation of the SSR cost [Cleanups] o Get rid of fake on-stack dentries [Enhancement] o Support (un)freeze_fs o Enhance the f2fs_gc flow o Support 32-bit binary execution on 64-bit kernel" * tag 'f2fs-for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits) f2fs: avoid build warning f2fs: add compat_ioctl to provide backward compatability f2fs: fix calculation of max. gc cost in the SSR case f2fs: clarify and enhance the f2fs_gc flow f2fs: optimize the return condition for has_not_enough_free_secs f2fs: make an accessor to get sections for particular block type f2fs: mark gc_thread as NULL when thread creation is failed f2fs: name gc task as per the block device f2fs: remove unnecessary gc option check and balance_fs f2fs: remove repeated F2FS_SET_SB_DIRT call f2fs: when check superblock failed, try to check another superblock f2fs: use F2FS_BLKSIZE to judge bloksize and page_cache_size f2fs: add device name in debugfs f2fs: stop repeated checking if cp is needed f2fs: avoid balanc_fs during evict_inode f2fs: remove the use of page_cache_release f2fs: fix typo mistake for data_version description f2fs: reorganize code for ra_node_page f2fs: avoid redundant call to has_not_enough_free_secs in f2fs_gc f2fs: add un/freeze_fs into super_operations ...
2013-02-26saner proc_get_inode() calling conventionsAl Viro2-21/+10
Make it drop the pde in *all* cases when no new reference to it is put into an inode - both when an inode had already been set up (as we were already doing) and when inode allocation has failed. Makes for simpler logics in callers... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26proc: avoid extra pde_put() in proc_fill_super()Maxim Patlasov1-6/+15
If proc_get_inode() succeeded, but d_make_root() failed, pde_put() for proc_root will be called twice: the first time due to iput() called from d_make_root() and the second time directly in the end of proc_fill_super(). Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26fs: change return values from -EACCES to -EPERMZhao Hongjiang6-16/+16
According to SUSv3: [EACCES] Permission denied. An attempt was made to access a file in a way forbidden by its file access permissions. [EPERM] Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resource. So -EPERM should be returned if capability checks fails. Strictly speaking this is an API change since the error code user sees is altered. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26fs/exec.c: make bprm_mm_init() staticYuanhan Liu1-1/+1
There is only one user of bprm_mm_init, and it's inside the same file. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26ocfs2/dlm: use GFP_ATOMIC inside a spin_lockDan Carpenter1-1/+1
My static checker complains that this is called with a spin_lock held in dlm_master_requery_handler() from dlmrecovery.c. Probably the reason we have not received any bug reports about this is that recovery is not a common operation. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26ocfs2: fix possible use-after-free with AIOJan Kara1-1/+1
Running AIO is pinning inode in memory using file reference. Once AIO is completed using aio_complete(), file reference is put and inode can be freed from memory. So we have to be sure that calling aio_complete() is the last thing we do with the inode. Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Jeff Moyer <jmoyer@redhat.com> Acked-by: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code pathSunil Mushran1-1/+1
Commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 was missing a var init. Reported-and-Tested-by: Vincent Etienne <vetienne@aprogsys.com> Signed-off-by: Sunil Mushran <sunil.mushran@gmail.com> Signed-off-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zeroAl Viro2-3/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26export kernel_write(), convert open-coded instancesAl Viro2-7/+4
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26fs: encode_fh: return FILEID_INVALID if invalid fid_typeNamjae Jeon10-19/+19
This patch is a follow up on below patch: [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63 Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Vivek Trivedi <t.vivek@samsung.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Sage Weil <sage@inktank.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26kill f_vfsmntAl Viro3-4/+4
very few users left... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry opJeff Layton7-13/+53
The following set of operations on a NFS client and server will cause server# mkdir a client# cd a server# mv a a.bak client# sleep 30 # (or whatever the dir attrcache timeout is) client# stat . stat: cannot stat `.': Stale NFS file handle Obviously, we should not be getting an ESTALE error back there since the inode still exists on the server. The problem is that the lookup code will call d_revalidate on the dentry that "." refers to, because NFS has FS_REVAL_DOT set. nfs_lookup_revalidate will see that the parent directory has changed and will try to reverify the dentry by redoing a LOOKUP. That of course fails, so the lookup code returns ESTALE. The problem here is that d_revalidate is really a bad fit for this case. What we really want to know at this point is whether the inode is still good or not, but we don't really care what name it goes by or whether the dcache is still valid. Add a new d_op->d_weak_revalidate operation and have complete_walk call that instead of d_revalidate. The intent there is to allow for a "weaker" d_revalidate that just checks to see whether the inode is still good. This is also gives us an opportunity to kill off the FS_REVAL_DOT special casing. [AV: changed method name, added note in porting, fixed confusion re having it possibly called from RCU mode (it won't be)] Cc: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26nfsd: handle vfs_getattr errors in acl protocolJ. Bruce Fields4-9/+24
We're currently ignoring errors from vfs_getattr. The correct thing to do is to do the stat in the main service procedure not in the response encoding. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26switch vfs_getattr() to struct pathAl Viro9-30/+34
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26ceph: prepopulate inodes only when request is abortedSage Weil1-2/+38
If r_aborted is true, we do not hold the dir i_mutex, and cannot touch the dcache. However, we still need to update the inodes with the state returned by the MDS. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26d_hash_and_lookup(): export, switch open-coded instancesAl Viro4-24/+18
* calling conventions change - ERR_PTR() is returned on ->d_hash() errors; NULL is just for dcache miss now. * exported, open-coded instances in ncpfs and cifs converted. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()Al Viro3-35/+33
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: split dropping the acls from v9fs_set_create_acl()Al Viro3-21/+28
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: switch v9fs_acl_chmod() from dentry to inode+fidAl Viro3-16/+13
caller has both, might as well pass them explicitly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: switch v9fs_set_acl() from dentry to fidAl Viro1-5/+12
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: lift the call of set_cached_acl() into the callers of v9fs_set_acl()Al Viro1-4/+3
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-269p: add fid-based variant of v9fs_xattr_set()Al Viro2-15/+20
... making v9fs_xattr_set() a wrapper for it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>