summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2008-09-02mm: show quicklist usage in /proc/meminfoKOSAKI Motohiro1-2/+5
Quicklists can consume several GB of memory. We should provide a means of monitoring this. After this patch is applied, /proc/meminfo will output the following: % cat /proc/meminfo MemTotal: 7715392 kB MemFree: 5401600 kB Buffers: 80384 kB Cached: 300800 kB SwapCached: 0 kB Active: 235584 kB Inactive: 262656 kB SwapTotal: 2031488 kB SwapFree: 2031488 kB Dirty: 3520 kB Writeback: 0 kB AnonPages: 117696 kB Mapped: 38528 kB Slab: 1589952 kB SReclaimable: 23104 kB SUnreclaim: 1566848 kB PageTables: 14656 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 5889152 kB Committed_AS: 393152 kB VmallocTotal: 17592177655808 kB VmallocUsed: 29056 kB VmallocChunk: 17592177626432 kB Quicklists: 130944 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 262144 kB Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Keiichiro Tokunaga <tokunaga.keiich@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-02NTFS: update homepageAdrian Bunk1-2/+2
Update the location of the NTFS homepage in several files. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-02Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2-7/+7
* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: nfsd: fix buffer overrun decoding NFSv4 acl sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transports nfsd: fix compound state allocation error handling svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet
2008-09-01nfsd: fix buffer overrun decoding NFSv4 aclJ. Bruce Fields1-1/+1
The array we kmalloc() here is not large enough. Thanks to Johann Dahm and David Richter for bug report and testing. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: David Richter <richterd@citi.umich.edu> Tested-by: Johann Dahm <jdahm@umich.edu>
2008-09-01nfsd: fix compound state allocation error handlingAndy Adamson1-6/+6
Move the cstate_alloc call so that if it fails, the response is setup to encode the NFS error. The out label now means that the nfsd4_compound_state has not been allocated. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-08-28[CIFS] Turn off Unicode during session establishment for plaintext ↵Steve French1-0/+2
authentication LANMAN session setup did not support Unicode (after session setup, unicode can still be used though). Fixes samba bug# 5319 CC: Jeff Layton <jlayton@redhat.com> CC: Stable Kernel <stable@vger.kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-28[CIFS] update cifs change logSteve French2-3/+16
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-28cifs: fix O_APPEND on directio mountsJeff Layton1-0/+4
The direct I/O write codepath for CIFS is done through cifs_user_write(). That function does not currently call generic_write_checks() so the file position isn't being properly set when the file is opened with O_APPEND. It's also not doing the other "normal" checks that should be done for a write call. The problem is currently that when you open a file with O_APPEND on a mount with the directio mount option, the file position is set to the beginning of the file. This makes any subsequent writes clobber the data in the file starting at the beginning. This seems to fix the problem in cursory testing. It is, however important to note that NFS disallows the combination of (O_DIRECT|O_APPEND). If my understanding is correct, the concern is races with multiple clients appending to a file clobbering each others' data. Since the write model for CIFS and NFS is pretty similar in this regard, CIFS is probably subject to the same sort of races. What's unclear to me is why this is a particular problem with O_DIRECT and not with buffered writes... Regardless, disallowing O_APPEND on an entire mount is probably not reasonable, so we'll probably just have to deal with it and reevaluate this flag combination when we get proper support for O_DIRECT. In the meantime this patch at least fixes the existing problem. Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: Stable Tree <stable@kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-28Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6Steve French23-194/+158
2008-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds3-11/+19
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Add destroy routine for dns_resolver [CIFS] Reorder cifs config item for better clarity [CIFS] Correct keys dependency for cifs kerberos support
2008-08-27Merge branch 'for-linus' of ↵Linus Torvalds14-97/+34
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: [PATCH] deal with the first call of ->show() generating no output [PATCH] fix ->llseek() for a bunch of directories [PATCH] fix regular readdir() and friends [PATCH] fix hpux_getdents() [PATCH] fix osf_getdirents() [PATCH] ntfs: use d_add_ci [PATCH] change d_add_ci argument ordering [PATCH] fix efs_lookup() [PATCH] proc: inode number fixlet
2008-08-27[CIFS] Fix plaintext authenticationSteve French1-0/+1
The last eight bytes of the password field were not cleared when doing lanman plaintext password authentication. This patch fixes that. I tested it with Samba by setting password encryption to no in the server's smb.conf. Other servers also can be configured to force plaintext authentication. Note that plaintexti authentication requires setting /proc/fs/cifs/SecurityFlags to 0x30030 on the client (enabling both LANMAN and also plaintext password support). Also note that LANMAN support (and thus plaintext password support) requires CONFIG_CIFS_WEAK_PW_HASH to be enabled in menuconfig. CC: Jeff Layton <jlayton@redhat.com> CC: Stable Kernel <stable@vger.kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-27[CIFS] Add destroy routine for dns_resolverJeff Layton2-1/+9
Otherwise, we're leaking the payload memory. CC: Stable Kernel <stable@vger.kernel.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-27Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2-20/+41
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: remove blk_queue_tag_depth() and blk_queue_tag_queue() block: remove unused ->busy part of the block queue tag map bio: fix __bio_copy_iov() handling of bio->bv_len bio: fix bio_copy_kern() handling of bio->bv_len block: submit_bh() inadvertently discards barrier flag on a sync write block: clean up cmdfilter sysfs interface block: rename blk_scsi_cmd_filter to blk_cmd_filter sg: restore command permission for TYPE_SCANNER block: move cmdfilter from gendisk to request_queue
2008-08-27Merge branch 'upstream-linus' of ↵Linus Torvalds7-77/+83
git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: Increment the reference count of an already-active stack. [PATCH] configfs: Consolidate locking around configfs_detach_prep() in configfs_rmdir() ocfs2: correctly set i_blocks after inline dir gets expanded ocfs2: Jump to correct label in ocfs2_expand_inline_dir() ocfs2: Fix sleep-with-spinlock recovery regression [PATCH] ocfs2/cluster/netdebug.c: fix warning [PATCH] ocfs2/cluster/tcp.c: make some functions static
2008-08-27bio: fix __bio_copy_iov() handling of bio->bv_lenFUJITA Tomonori1-5/+5
The commit c5dec1c3034f1ae3503efbf641ff3b0273b64797 introduced __bio_copy_iov() to add bounce support to blk_rq_map_user_iov. __bio_copy_iov() uses bio->bv_len to copy data for READ commands after the completion but it doesn't work with a request that partially completed. SCSI always completes a PC request as a whole but seems some don't. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-27bio: fix bio_copy_kern() handling of bio->bv_lenFUJITA Tomonori1-10/+28
The commit 68154e90c9d1492d570671ae181d9a8f8530da55 introduced bio_copy_kern() to add bounce support to blk_rq_map_kern. bio_copy_kern() uses bio->bv_len to copy data for READ commands after the completion but it doesn't work with a request that partially completed. SCSI always completes a PC request as a whole but seems some don't. This patch fixes bio_copy_kern to handle the above case. As bio_copy_user does, bio_copy_kern uses struct bio_map_data to store struct bio_vec. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Reported-by: Nix <nix@esperi.org.uk> Tested-by: Nix <nix@esperi.org.uk> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-27block: submit_bh() inadvertently discards barrier flag on a sync writeJens Axboe1-5/+8
Reported by Milan Broz <mbroz@redhat.com>, commit 18ce3751 inadvertently made submit_bh() discard the barrier bit for a WRITE_SYNC request. Fix that up. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-26[CIFS] Reorder cifs config item for better claritySteve French1-10/+10
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-26[CIFS] Correct keys dependency for cifs kerberos supportSteve French1-1/+1
Must also depend on CIFS ... Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-26Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6Steve French21-304/+508
2008-08-26[CIFS] check version in spnego upcall responseSteve French3-2/+15
Currently, we don't check the version in the SPNEGO upcall response even though one is provided. Jeff and Q have made the corresponding change to the Samba client (cifs.upcall). Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-25ocfs2: Increment the reference count of an already-active stack.Joel Becker1-3/+4
The ocfs2_stack_driver_request() function failed to increment the refcount of an already-active stack. It only did the increment on the first reference. Whoops. Signed-off-by: Joel Becker <joel.becker@oracle.com> Tested-by: Marcos Matsunaga <marcos.matsunaga@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-08-25[PATCH] deal with the first call of ->show() generating no outputAl Viro1-2/+9
seq_read() has a subtle bug - we want the first loop there to go until at least one *non-empty* record had fit entirely into buffer. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-25[PATCH] fix ->llseek() for a bunch of directoriesAl Viro6-0/+7
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-25[PATCH] fix regular readdir() and friendsAl Viro2-4/+12
Handling of -EOVERFLOW. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-25[PATCH] ntfs: use d_add_ciChristoph Hellwig1-87/+2
d_add_ci was lifted 1:1 from ntfs. Change ntfs to use the common version. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-25[PATCH] change d_add_ci argument orderingChristoph Hellwig2-2/+2
As pointed out during review d_add_ci argument order should match d_add, so switch the dentry and inode arguments. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-25[PATCH] fix efs_lookup()Al Viro1-2/+1
it needs to use d_splice_alias(), not d_add() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-25[PATCH] proc: inode number fixletAlexey Dobriyan1-0/+1
Ouch, if number taken from IDA is too big, the intent was to signal an error, not check for overflow and still do overflowing addition. One still needs 2^28 proc entries to notice this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-23removed unused #include <linux/version.h>'sAdrian Bunk2-2/+0
This patch lets the files using linux/version.h match the files that #include it. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-22[PATCH] configfs: Consolidate locking around configfs_detach_prep() in ↵Louis Rilling1-10/+7
configfs_rmdir() It appears that configfs_rmdir() can protect configfs_detach_prep() retries with less calls to {spin,mutex}_{lock,unlock}, and a cleaner code. This patch does not change any behavior, except that it removes two useless lock/unlock pairs having nothing inside to protect and providing a useless barrier. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <Joel.Becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-08-22ocfs2: correctly set i_blocks after inline dir gets expandedMark Fasheh1-1/+6
We were setting i_blocks based on allocation before the extent insert, which is wrong as the value is a calculation based on ip_clusters which gets updated as a result of the insert. This patch moves the line in question to just after the call to ocfs2_insert_extent(). Without this fix, inline directories were temporarily having an i_blocks value of zero immediately after expansion to extents. Reported-and-tested-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-08-22ocfs2: Jump to correct label in ocfs2_expand_inline_dir()Tao Ma1-2/+2
When we fail to insert extent in ocfs2_expand_inline_dir(), we should go to out_commit, not out. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-08-22ocfs2: Fix sleep-with-spinlock recovery regressionMark Fasheh1-9/+14
This fixes a bug introduced with 539d8264093560b917ee3afe4c7f74e5da09d6a5: [PATCH 2/2] ocfs2: Fix race between mount and recovery ocfs2_mark_dead_nodes() was reading journal inodes while holding the spinlock protecting our in-memory recovery state. The fix is very simple - the disk state is protected by a cluster lock that's already held, so we just move the spinlock down past the read. Reviewed-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-08-22[PATCH] ocfs2/cluster/netdebug.c: fix warningAlexander Beregalov1-13/+13
ocfs2/cluster/netdebug.c: fix warning fs/ocfs2/cluster/netdebug.c:154: warning: format '%lu' expects type 'long unsigned int', but argument 17 has type 'suseconds_t' Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-08-22[PATCH] ocfs2/cluster/tcp.c: make some functions staticAdrian Bunk2-39/+37
Commit 0f475b2abed6cbccee1da20a0bef2895eb2a0edd (ocfs2/net: Silence build warnings) made sense as far as it fixed compile warnings, but it was not required that it made the functions global. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-08-22Merge branch 'for_linus' of ↵Linus Torvalds12-238/+454
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Update documentation to remind users to update mke2fs.conf ext4: Fix small file fragmentation ext4: Initialize writeback_index to 0 when allocating a new inode ext4: make sure ext4_has_free_blocks returns 0 for ENOSPC ext4: journal credit fix for the delayed allocation's writepages() function ext4: Rework the ext4_da_writepages() function ext4: journal credits reservation fixes for DIO, fallocate ext4: journal credits reservation fixes for extent file writepage ext4: journal credits calulation cleanup and fix for non-extent writepage ext4: Fix bug where we return ENOSPC even though we have plenty of inodes ext4: don't try to resize if there are no reserved gdt blocks left ext4: Use ext4_discard_reservations instead of mballoc-specific call ext4: Fix ext4_dx_readdir hash collision handling ext4: Fix delalloc release block reservation for truncate ext4: Fix potential truncate BUG due to i_prealloc_list being non-empty ext4: Handle unwritten extent properly with delayed allocation
2008-08-20cramfs: fix named-pipe handlingAl Viro1-46/+38
After commit a97c9bf33f4612e2aed6f000f6b1d268b6814f3c (fix cramfs making duplicate entries in inode cache) in kernel 2.6.14, named-pipe on cramfs does not work properly. It seems the commit make all named-pipe on cramfs share their inode (and named-pipe buffer). Make ..._test() refuse to merge inodes with ->i_ino == 1, take inode setup back to get_cramfs_inode() and make ->drop_inode() evict ones with ->i_ino == 1 immediately. Reported-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> [2.6.14 and later] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20fix setpriority(PRIO_PGRP) thread iterator breakageKen Chen1-4/+4
When user calls sys_setpriority(PRIO_PGRP ...) on a NPTL style multi-LWP process, only the task leader of the process is affected, all other sibling LWP threads didn't receive the setting. The problem was that the iterator used in sys_setpriority() only iteartes over one task for each process, ignoring all other sibling thread. Introduce a new macro do_each_pid_thread / while_each_pid_thread to walk each thread of a process. Convert 4 call sites in {set/get}priority and ioprio_{set/get}. Signed-off-by: Ken Chen <kenchen@google.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20binfmt_misc: fix false -ENOEXEC when coupled with other binary handlersPavel Emelyanov1-2/+2
In case the binfmt_misc binary handler is registered *before* the e.g. script one (when for example being compiled as a module) the following situation may occur: 1. user launches a script, whose interpreter is a misc binary; 2. the load_misc_binary sets the misc_bang and returns -ENOEVEC, since the binary is a script; 3. the load_script_binary loads one and calls for search_binary_hander to run the interpreter; 4. the load_misc_binary is called again, but refuses to load the binary due to misc_bang bit set. The fix is to move the misc_bang setting lower - prior to the actual call to the search_binary_handler. Caused by the commit 3a2e7f47 (binfmt_misc.c: avoid potential kernel stack overflow) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Reported-by: Kirill A. Shutemov <kirill@shutemov.name> Tested-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: <stable@kernel.org> [2.6.26.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20/proc/self/maps doesn't display the real file offsetClement Calmels2-4/+4
This addresses http://bugzilla.kernel.org/show_bug.cgi?id=11318 In function show_map (file: fs/proc/task_mmu.c), if vma->vm_pgoff > 2^20 than (vma->vm_pgoff << PAGE_SIZE) is greater than 2^32 (with PAGE_SIZE equal to 4096 (i.e. 2^12). The next seq_printf use an unsigned long for the conversion of (vma->vm_pgoff << PAGE_SIZE), as a result the offset value displayed in /proc/self/maps is truncated if the page offset is greater than 2^20. A test that shows this issue: #define _GNU_SOURCE #include <sys/types.h> #include <sys/stat.h> #include <sys/mman.h> #include <stdlib.h> #include <stdio.h> #include <fcntl.h> #include <unistd.h> #include <string.h> #define PAGE_SIZE (getpagesize()) #if __i386__ # define U64_STR "%llx" #elif __x86_64 # define U64_STR "%lx" #else # error "Architecture Unsupported" #endif int main(int argc, char *argv[]) { int fd; char *addr; off64_t offset = 0x10000000; char *filename = "/dev/zero"; fd = open(filename, O_RDONLY); if (fd < 0) { perror("open"); return 1; } offset *= 0x10; printf("offset = " U64_STR "\n", offset); addr = (char*)mmap64(NULL, PAGE_SIZE, PROT_READ, MAP_PRIVATE, fd, offset); if ((void*)addr == MAP_FAILED) { perror("mmap64"); return 1; } { FILE *fmaps; char *line = NULL; size_t len = 0; ssize_t read; size_t filename_len = strlen(filename); fmaps = fopen("/proc/self/maps", "r"); if (!fmaps) { perror("fopen"); return 1; } while ((read = getline(&line, &len, fmaps)) != -1) { if ((read > filename_len + 1) && (strncmp(&line[read - filename_len - 1], filename, filename_len) == 0)) printf("%s", line); } if (line) free(line); fclose(fmaps); } close(fd); return 0; } [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Clement Calmels <cboulte@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20Merge branch 'sh/for-2.6.27' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'sh/for-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: sh: Provide a FLAT_PLAT_INIT() definition. binfmt_flat: Stub in a FLAT_PLAT_INIT(). video: export sh_mobile_lcdc panel size sh: select memchunk size using kernel cmdline sh: export sh7723 VEU as VEU2H input: migor_ts compile and detection fix sh: remove MSTPCR defines from Migo-R header file sh: Update sh7763rdp defconfig sh: Add support sh7760fb to sh7763rdp board sh: Add support sh_eth to sh7763rdp board sh: Disable 64kB hugetlbpage size when using 64kB PAGE_SIZE. sh: Don't export __{s,u}divsi3_i4i from SH-2 libgcc. fix SH7705_CACHE_32KB compilation sh: mach-x3proto: Fix up smc91x platform data.
2008-08-20vfat: fix 'sync' mount deadlock due to BKL->lock_super conversionLinus Torvalds1-7/+3
There was another FAT BKL conversion deadlock reported by Bart Trojanowski due to the BKL being used as a recursive lock by FAT, which was missed because it only triggers with 'sync' (or 'dirsync') mounts. The recursion worked for the BKL, but after the conversion to lock_super (which uses a mutex), it just deadlocks. Thanks to Bart for debugging this and testing the fix. The lock debugging information from the original report: ============================================= [ INFO: possible recursive locking detected ] 2.6.27-rc3-bisect-00448-ga7f5aaf #16 --------------------------------------------- mv/4020 is trying to acquire lock: (&type->s_lock_key#9){--..}, at: [<c01a90fe>] lock_super+0x1e/0x20 but task is already holding lock: (&type->s_lock_key#9){--..}, at: [<c01a90fe>] lock_super+0x1e/0x20 other info that might help us debug this: 3 locks held by mv/4020: #0: (&sb->s_type->i_mutex_key#9/1){--..}, at: [<c01b2336>] do_unlinkat+0x66/0x140 #1: (&sb->s_type->i_mutex_key#9){--..}, at: [<c01b0954>] vfs_unlink+0x84/0x110 #2: (&type->s_lock_key#9){--..}, at: [<c01a90fe>] lock_super+0x1e/0x20 stack backtrace: Pid: 4020, comm: mv Not tainted 2.6.27-rc3-bisect-00448-ga7f5aaf #16 [<c014e694>] validate_chain+0x984/0xea0 [<c0108d70>] ? native_sched_clock+0x0/0xf0 [<c014ee9c>] __lock_acquire+0x2ec/0x9b0 [<c014f5cf>] lock_acquire+0x6f/0x90 [<c01a90fe>] ? lock_super+0x1e/0x20 [<c044e5fd>] mutex_lock_nested+0xad/0x300 [<c01a90fe>] ? lock_super+0x1e/0x20 [<c01a90fe>] ? lock_super+0x1e/0x20 [<c01a90fe>] lock_super+0x1e/0x20 [<f8b3a700>] fat_write_inode+0x60/0x2b0 [fat] [<c0450878>] ? _spin_unlock_irqrestore+0x48/0x80 [<f8b3a953>] ? fat_sync_inode+0x3/0x20 [fat] [<f8b3a962>] fat_sync_inode+0x12/0x20 [fat] [<f8b37c7e>] fat_remove_entries+0xbe/0x120 [fat] [<f8b422ef>] vfat_unlink+0x5f/0x90 [vfat] [<f8b42290>] ? vfat_unlink+0x0/0x90 [vfat] [<c01b0968>] vfs_unlink+0x98/0x110 [<c01b2400>] do_unlinkat+0x130/0x140 [<c016a8f5>] ? audit_syscall_entry+0x105/0x150 [<c01b253b>] sys_unlinkat+0x3b/0x40 [<c01040d3>] sysenter_do_call+0x12/0x3f ======================= where the deadlock is due to the nesting of lock_super from vfat_unlink to fat_write_inode: - do_unlinkat - vfs_unlink - vfat_unlink * lock_super - fat_remove_entries - fat_sync_inode - fat_write_inode * lock_super and the fix is to simply remove the use of lock_super() in fat_write_inode. The lock_super() there had been just an automatic conversion of the kernel lock to the superblock lock, but no locking was actually needed there, since the code in fat_write_inode already protected all relevant accesses with a spinlock (sbi->inode_hash_lock to be exact). The only code inside the BKL (and thus the superblock lock) was accesses tp local variables or calls to functions that have long been SMP-safe (i.e. sb_bread, mark_buffe_dirty and brlese). Bart reports: "Looks good. I ran 10 parallel processes creating 1M files truncating them, writing to them again and then deleting them. This patch fixes the issue I ran into. Signed-off-by: Bart Trojanowski <bart@jukie.net>" Reported-and-tested-by: Bart Trojanowski <bart@jukie.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-19[CIFS] Kerberos support not considered experimental anymoreSteve French2-5/+26
Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-19[CIFS] distinguish between Kerberos and MSKerberos in upcallSteve French4-6/+14
Properly handle MSKRB5 by passing sec=mskrb5 to the upcall so that the spengo blob can be generated appropriately. Also, make decode_negTokenInit prefer whichever mechanism is first in the list. Needed for some NetApp servers, and possibly some older versions of Windows which treat the two KRB5 mechanisms differently. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-19cifs: add local server pointer to cifs_setup_sessionJeff Layton1-16/+17
cifs_setup_session references pSesInfo->server several times. That pointer shouldn't change during the life of the function so grab it once and store it in a local var. This makes the code look a little cleaner too. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-19[CIFS] reindent misindented statementIlpo Järvinen1-1/+2
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-08-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds2-0/+3
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] mount of IPC$ breaks with iget patch [CIFS] remove trailing whitespace [CIFS] if get root inode fails during mount, cleanup tree connection
2008-08-15Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6Linus Torvalds17-243/+328
* 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6: (29 commits) UBIFS: xattr bugfixes UBIFS: remove unneeded check UBIFS: few commentary fixes UBIFS: fix budgeting request alignment in xattr code UBIFS: improve arguments checking in debugging messages UBIFS: always set i_generation to 0 UBIFS: correct spelling of "thrice". UBIFS: support splice_write UBIFS: minor tweaks in commit UBIFS: reserve more space for index UBIFS: print pid in dump function UBIFS: align inode data to eight UBIFS: improve budgeting checks UBIFS: correct orphan deletion order UBIFS: fix typos in comments UBIFS: do not union creat_sqnum and del_cmtno UBIFS: optimize deletions UBIFS: increment commit number earlier UBIFS: remove another unneeded function parameter UBIFS: remove unneeded function parameter ...