summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2013-09-03Merge tag 'please-pull-pstore' of ↵Linus Torvalds5-34/+242
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull pstore changes from Tony Luck: "A big part of this is the addition of compression to the generic pstore layer so that all backends can use the pitiful amounts of storage they control more effectively. Three other small fixes/cleanups too. * tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: pstore/ram: (really) fix undefined usage of rounddown_pow_of_two pstore/ram: Read and write to the 'compressed' flag of pstore efi-pstore: Read and write to the 'compressed' flag of pstore erst: Read and write to the 'compressed' flag of pstore powerpc/pseries: Read and write to the 'compressed' flag of pstore pstore: Add file extension to pstore file if compressed pstore: Add decompression support to pstore pstore: Introduce new argument 'compressed' in the read callback pstore: Add compression support to pstore pstore/Kconfig: Select ZLIB_DEFLATE and ZLIB_INFLATE when PSTORE is selected pstore: Add new argument 'compressed' in pstore write callback powerpc/pseries: Remove (de)compression in nvram with pstore enabled pstore: d_alloc_name() doesn't return an ERR_PTR acpi/apei/erst: Add missing iounmap() on error in erst_exec_move_data()
2013-09-03Merge branch 'for-3.12' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "A lot of activities on the cgroup front. Most changes aren't visible to userland at all at this point and are laying foundation for the planned unified hierarchy. - The biggest change is decoupling the lifetime management of css (cgroup_subsys_state) from that of cgroup's. Because controllers (cpu, memory, block and so on) will need to be dynamically enabled and disabled, css which is the association point between a cgroup and a controller may come and go dynamically across the lifetime of a cgroup. Till now, css's were created when the associated cgroup was created and stayed till the cgroup got destroyed. Assumptions around this tight coupling permeated through cgroup core and controllers. These assumptions are gradually removed, which consists bulk of patches, and css destruction path is completely decoupled from cgroup destruction path. Note that decoupling of creation path is relatively easy on top of these changes and the patchset is pending for the next window. - cgroup has its own event mechanism cgroup.event_control, which is only used by memcg. It is overly complex trying to achieve high flexibility whose benefits seem dubious at best. Going forward, new events will simply generate file modified event and the existing mechanism is being made specific to memcg. This pull request contains prepatory patches for such change. - Various fixes and cleanups" Fixed up conflict in kernel/cgroup.c as per Tejun. * 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (69 commits) cgroup: fix cgroup_css() invocation in css_from_id() cgroup: make cgroup_write_event_control() use css_from_dir() instead of __d_cgrp() cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroup cgroup: implement CFTYPE_NO_PREFIX cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsys cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntax cgroup: fix cgroup_write_event_control() cgroup: fix subsystem file accesses on the root cgroup cgroup: change cgroup_from_id() to css_from_id() cgroup: use css_get() in cgroup_create() to check CSS_ROOT cpuset: remove an unncessary forward declaration cgroup: RCU protect each cgroup_subsys_state release cgroup: move subsys file removal to kill_css() cgroup: factor out kill_css() cgroup: decouple cgroup_subsys_state destruction from cgroup destruction cgroup: replace cgroup->css_kill_cnt with ->nr_css cgroup: bounce cgroup_subsys_state ref kill confirmation to a work item cgroup: move cgroup->subsys[] assignment to online_css() cgroup: reorganize css init / exit paths cgroup: add __rcu modifier to cgroup->subsys[] ...
2013-09-03Merge tag 'driver-core-3.12-rc1' of ↵Linus Torvalds9-120/+180
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core patches from Greg KH: "Here's the big driver core pull request for 3.12-rc1. Lots of tiny changes here fixing up the way sysfs attributes are created, to try to make drivers simpler, and fix a whole class race conditions with creations of device attributes after the device was announced to userspace. All the various pieces are acked by the different subsystem maintainers" * tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (119 commits) firmware loader: fix pending_fw_head list corruption drivers/base/memory.c: introduce help macro to_memory_block dynamic debug: line queries failing due to uninitialized local variable sysfs: sysfs_create_groups returns a value. debugfs: provide debugfs_create_x64() when disabled rbd: convert bus code to use bus_groups firmware: dcdbas: use binary attribute groups sysfs: add sysfs_create/remove_groups for when SYSFS is not enabled driver core: add #include <linux/sysfs.h> to core files. HID: convert bus code to use dev_groups Input: serio: convert bus code to use drv_groups Input: gameport: convert bus code to use drv_groups driver core: firmware: use __ATTR_RW() driver core: core: use DEVICE_ATTR_RO driver core: bus: use DRIVER_ATTR_WO() driver core: create write-only attribute macros for devices and drivers sysfs: create __ATTR_WO() driver-core: platform: convert bus code to use dev_groups workqueue: convert bus code to use dev_groups MEI: convert bus code to use dev_groups ...
2013-09-03Merge branch 'lockref' (locked reference counts)Linus Torvalds2-27/+80
Merge lockref infrastructure code by me and Waiman Long. I already merged some of the preparatory patches that didn't actually do any semantic changes earlier, but this merges the actual _reason_ for those preparatory patches. The "lockref" structure is a combination "spinlock and reference count" that allows optimized reference count accesses. In particular, it guarantees that the reference count will be updated AS IF the spinlock was held, but using atomic accesses that cover both the reference count and the spinlock words, we can often do the update without actually having to take the lock. This allows us to avoid the nastiest cases of spinlock contention on large machines under heavy pathname lookup loads. When updating the dentry reference counts on a large system, we'll still end up with the cache line bouncing around, but that's much less noticeable than actually having to spin waiting for the lock. * lockref: lockref: implement lockless reference count updates using cmpxchg() lockref: uninline lockref helper functions vfs: reimplement d_rcu_to_refcount() using lockref_get_or_lock() vfs: use lockref_get_not_zero() for optimistic lockless dget_parent() lockref: add 'lockref_get_or_lock() helper
2013-09-02vfs: reimplement d_rcu_to_refcount() using lockref_get_or_lock()Linus Torvalds2-27/+65
This moves __d_rcu_to_refcount() from <linux/dcache.h> into fs/namei.c and re-implements it using the lockref infrastructure instead. It also adds a lot of comments about what is actually going on, because turning a dentry that was looked up using RCU into a long-lived reference counted entry is one of the more subtle parts of the rcu walk. We also used to be _particularly_ subtle in unlazy_walk() where we re-validate both the dentry and its parent using the same sequence count. We used to do it by nesting the locks and then verifying the sequence count just once. That was silly, because nested locking is expensive, but the sequence count check is not. So this just re-validates the dentry and the parent separately, avoiding the nested locking, and making the lockref lookup possible. Acked-by: Waiman Long <waiman.long@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-02vfs: use lockref_get_not_zero() for optimistic lockless dget_parent()Waiman Long1-0/+15
A valid parent pointer is always going to have a non-zero reference count, but if we look up the parent optimistically without locking, we have to protect against the (very unlikely) race against renaming changing the parent from under us. We do that by using lockref_get_not_zero(), and then re-checking the parent pointer after getting a valid reference. [ This is a re-implementation of a chunk from the original patch by Waiman Long: "dcache: Enable lockless update of dentry's refcount". I've completely rewritten the patch-series and split it up, but I'm attributing this part to Waiman as it's close enough to his earlier patch - Linus ] Signed-off-by: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-30pstore/ram: (really) fix undefined usage of rounddown_pow_of_twoMaxime Bizon1-3/+3
Previous attempt to fix was b042e47491ba5f487601b5141a3f1d8582304170 Suggested use of is_power_of_2() was bogus because is_power_of_2(0) is false (documented behaviour). Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-08-28Merge branch 'akpm' (patches from Andrew Morton)Linus Torvalds1-1/+1
Merge fixes from Andrew Morton: "Five fixes. err, make that six. let me try again" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: fs/ocfs2/super.c: Use bigger nodestr to accomodate 32-bit node numbers memcg: check that kmem_cache has memcg_params before accessing it drivers/base/memory.c: fix show_mem_removable() to handle missing sections IPC: bugfix for msgrcv with msgtyp < 0 Omnikey Cardman 4000: pull in ioctl.h in user header timer_list: correct the iterator for timer_list
2013-08-28fs/ocfs2/super.c: Use bigger nodestr to accomodate 32-bit node numbersGoldwyn Rodrigues1-1/+1
While using pacemaker/corosync, the node numbers are generated using IP address as opposed to serial node number generation. This may not fit in a 8-byte string. Use a bigger string to print the complete node number. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-28vfs: make the dentry cache use the lockref infrastructureWaiman Long2-37/+26
This just replaces the dentry count/lock combination with the lockref structure that contains both a count and a spinlock, and does the mechanical conversion to use the lockref infrastructure. There are no semantic changes here, it's purely syntactic. The reference lockref implementation uses the spinlock exactly the same way that the old dcache code did, and the bulk of this patch is just expanding the internal "d_count" use in the dcache code to use "d_lockref.count" instead. This is purely preparation for the real change to make the reference count updates be lockless during the 3.12 merge window. [ As with the previous commit, this is a rewritten version of a concept originally from Waiman, so credit goes to him, blame for any errors goes to me. Waiman's patch had some semantic differences for taking advantage of the lockless update in dget_parent(), while this patch is intentionally a pure search-and-replace change with no semantic changes. - Linus ] Signed-off-by: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-28Revert "fs: Allow unprivileged linkat(..., AT_EMPTY_PATH) aka flink"Linus Torvalds1-3/+7
This reverts commit bb2314b47996491bbc5add73633905c3120b6268. It wasn't necessarily wrong per se, but we're still busily discussing the exact details of this all, so I'm going to revert it for now. It's true that you can already do flink() through /proc and that flink() isn't new. But as Brad Spengler points out, some secure environments do not mount proc, and flink adds a new interface that can avoid path lookup of the source for those kinds of environments. We may re-do this (and even mark it for stable backporting back in 3.11 and possibly earlier) once the whole discussion about the interface is done. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Brad Spengler <spender@grsecurity.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-26Merge tag 'jfs-3.11-rc8' of git://github.com/kleikamp/linux-shaggyLinus Torvalds1-8/+23
Pull jfs fix from Dave Kleikamp: "One JFS patch to fix an incompatibility with NFSv4 resulting in the nfs client reporting a readdir loop" * tag 'jfs-3.11-rc8' of git://github.com/kleikamp/linux-shaggy: jfs: fix readdir cookie incompatibility with NFSv4
2013-08-25Merge branch 'for-linus' of ↵Linus Torvalds6-12/+15
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Assorted fixes from the last week or so" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: VFS: collect_mounts() should return an ERR_PTR bfs: iget_locked() doesn't return an ERR_PTR efs: iget_locked() doesn't return an ERR_PTR() proc: kill the extra proc_readfd_common()->dir_emit_dots() cope with potentially long ->d_dname() output for shmem/hugetlb
2013-08-24Merge tag 'scsi-fixes' of ↵Linus Torvalds1-5/+15
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of small bug fixes for lpfc and zfcp and a fix for a fairly nasty bug in sg where a process which cancels I/O completes in a kernel thread which would then try to write back to the now gone userspace and end up writing to a random kernel address instead" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: [SCSI] zfcp: remove access control tables interface (keep sysfs files) [SCSI] zfcp: fix schedule-inside-lock in scsi_device list loops [SCSI] zfcp: fix lock imbalance by reworking request queue locking [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal [SCSI] lpfc: Don't force CONFIG_GENERIC_CSUM on
2013-08-24VFS: collect_mounts() should return an ERR_PTRDan Carpenter1-1/+1
This should actually be returning an ERR_PTR on error instead of NULL. That was how it was designed and all the callers expect it. [AV: actually, that's what "VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errors" missed - originally collect_mounts() was expected to return NULL on failure] Cc: <stable@vger.kernel.org> # 3.10+ Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-08-24bfs: iget_locked() doesn't return an ERR_PTRDan Carpenter1-1/+1
iget_locked() returns a NULL on error, it doesn't return an ERR_PTR. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-08-24efs: iget_locked() doesn't return an ERR_PTR()Dan Carpenter1-1/+1
The iget_locked() function returns NULL on error and never an ERR_PTR. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-08-24proc: kill the extra proc_readfd_common()->dir_emit_dots()Oleg Nesterov1-2/+0
proc_readfd_common() does dir_emit_dots() twice in a row, we need to do this only once. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-08-24cope with potentially long ->d_dname() output for shmem/hugetlbAl Viro2-7/+12
dynamic_dname() is both too much and too little for those - the output may be well in excess of 64 bytes dynamic_dname() assumes to be enough (thanks to ashmem feeding really long names to shmem_file_setup()) and vsnprintf() is an overkill for those guys. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-08-23nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP ↵Vyacheslav Dubeyko1-1/+1
error detection Fix the issue with improper counting number of flying bio requests for BIO_EOPNOTSUPP error detection case. The sb_nbio must be incremented exactly the same number of times as complete() function was called (or will be called) because nilfs_segbuf_wait() will call wail_for_completion() for the number of times set to sb_nbio: do { wait_for_completion(&segbuf->sb_bio_event); } while (--segbuf->sb_nbio > 0); Two functions complete() and wait_for_completion() must be called the same number of times for the same sb_bio_event. Otherwise, wait_for_completion() will hang or leak. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-23nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPNOTSUPP ↵Vyacheslav Dubeyko1-2/+1
error Remove double call of bio_put() in nilfs_end_bio_write() for the case of BIO_EOPNOTSUPP error detection. The issue was found by Dan Carpenter and he suggests first version of the fix too. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-22sysfs: group.c: fix up kerneldocGreg Kroah-Hartman1-2/+2
Fix up the wording of sysfs_create/remove_groups() a bit. Reported-by: Anthony Foiani <tkil@scrye.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: sysfs.h: fix coding style issuesGreg Kroah-Hartman1-8/+10
This fixes up the remaining coding style issues in sysfs.h Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: file.c: fix up broken string warningsGreg Kroah-Hartman1-4/+6
This fixes the coding style warnings in fs/sysfs/file.c for broken strings across lines. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: dir.c: fix up odd do/while indentationGreg Kroah-Hartman1-7/+8
This fixes up the odd do/while after an if statement warning in dir.c Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: fix up uaccess.h coding style warningsGreg Kroah-Hartman2-3/+2
This fixes the uaccess.h warnings in the sysfs.c files. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: fix up 80 column coding style issuesGreg Kroah-Hartman4-10/+15
This fixes up the 80 column coding style issues in the sysfs .c files. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: fix up space coding style issuesGreg Kroah-Hartman6-36/+36
This fixes up all of the space-related coding style issues for the sysfs code. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: remove trailing whitespaceGreg Kroah-Hartman4-15/+13
This removes all trailing whitespace errors in the sysfs code. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: fix placement of EXPORT_SYMBOL()Greg Kroah-Hartman3-20/+8
The export should happen after the function, not at the bottom of the file, so fix that up. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: group: update copyright to add myself and the LFGreg Kroah-Hartman1-0/+2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: group.c: add kerneldoc for sysfs_remove_groupGreg Kroah-Hartman1-2/+10
sysfs_remove_group() never had kerneldoc, so add it, and fix up the kerneldoc for sysfs_remove_groups() which didn't specify the parameters properly. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: group.c: fix up broken string coding styleGreg Kroah-Hartman1-2/+3
checkpatch complains about the broken string in the file, and it's correct, so fix it up. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: group.c: fix up some * coding style issuesGreg Kroah-Hartman1-6/+6
This fixes up the * coding style warnings for the group.c sysfs file. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: group.c: fix trailing whitespaceGreg Kroah-Hartman1-2/+2
There was some trailing spaces in the file, fix that up. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: group.c: move EXPORT_SYMBOL_GPL() to the proper locationGreg Kroah-Hartman1-6/+3
This fixes up the coding style issue of incorrectly placing the EXPORT_SYMBOL_GPL() macro, it should be right after the function itself, not at the end of the file. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21sysfs: add sysfs_create/remove_groups()Greg Kroah-Hartman1-0/+54
These functions are being open-coded in 3 different places in the driver core, and other driver subsystems will want to start doing this as well, so move it to the sysfs core to keep it all in one place, where we know it is written properly. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21[SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signalRoland Dreier1-5/+15
There is a nasty bug in the SCSI SG_IO ioctl that in some circumstances leads to one process writing data into the address space of some other random unrelated process if the ioctl is interrupted by a signal. What happens is the following: - A process issues an SG_IO ioctl with direction DXFER_FROM_DEV (ie the underlying SCSI command will transfer data from the SCSI device to the buffer provided in the ioctl) - Before the command finishes, a signal is sent to the process waiting in the ioctl. This will end up waking up the sg_ioctl() code: result = wait_event_interruptible(sfp->read_wait, (srp_done(sfp, srp) || sdp->detached)); but neither srp_done() nor sdp->detached is true, so we end up just setting srp->orphan and returning to userspace: srp->orphan = 1; write_unlock_irq(&sfp->rq_list_lock); return result; /* -ERESTARTSYS because signal hit process */ At this point the original process is done with the ioctl and blithely goes ahead handling the signal, reissuing the ioctl, etc. - Eventually, the SCSI command issued by the first ioctl finishes and ends up in sg_rq_end_io(). At the end of that function, we run through: write_lock_irqsave(&sfp->rq_list_lock, iflags); if (unlikely(srp->orphan)) { if (sfp->keep_orphan) srp->sg_io_owned = 0; else done = 0; } srp->done = done; write_unlock_irqrestore(&sfp->rq_list_lock, iflags); if (likely(done)) { /* Now wake up any sg_read() that is waiting for this * packet. */ wake_up_interruptible(&sfp->read_wait); kill_fasync(&sfp->async_qp, SIGPOLL, POLL_IN); kref_put(&sfp->f_ref, sg_remove_sfp); } else { INIT_WORK(&srp->ew.work, sg_rq_end_io_usercontext); schedule_work(&srp->ew.work); } Since srp->orphan *is* set, we set done to 0 (assuming the userspace app has not set keep_orphan via an SG_SET_KEEP_ORPHAN ioctl), and therefore we end up scheduling sg_rq_end_io_usercontext() to run in a workqueue. - In workqueue context we go through sg_rq_end_io_usercontext() -> sg_finish_rem_req() -> blk_rq_unmap_user() -> ... -> bio_uncopy_user() -> __bio_copy_iov() -> copy_to_user(). The key point here is that we are doing copy_to_user() on a workqueue -- that is, we're on a kernel thread with current->mm equal to whatever random previous user process was scheduled before this kernel thread. So we end up copying whatever data the SCSI command returned to the virtual address of the buffer passed into the original ioctl, but it's quite likely we do this copying into a different address space! As suggested by James Bottomley <James.Bottomley@hansenpartnership.com>, add a check for current->mm (which is NULL if we're on a kernel thread without a real userspace address space) in bio_uncopy_user(), and skip the copy if we're on a kernel thread. There's no reason that I can think of for any caller of bio_uncopy_user() to want to do copying on a kernel thread with a random active userspace address space. Huge thanks to Costa Sapuntzakis <costa@purestorage.com> for the original pointer to this bug in the sg code. Signed-off-by: Roland Dreier <roland@purestorage.com> Tested-by: David Milburn <dmilburn@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-19proc: more readdir conversion bug-fixesLinus Torvalds1-1/+1
In the previous commit, Richard Genoud fixed proc_root_readdir(), which had lost the check for whether all of the non-process /proc entries had been returned or not. But that in turn exposed _another_ bug, namely that the original readdir conversion patch had yet another problem: it had lost the return value of proc_readdir_de(), so now checking whether it had completed successfully or not didn't actually work right anyway. This reinstates the non-zero return for the "end of base entries" that had also gotten lost in commit f0c3b5093add ("[readdir] convert procfs"). So now you get all the base entries *and* you get all the process entries, regardless of getdents buffer size. (Side note: the Linux "getdents" manual page actually has a nice example application for testing getdents, which can be easily modified to use different buffers. Who knew? Man-pages can be useful) Reported-by: Emmanuel Benisty <benisty.e@gmail.com> Reported-by: Marc Dionne <marc.c.dionne@gmail.com> Cc: Richard Genoud <richard.genoud@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-19pstore/ram: Read and write to the 'compressed' flag of pstoreAruna Balakrishnaiah1-8/+28
In pstore write, add character 'C'(compressed) or 'D'(decompressed) in the header while writing to Ram persistent buffer. In pstore read, read the header and update the 'compressed' flag accordingly. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-08-19pstore: Add file extension to pstore file if compressedAruna Balakrishnaiah3-6/+10
In case decompression fails, add a ".enc.z" to indicate the file has compressed data. This will help user space utilities to figure out the file contents. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-08-19pstore: Add decompression support to pstoreAruna Balakrishnaiah1-2/+51
Based on the flag 'compressed' set or not, pstore will decompress the data returning a plain text file. If decompression fails for a particular record it will have the compressed data in the file which can be decompressed with 'openssl' command line tool. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-08-19pstore: Introduce new argument 'compressed' in the read callbackAruna Balakrishnaiah2-2/+5
Backends will set the flag 'compressed' after reading the log from persistent store to indicate the data being returned to pstore is compressed or not. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-08-19pstore: Add compression support to pstoreAruna Balakrishnaiah1-9/+139
Add compression support to pstore which will help in capturing more data. Initially, pstore will make a call to kmsg_dump with a bigger buffer and will pass the size of bigger buffer to kmsg_dump and then compress the data to registered buffer of registered size. In case compression fails, pstore will capture the uncompressed data by making a call again to kmsg_dump with registered_buffer of registered size. Pstore will indicate the data is compressed or not with a flag in the write callback. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-08-19pstore/Kconfig: Select ZLIB_DEFLATE and ZLIB_INFLATE when PSTORE is selectedAruna Balakrishnaiah1-0/+2
Pstore will make use of deflate and inflate algorithm to compress and decompress the data. So when Pstore is enabled select zlib_deflate and zlib_inflate. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-08-19pstore: Add new argument 'compressed' in pstore write callbackAruna Balakrishnaiah2-4/+5
Addition of new argument 'compressed' in the write call back will help the backend to know if the data passed from pstore is compressed or not (In case where compression fails.). If compressed, the backend can add a tag indicating the data is compressed while writing to persistent store. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-08-19pstore: d_alloc_name() doesn't return an ERR_PTRDan Carpenter1-2/+1
d_alloc_name() returns NULL on error. Also I changed the error code from -ENOSPC to -ENOMEM to reflect that we were short on RAM not disk space. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-08-19proc: return on proc_readdir errorRichard Genoud1-1/+3
Commit f0c3b5093add ("[readdir] convert procfs") introduced a bug on the listing of the proc file-system. The return value of proc_readdir() isn't tested anymore in the proc_root_readdir function. This lead to an "interesting" behaviour when we are using the getdents() system call with a buffer too small: instead of failing, it returns the first entries of /proc (enough to fill the given buffer), plus the PID directories. This is not triggered on glibc (as getdents is called with a 32KB buffer), but on uclibc, the buffer size is only 1KB, thus some proc entries are missing. See https://lkml.org/lkml/2013/8/12/288 for more background. Signed-off-by: Richard Genoud <richard.genoud@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixesLinus Torvalds4-11/+23
Pull gfs2 fixes from Steven Whitehouse: "Out of these five patches, the one for ensuring that the number of revokes is not exceeded, and the one for checking the glock is not already held in gfs2_getxattr are the two most important. The latter can be triggered by selinux. The other three patches are very small and fix mostly fairly trivial issues" * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes: GFS2: Check for glock already held in gfs2_getxattr GFS2: alloc_workqueue() doesn't return an ERR_PTR GFS2: don't overrun reserved revokes GFS2: WQ_NON_REENTRANT is meaningless and going away GFS2: Fix typo in gfs2_create_inode()
2013-08-19GFS2: Check for glock already held in gfs2_getxattrSteven Whitehouse1-0/+4
Since the introduction of atomic_open, gfs2_getxattr can be called with the glock already held, so we need to allow for this. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reported-by: David Teigland <teigland@redhat.com> Tested-by: David Teigland <teigland@redhat.com>