linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2013-10-29	perf/x86: Further optimize copy_from_user_nmi()	Peter Zijlstra	2	-48/+36
	Now that we can deal with nested NMI due to IRET re-enabling NMIs and can deal with faults from NMI by making sure we preserve CR2 over NMIs we can in fact simply access user-space memory from NMI context. So rewrite copy_from_user_nmi() to use __copy_from_user_inatomic() and rework the fault path to do the minimal required work before taking the in_atomic() fault handler. In particular avoid perf_sw_event() which would make perf recurse on itself (it should be harmless as our recursion protections should be able to deal with this -- but why tempt fate). Also rename notify_page_fault() to kprobes_fault() as that is a much better name; there is no notifier in it and its specific to kprobes. Don measured that his worst case NMI path shrunk from ~300K cycles to ~150K cycles. Cc: Stephane Eranian <eranian@google.com> Cc: jmario@redhat.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: dave.hansen@linux.intel.com Tested-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131024105206.GM2490@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	perf: Change zero-padding of strings in perf_event_mmap_event()	Peter Zijlstra	1	-6/+11
	Oleg complained about the excessive 0-ing in perf_event_mmap_event(), so try and be smarter about it while keeping it fairly fool proof and avoid leaking random bits out to userspace. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-8jirlm99m6if2z13wd6rbyu6@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	perf: Do not waste PAGE_SIZE bytes for ALIGN(8) in perf_event_mmap_event()	Oleg Nesterov	1	-7/+8
	perf_event_mmap_event() does kzalloc(PATH_MAX + sizeof(u64)) to ensure we can align the size later. However this means that we actually allocate PAGE_SIZE * 2 buffer, seems too much. Change this code to allocate PATH_MAX==PAGE_SIZE bytes, but tell d_path() to not use the last sizeof(u64) bytes. Note: it is not clear why do we need __GFP_ZERO, see the next patch. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131016201004.GC23214@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	perf: Kill the dead !vma->vm_mm code in perf_event_mmap_event()	Oleg Nesterov	1	-8/+6
	1. perf_event_mmap(vma) is never called with a gate_vma-like arg, remove the "if (!vma->vm_mm)" code. 2. arch_vma_name() can use the chached value of mmap_event->vma. 3. Change the code to not call arch_vma_name() twice. 4. Purely cosmetic, but since we use "goto got_name" all the time remove "else" from "[stack]" branch just for symmetry. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131016200945.GB23214@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	perf: Remove useless atomic_t	Peter Zijlstra	1	-9/+9
	There's nothing atomic about atomic_set vs atomic_read; so remove the atomic_t usage. Also, make running_sample_length static as it really is (and should be) local to this translation unit. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: eranian@google.com Cc: Don Zickus <dzickus@redhat.com> Cc: jmario@redhat.com Cc: acme@infradead.org Cc: dave.hansen@linux.intel.com Link: http://lkml.kernel.org/n/tip-vw9lg588x1ic248whybjon0c@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29	Merge branch 'perf/urgent' into perf/core	Ingo Molnar	406	-2363/+3490
	Conflicts: tools/perf/builtin-record.c tools/perf/builtin-top.c tools/perf/util/hist.h
2013-10-29	Merge tag 'perf-urgent-for-mingo' of ↵	Ingo Molnar	20	-67/+151
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: * Add color overhead for stdio output buffer, which fixes --stdio output being chopped up on the hot (red) entries, fix from Jiri Olsa. * Get 'perf record -g -a sleep 1' working again, removing the need for -- separating perf options from the workload, restoring ages old behaviour, fix from Jiri Olsa. More patches allowing ~/.perfconfig setting up of default callchain collecting method ("fp" or "dwarf") left for next merge window. * Fixup mmap event consumption, where we were acking the consumption by writing the tail before actually accessing the event, which could lead to using overwritten records in things like 'perf record --call-graph'. From Zhouyi Zhou. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-28	perf tools: Fixup mmap event consumption	Zhouyi Zhou	14	-16/+49
	The tail position of the event buffer should only be modified after actually use that event. If not the event buffer could be invalid before use, and segment fault occurs when invoking perf top -G. Signed-off-by: Zhouyi Zhou <yizhouzhou@ict.ac.cn> Cc: David Ahern <dsahern@gmail.com> Cc: Zhouyi Zhou <yizhouzhou@ict.ac.cn> Link: http://lkml.kernel.org/r/1382600613-32177-1-git-send-email-zhouzhouyi@gmail.com [ Simplified the logic using exit gotos and renamed write_tail method to mmap_consume ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-28	perf top: Split -G and --call-graph	Jiri Olsa	2	-23/+18
	Splitting -G and --call-graph for record command, so we could use '-G' with no option. The '-G' option now takes NO argument and enables the configured unwind method, which is currently the frame pointers method. It will be possible to configure unwind method via config file in upcoming patches. All current '-G' arguments is overtaken by --call-graph option. NOTE: The documentation for top --call-graph option was wrongly copied from report command. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Tested-by: David Ahern <dsahern@gmail.com> Tested-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1382797536-32303-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-28	perf record: Split -g and --call-graph	Jiri Olsa	3	-23/+67
	Splitting -g and --call-graph for record command, so we could use '-g' with no option. The '-g' option now takes NO argument and enables the configured unwind method, which is currently the frame pointers method. It will be possible to configure unwind method via config file in upcoming patches. All current '-g' arguments is overtaken by --call-graph option. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Tested-by: David Ahern <dsahern@gmail.com> Tested-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1382797536-32303-2-git-send-email-jolsa@redhat.com [ reordered -g/--call-graph on --help and expanded the man page according to comments by David Ahern and Namhyung Kim ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-28	perf hists: Add color overhead for stdio output buffer	Jiri Olsa	2	-5/+17
	Following commit tightened up the buffer size for output to strict width of used format columns: 99cf666 perf hists: Fix formatting of long symbol names This works fine until you hit color overhead output which places extra bytes into output buffer. We need to account for color overhead in the output buffer. Adding maximum color byte size to the output buffer size. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1382700293-1803-1-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-28	Merge tag 'perf-urgent-for-mingo' of ↵	Ingo Molnar	2	-14/+25
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: * Fix up /proc/PID/maps parsing, where perfectly fine mmap entries were being trown away when synthesizing PERF_RECORD_MMAP for preexisting threads, prevenging symbol resolution to work for those threads, broken in the MMAP2 removal. Reported and pinpointed by Markus Trippelsdorf, * Fix mem leak in the python 'perf script' backend, due to missing Py_DECREFs on dict entries, fix from Joseph Schuchart. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-28	perf tools: Fix up /proc/PID/maps parsing	Arnaldo Carvalho de Melo	1	-1/+1
	When introducing support for MMAP2 we considered more parts of each map representation in /proc/PID/maps, and when disabling it we forgot to reduce the number of expected parsed/assigned entries in the sscanf call, fix it to expect the right number of desired fields, 5. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Based-on-a-patch-by: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-vrbo1wik997ahjzl1chm3bdm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-27	Linux 3.12-rc7v3.12-rc7	Linus Torvalds	1	-1/+1

2013-10-27	Merge branch 'parisc-3.12' of ↵	Linus Torvalds	1	-0/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fix from Helge Deller: "This is a 2-line patch to save the CPU register which holds our task thread info pointer before calling a firmware function and then to restore it again afterwards. This is necessary because on some 64bit machines the high-order 32bits are being clobbered by the firmware call, and thus we failed to bring up secondary CPUs (and instead crashed the kernel) in some situations eg if we had more than 4GB RAM. This patch fixes a bug which has been since ever in the parisc linux kernel and which prevented some people to use a 64bit kernel" * 'parisc-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Do not crash 64bit SMP kernels on machines with >= 4GB RAM
2013-10-27	Merge branch 'timers-urgent-for-linus' of ↵	Linus Torvalds	1	-15/+50
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "This tree contains a clockevents regression fix for certain ARM subarchitectures" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clockevents: Sanitize ticks to nsec conversion
2013-10-27	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds	5	-20/+19
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "The tree contains three fixes: - Two tooling fixes - Reversal of the new 'MMAP2' extended mmap record ABI, introduced in this merge window. (Patches were proposed to fix it but it was all a bit late and we felt it's safer to just delay the ABI one more kernel release and do it right)" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Disable PERF_RECORD_MMAP2 support perf scripting perl: Fix build error on Fedora 12 perf probe: Fix to initialize fname always before use it
2013-10-27	Merge branch 'core-urgent-for-linus' of ↵	Linus Torvalds	1	-16/+16
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "This tree fixes a boot crash in CONFIG_DEBUG_MUTEXES=y kernels, on kernels built with GCC 3.x (there are still such distros)" Side note: it's not just a fix for old gcc versions, it's also removing an incredibly broken/subtle check that LLVM had issues with, and that made no sense. * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: mutex: Avoid gcc version dependent __builtin_constant_p() usage
2013-10-27	Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending	Linus Torvalds	6	-24/+50
	Pull SCSI target fixes from Nicholas Bellinger: "Here are the outstanding target pending fixes for v3.12-rc7. This includes a number of EXTENDED_COPY related fixes as a result of Thomas and Doug's continuing testing and feedback. Also included is an important vhost/scsi fix that addresses a long standing issue where the 'write' parameter for get_user_pages_fast() was incorrectly set for virtio-scsi WRITEs -> DMA_TO_DEVICE, and not for virtio-scsi READs -> DMA_FROM_DEVICE. This resulted in random userspace segfaults and other unpleasantness on KVM host, and unfortunately has been an issue since the initial merge of vhost/scsi in v3.6. This patch is CC'ed to stable, along with two other less critical items" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: vhost/scsi: Fix incorrect usage of get_user_pages_fast write parameter target/pscsi: fix return value check target: Fail XCOPY for non matching source + destination block_size target: Generate failure for XCOPY I/O with non-zero scsi_status target: Add missing XCOPY I/O operation sense_buffer iser-target: check device before dereferencing its variable target: Return an error for WRITE SAME with ANCHOR==1 target: Fix assignment of LUN in tracepoints target: Reject EXTENDED_COPY when emulate_3pc is disabled target: Allow non zero ListID in EXTENDED_COPY parameter list target: Make target_do_xcopy failures return INVALID_PARAMETER_LIST
2013-10-27	Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma	Linus Torvalds	2	-1/+8
	Pull slave-dmaengine fixes from Vinod Koul: "Here is the late fixes pull request for dmaengine while you fly back from KS. We have a new dmaengine ML hosted by vger so a patch for that along with addition of Dave as driver mainatainer for ioat. Other fixes are memeory leak fixes on edma driver, small fixes on rcar-hpbdma driver by Sergei" * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: edma: fix another memory leak dma: edma: Fix memory leak MAINTAINERS: add to ioatdma maintainer list MAINTAINERS: add the new dmaengine mailing list
2013-10-27	parisc: Do not crash 64bit SMP kernels on machines with >= 4GB RAM	Helge Deller	1	-0/+4
	Since the beginning of the parisc-linux port, sometimes 64bit SMP kernels were not able to bring up other CPUs than the monarch CPU and instead crashed the kernel. The reason was unclear, esp. since it involved various machines (e.g. J5600, J6750 and SuperDome). Testing showed, that those crashes didn't happened when less than 4GB were installed, or if a 32bit Linux kernel was booted. In the end, the fix for those SMP problems is trivial: During the early phase of the initialization of the CPUs, including the monarch CPU, the PDC_PSW firmware function to enable WIDE (=64bit) mode is called. It's documented that this firmware function may clobber various registers, and one one of those possibly clobbered registers is %cr30 which holds the task thread info pointer. Now, if %cr30 would always have been clobbered, then this bug would have been detected much earlier. But lots of testing finally showed, that - at least for %cr30 - on some machines only the upper 32bits of the 64bit register suddenly turned zero after the firmware call. So, after finding the root cause, the explanation for the various crashes became clear: - On 32bit SMP Linux kernels all upper 32bit were zero, so we didn't faced this problem. - Monarch CPUs in 64bit mode always booted sucessfully, because the inital task thread info pointer was below 4GB. - Secondary CPUs booted sucessfully on machines with less than 4GB RAM because the upper 32bit were zero anyay. - Secondary CPus failed to boot if we had more than 4GB RAM and the task thread info pointer was located above the 4GB boundary. Finally, the patch to fix this problem is trivial by saving the %cr30 register before the firmware call and restoring it afterwards. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: <stable@vger.kernel.org> # 2.6.12+ Signed-off-by: Helge Deller <deller@gmx.de>
2013-10-26	Merge tag 'pm+acpi-3.12-rc7' of ↵	Linus Torvalds	3	-25/+23
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management fixes from "These fix two bugs in the intel_pstate driver, a hibernate bug leading to nasty resume failures sometimes and acpi-cpufreq initialization bug that causes problems to happen during module unload when intel_pstate is in use. Specifics: - Fix for rounding errors in intel_pstate causing CPU utilization to be underestimated from Brennan Shacklett. - intel_pstate fix to always use the correct max pstate value when computing the min pstate from Dirk Brandewie. - Hibernation fix for deadlocking resume in cases when the probing of the device containing the image is deferred from Russ Dill. - acpi-cpufreq fix to prevent the module from staying in memory when the driver cannot be registered and then attempting to unregister things that have never been registered on exit" * tag 'pm+acpi-3.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: acpi-cpufreq: Fail initialization if driver cannot be registered PM / hibernate: Move software_resume to late_initcall_sync intel_pstate: Correct calculation of min pstate value intel_pstate: Improve accuracy by not truncating until final result
2013-10-25	Merge tag 'for-linus-20131025' of git://git.infradead.org/linux-mtd	Linus Torvalds	2	-2/+7
	Pull final mtd fixes from Brian Norris: "A few more last-minute regression fixes, prepared jointly by me and David Woodhouse: - Revert pxa3xx to its old name to avoid breaking existing 'mtdparts=' boot strings. - Return GPMI NAND to its legacy ECC layout for backwards compatibility. We will revisit this in 3.13. A note from David on the latter fix: 'This leaves a harmless cosmetic warning about an unused function. At this point in the cycle I really don't care.'" * tag 'for-linus-20131025' of git://git.infradead.org/linux-mtd: mtd: gpmi: fix ECC regression mtd: nand: pxa3xx: Fix registered MTD name
2013-10-25	vhost/scsi: Fix incorrect usage of get_user_pages_fast write parameter	Nicholas Bellinger	1	-1/+1
	This patch addresses a long-standing bug where the get_user_pages_fast() write parameter used for setting the underlying page table entry permission bits was incorrectly set to write=1 for data_direction=DMA_TO_DEVICE, and passed into get_user_pages_fast() via vhost_scsi_map_iov_to_sgl(). However, this parameter is intended to signal WRITEs to pinned userspace PTEs for the virtio-scsi DMA_FROM_DEVICE -> READ payload case, and not for the virtio-scsi DMA_TO_DEVICE -> WRITE payload case. This bug would manifest itself as random process segmentation faults on KVM host after repeated vhost starts + stops and/or with lots of vhost endpoints + LUNs. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Asias He <asias@redhat.com> Cc: <stable@vger.kernel.org> # 3.6+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-10-25	target/pscsi: fix return value check	Wei Yongjun	1	-4/+4
	In case of error, the function scsi_host_lookup() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-10-25	Merge branch 'for-linus' of ↵	Linus Torvalds	2	-2/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes (try two) from Al Viro: "nfsd performance regression fix + seq_file lseek(2) fix" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: seq_file: always update file->f_pos in seq_lseek() nfsd regression since delayed fput()
2013-10-25	mtd: gpmi: fix ECC regression	David Woodhouse	1	-1/+1
	The "legacy" ECC layout used until 3.12-rc1 uses all the OOB area by computing the ECC strength and ECC step size ourselves. Commit 2febcdf84b ("mtd: gpmi: set the BCHs geometry with the ecc info") makes the driver use the ECC info (ECC strength and ECC step size) provided by the MTD code, and creates a different NAND ECC layout for the BCH, and use the new ECC layout. This causes a regression: We can not mount the ubifs which was created by the old NAND ECC layout. This patch fixes this issue by reverting to the legacy ECC layout. We will probably introduce a new device-tree property to indicate that the new ECC layout can be used. For now though, for the imminent 3.12 release, we just unconditionally revert to the 3.11 behaviour. This leaves a harmless cosmetic warning about an unused function. At this point in the cycle I really don't care. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Acked-by: Huang Shijie <b32955@freescale.com> Acked-by: Marek Vasut <marex@denx.de> Tested-by: Marek Vasut <marex@denx.de>
2013-10-25	seq_file: always update file->f_pos in seq_lseek()	Gu Zheng	1	-0/+2
	This issue was first pointed out by Jiaxing Wang several months ago, but no further comments: https://lkml.org/lkml/2013/6/29/41 As we know pread() does not change f_pos, so after pread(), file->f_pos and m->read_pos become different. And seq_lseek() does not update file->f_pos if offset equals to m->read_pos, so after pread() and seq_lseek()(lseek to m->read_pos), then a subsequent read may read from a wrong position, the following program produces the problem: char str1[32] = { 0 }; char str2[32] = { 0 }; int poffset = 10; int count = 20; /open any seq file/ int fd = open("/proc/modules", O_RDONLY); pread(fd, str1, count, poffset); printf("pread:%s\n", str1); /seek to where m->read_pos is/ lseek(fd, poffset+count, SEEK_SET); /supposed to read from poffset+count, but this read from position 0/ read(fd, str2, count); printf("read:%s\n", str2); out put: pread: ck_netbios_ns 12665 read: nf_conntrack_netbios /proc/modules: nf_conntrack_netbios_ns 12665 0 - Live 0xffffffffa038b000 nf_conntrack_broadcast 12589 1 nf_conntrack_netbios_ns, Live 0xffffffffa0386000 So we always update file->f_pos to offset in seq_lseek() to fix this issue. Signed-off-by: Jiaxing Wang <hello.wjx@gmail.com> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-10-25	acpi-cpufreq: Fail initialization if driver cannot be registered	Rafael J. Wysocki	1	-4/+4
	Make acpi_cpufreq_init() return error codes when the driver cannot be registered so that the module doesn't stay useless in memory and so that acpi_cpufreq_exit() doesn't attempt to unregister things that have never been registered when the module is unloaded. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2013-10-25	Merge tag 'fixes-for-linus' of ↵	Linus Torvalds	2	-7/+73
	git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "There's really only one bugfix in this branch, which is a fix for timers on the integrator platform. Since Linus Walleij is resurrecting support for the platform it seems valuable to get the fix into 3.12 even though the regression has been around a while. The rest are a handful of maintainers updates. If you prefer to hold those until 3.13 then just merge the first patch on the branch which is the fix" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: MAINTAINERS: Add maintainers entry for Rockchip SoCs MAINTAINERS: Tegra updates, and driver ownership MAINTAINERS: ARM: mvebu: add Sebastian Hesselbarth ARM: integrator: deactivate timer0 on the Integrator/CP
2013-10-25	Merge tag 'ecryptfs-3.12-rc7-fixes' of ↵	Linus Torvalds	2	-2/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull ecryptfs fixes from Tyler Hicks: "Two important fixes - Fix long standing memory leak in the (rarely used) public key support - Fix large file corruption on 32 bit architectures" * tag 'ecryptfs-3.12-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: eCryptfs: fix 32 bit corruption issue ecryptfs: Fix memory leakage in keystore.c
2013-10-25	PM / hibernate: Move software_resume to late_initcall_sync	Russ Dill	1	-1/+1
	software_resume is being called after deferred_probe_initcall in drivers base. If the probing of the device that contains the resume image is deferred, and the system has been instructed to wait for it to show up, this wait will occur in software_resume. This causes a deadlock. Move software_resume into late_initcall_sync so that it happens after all the other late_initcalls. Signed-off-by: Russ Dill <Russ.Dill@ti.com> Acked-by: Pavel Machek <Pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-24	mtd: nand: pxa3xx: Fix registered MTD name	Ezequiel Garcia	1	-1/+6
	In a recent commit: commit f455578dd961087a5cf94730d9f6489bb1d355f0 Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Date: Mon Aug 12 14:14:53 2013 -0300 mtd: nand: pxa3xx: Remove hardcoded mtd name There's no advantage in using a hardcoded name for the mtd device. Instead use the provided by the platform_device. The MTD name was changed to use the one provided by the platform_device. However, this can be problematic as some users want to set partitions using the kernel parameter 'mtdparts', where the name is needed. Therefore, to avoid regressions in users relying in 'mtdparts' we revert the change and use the previous one 'pxa3xx_nand-0'. While at it, let's put a big comment and prevent this change from happening ever again. Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2013-10-24	eCryptfs: fix 32 bit corruption issue	Colin Ian King	1	-1/+1
	Shifting page->index on 32 bit systems was overflowing, causing data corruption of > 4GB files. Fix this by casting it first. https://launchpad.net/bugs/1243636 Signed-off-by: Colin Ian King <colin.king@canonical.com> Reported-by: Lars Duesing <lars.duesing@camelotsweb.de> Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2013-10-24	dmaengine: edma: fix another memory leak	Vinod Koul	1	-0/+1
	commit 4b6271a6 fix a menory leak but one more existed in driver so fix that Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-10-24	dma: edma: Fix memory leak	Valentin Ilie	1	-0/+1
	When it fails to allocate a slot, edesc should be free'd before return; Signed-off-by: Valentin Ilie <valentin.ilie@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-10-24	perf script python: Fix mem leak due to missing Py_DECREFs on dict entries	Joseph Schuchart	1	-13/+24
	We are using the Python scripting interface in perf to extract kernel events relevant for performance analysis of HPC codes. We noticed that the "perf script" call allocates a significant amount of memory (in the order of several 100 MiB) during it's run, e.g. 125 MiB for a 25 MiB input file: $> perf record -o perf.data -a -R -g fp \ -e power:cpu_frequency -e sched:sched_switch \ -e sched:sched_migrate_task -e sched:sched_process_exit \ -e sched:sched_process_fork -e sched:sched_process_exec \ -e cycles -m 4096 --freq 4000 $> /usr/bin/time perf script -i perf.data -s dummy_script.py 0.84user 0.13system 0:01.92elapsed 51%CPU (0avgtext+0avgdata 125532maxresident)k 73072inputs+0outputs (57major+33086minor)pagefaults 0swaps Upon further investigation using the valgrind massif tool, we noticed that Python objects that are created in trace-event-python.c via PyString_FromString*() (and their Integer and Long counterparts) are never free'd. The reason for this seem to be missing Py_DECREF calls on the objects that are returned by these functions and stored in the Python dictionaries. The Python dictionaries do not steal references (as opposed to Python tuples and lists) but instead add their own reference. Hence, the reference that is returned by these object creation functions is never released and the memory is leaked. (see [1,2]) The attached patch fixes this by wrapping all relevant calls to PyDict_SetItemString() and decrementing the reference counter immediately after the Python function call. This reduces the allocated memory to a reasonable amount: $> /usr/bin/time perf script -i perf.data -s dummy_script.py 0.73user 0.05system 0:00.79elapsed 99%CPU (0avgtext+0avgdata 49132maxresident)k 0inputs+0outputs (0major+14045minor)pagefaults 0swaps For comparison, with a 120 MiB input file the memory consumption reported by time drops from almost 600 MiB to 146 MiB. The patch has been tested using Linux 3.8.2 with Python 2.7.4 and Linux 3.11.6 with Python 2.7.5. Please let me know if you need any further information. [1] http://docs.python.org/2/c-api/tuple.html#PyTuple_SetItem [2] http://docs.python.org/2/c-api/dict.html#PyDict_SetItemString Signed-off-by: Joseph Schuchart <joseph.schuchart@tu-dresden.de> Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Link: http://lkml.kernel.org/r/1381468543-25334-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-24	target: Fail XCOPY for non matching source + destination block_size	Nicholas Bellinger	1	-1/+13
	This patch adds an explicit check + failure for XCOPY I/O to source + destination devices with a non-matching block_size. This limitiation is currently due to the fact that the scatterlist memory allocated for the XCOPY READ operation is passed zero-copy to the XCOPY WRITE operation. Reported-by: Thomas Glanzmann <thomas@glanzmann.de> Reported-by: Douglas Gilbert <dgilbert@interlog.com> Cc: Thomas Glanzmann <thomas@glanzmann.de> Cc: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-10-24	target: Generate failure for XCOPY I/O with non-zero scsi_status	Nicholas Bellinger	1	-1/+2
	This patch adds the missing non-zero se_cmd->scsi_status check required for local XCOPY I/O within target_xcopy_issue_pt_cmd() to signal an exception case failure. This will trigger the generation of SAM_STAT_CHECK_CONDITION status from within target_xcopy_do_work() process context code. Reported-by: Thomas Glanzmann <thomas@glanzmann.de> Reported-by: Douglas Gilbert <dgilbert@interlog.com> Cc: Thomas Glanzmann <thomas@glanzmann.de> Cc: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-10-24	target: Add missing XCOPY I/O operation sense_buffer	Nicholas Bellinger	1	-2/+3
	This patch adds the missing xcopy_pt_cmd->sense_buffer[] required for correctly handling CHECK_CONDITION exceptions within the locally generated XCOPY I/O path. Also update target_xcopy_read_source() + target_xcopy_setup_pt_cmd() to pass this buffer into transport_init_se_cmd() to correctly setup se_cmd->sense_buffer. Reported-by: Thomas Glanzmann <thomas@glanzmann.de> Reported-by: Douglas Gilbert <dgilbert@interlog.com> Cc: Thomas Glanzmann <thomas@glanzmann.de> Cc: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-10-24	Merge tag 'perf-core-for-mingo' of ↵	Ingo Molnar	32	-390/+560
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: * Show progress on histogram collapsing, that can take a long time, from Namhyung Kim. * Support "$vars" meta argument syntax for local variables, allowing asking for all possible variables at a given probe point to be collected when it hits, from Masami Hiramatsu. * Address the root cause of that 'perf sched' stack initialization build slowdown, by programmatically setting a big array after moving the global variable back to the stack. Fix from Adrian Hunter. * Do not repipe attributes to a perf.data file in 'perf inject', fix from Adrian Hunter * Change the procps visible command-name of invididual benchmark tests plus cleanups, from Ingo Molnar. * Do not accept parse_tag_value() overflow, fix from Adrian Hunter. * Validate that mmap_pages is not too big. From Adrian Hunter. * Fix non-debug build, from Adrian Hunter. * Clarify the "sample parsing" test entry, from Arnaldo Carvalho de Melo. * Consider PERF_SAMPLE_TRANSACTION in the "sample parsing" test, from Arnaldo Carvalho de Melo. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-24	Merge tag 'md/3.12-fixes' of git://neil.brown.name/md	Linus Torvalds	4	-2/+25
	Pull md bugfixes from Neil Brown: "Assorted md bug-fixes for 3.12. All tagged for -stable releases too" * tag 'md/3.12-fixes' of git://neil.brown.name/md: raid5: avoid finding "discard" stripe raid5: set bio bi_vcnt 0 for discard request md: avoid deadlock when md_set_badblocks. md: Fix skipping recovery for read-only arrays.
2013-10-24	Merge tag 'scsi-fixes' of ↵	Linus Torvalds	4	-10/+19
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of two fixes which cause oopses (Buslogic, qla2xxx) and one fix which may cause a hang because of request miscounting (sd)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: [SCSI] sd: call blk_pm_runtime_init before add_disk [SCSI] qla2xxx: Fix request queue null dereference. [SCSI] BusLogic: Fix an oops when intializing multimaster adapter
2013-10-23	iser-target: check device before dereferencing its variable	Vu Pham	1	-1/+1
	This patch changes isert_connect_release() to correctly check for the existence struct isert_device *device before checking for isert_device->use_frwr. Signed-off-by: Vu Pham <vu@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-10-24	raid5: avoid finding "discard" stripe	Shaohua Li	1	-0/+8
	SCSI discard will damage discard stripe bio setting, eg, some fields are changed. If the stripe is reused very soon, we have wrong bios setting. We remove discard stripe from hash list, so next time the strip will be fully initialized. Suitable for backport to 3.7+. Cc: <stable@vger.kernel.org> (3.7+) Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
2013-10-24	raid5: set bio bi_vcnt 0 for discard request	Shaohua Li	1	-0/+12
	SCSI layer will add new payload for discard request. If two bios are merged to one, the second bio has bi_vcnt 1 which is set in raid5. This will confuse SCSI and cause oops. Suitable for backport to 3.7+ Cc: stable@vger.kernel.org (v3.7+) Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
2013-10-24	md: avoid deadlock when md_set_badblocks.	Bian Yu	1	-2/+3
	When operate harddisk and hit errors, md_set_badblocks is called after scsi_restart_operations which already disabled the irq. but md_set_badblocks will call write_sequnlock_irq and enable irq. so softirq can preempt the current thread and that may cause a deadlock. I think this situation should use write_sequnlock_irqsave/irqrestore instead. I met the situation and the call trace is below: [ 638.919974] BUG: spinlock recursion on CPU#0, scsi_eh_13/1010 [ 638.921923] lock: 0xffff8800d4d51fc8, .magic: dead4ead, .owner: scsi_eh_13/1010, .owner_cpu: 0 [ 638.923890] CPU: 0 PID: 1010 Comm: scsi_eh_13 Not tainted 3.12.0-rc5+ #37 [ 638.925844] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS 4.6.5 03/05/2013 [ 638.927816] ffff880037ad4640 ffff880118c03d50 ffffffff8172ff85 0000000000000007 [ 638.929829] ffff8800d4d51fc8 ffff880118c03d70 ffffffff81730030 ffff8800d4d51fc8 [ 638.931848] ffffffff81a72eb0 ffff880118c03d90 ffffffff81730056 ffff8800d4d51fc8 [ 638.933884] Call Trace: [ 638.935867] <IRQ> [<ffffffff8172ff85>] dump_stack+0x55/0x76 [ 638.937878] [<ffffffff81730030>] spin_dump+0x8a/0x8f [ 638.939861] [<ffffffff81730056>] spin_bug+0x21/0x26 [ 638.941836] [<ffffffff81336de4>] do_raw_spin_lock+0xa4/0xc0 [ 638.943801] [<ffffffff8173f036>] _raw_spin_lock+0x66/0x80 [ 638.945747] [<ffffffff814a73ed>] ? scsi_device_unbusy+0x9d/0xd0 [ 638.947672] [<ffffffff8173fb1b>] ? _raw_spin_unlock+0x2b/0x50 [ 638.949595] [<ffffffff814a73ed>] scsi_device_unbusy+0x9d/0xd0 [ 638.951504] [<ffffffff8149ec47>] scsi_finish_command+0x37/0xe0 [ 638.953388] [<ffffffff814a75e8>] scsi_softirq_done+0xa8/0x140 [ 638.955248] [<ffffffff8130e32b>] blk_done_softirq+0x7b/0x90 [ 638.957116] [<ffffffff8104fddd>] __do_softirq+0xfd/0x330 [ 638.958987] [<ffffffff810b964f>] ? __lock_release+0x6f/0x100 [ 638.960861] [<ffffffff8174a5cc>] call_softirq+0x1c/0x30 [ 638.962724] [<ffffffff81004c7d>] do_softirq+0x8d/0xc0 [ 638.964565] [<ffffffff8105024e>] irq_exit+0x10e/0x150 [ 638.966390] [<ffffffff8174ad4a>] smp_apic_timer_interrupt+0x4a/0x60 [ 638.968223] [<ffffffff817499af>] apic_timer_interrupt+0x6f/0x80 [ 638.970079] <EOI> [<ffffffff810b964f>] ? __lock_release+0x6f/0x100 [ 638.971899] [<ffffffff8173fa6a>] ? _raw_spin_unlock_irq+0x3a/0x50 [ 638.973691] [<ffffffff8173fa60>] ? _raw_spin_unlock_irq+0x30/0x50 [ 638.975475] [<ffffffff81562393>] md_set_badblocks+0x1f3/0x4a0 [ 638.977243] [<ffffffff81566e07>] rdev_set_badblocks+0x27/0x80 [ 638.978988] [<ffffffffa00d97bb>] raid5_end_read_request+0x36b/0x4e0 [raid456] [ 638.980723] [<ffffffff811b5a1d>] bio_endio+0x1d/0x40 [ 638.982463] [<ffffffff81304ff3>] req_bio_endio.isra.65+0x83/0xa0 [ 638.984214] [<ffffffff81306b9f>] blk_update_request+0x7f/0x350 [ 638.985967] [<ffffffff81306ea1>] blk_update_bidi_request+0x31/0x90 [ 638.987710] [<ffffffff813085e0>] __blk_end_bidi_request+0x20/0x50 [ 638.989439] [<ffffffff8130862f>] __blk_end_request_all+0x1f/0x30 [ 638.991149] [<ffffffff81308746>] blk_peek_request+0x106/0x250 [ 638.992861] [<ffffffff814a62a9>] ? scsi_kill_request.isra.32+0xe9/0x130 [ 638.994561] [<ffffffff814a633a>] scsi_request_fn+0x4a/0x3d0 [ 638.996251] [<ffffffff813040a7>] __blk_run_queue+0x37/0x50 [ 638.997900] [<ffffffff813045af>] blk_run_queue+0x2f/0x50 [ 638.999553] [<ffffffff814a5750>] scsi_run_queue+0xe0/0x1c0 [ 639.001185] [<ffffffff814a7721>] scsi_run_host_queues+0x21/0x40 [ 639.002798] [<ffffffff814a2e87>] scsi_restart_operations+0x177/0x200 [ 639.004391] [<ffffffff814a4fe9>] scsi_error_handler+0xc9/0xe0 [ 639.005996] [<ffffffff814a4f20>] ? scsi_unjam_host+0xd0/0xd0 [ 639.007600] [<ffffffff81072f6b>] kthread+0xdb/0xe0 [ 639.009205] [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170 [ 639.010821] [<ffffffff81748cac>] ret_from_fork+0x7c/0xb0 [ 639.012437] [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170 This bug was introduce in commit 2e8ac30312973dd20e68073653 (the first time rdev_set_badblock was call from interrupt context), so this patch is appropriate for 3.5 and subsequent kernels. Cc: <stable@vger.kernel.org> (3.5+) Signed-off-by: Bian Yu <bianyu@kedacom.com> Reviewed-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2013-10-24	md: Fix skipping recovery for read-only arrays.	Lukasz Dorau	2	-0/+2
	Since: commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86 md: Allow devices to be re-added to a read-only array. spares are activated on a read-only array. In case of raid1 and raid10 personalities it causes that not-in-sync devices are marked in-sync without checking if recovery has been finished. If a read-only array is degraded and one of its devices is not in-sync (because the array has been only partially recovered) recovery will be skipped. This patch adds checking if recovery has been finished before marking a device in-sync for raid1 and raid10 personalities. In case of raid5 personality such condition is already present (at raid5.c:6029). Bug was introduced in 3.10 and causes data corruption. Cc: stable@vger.kernel.org Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
2013-10-23	perf tools: Show progress on histogram collapsing	Namhyung Kim	7	-8/+20
	It can take quite amount of time so add progress bar UI to inform user. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381468543-25334-4-git-send-email-namhyung@kernel.org [ perf_progress -> ui_progress ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-23	perf ui progress: Per progress bar state	Arnaldo Carvalho de Melo	5	-30/+47
	That will ease using a progress bar across multiple functions, like in the upcoming patches that will present a progress bar when collapsing histograms. Based on a previous patch by Namhyung Kim. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-cr7lq7ud9fj21bg7wvq27w1u@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>