summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel
AgeCommit message (Collapse)AuthorFilesLines
2011-04-06x86, hibernate: Initialize mmu_cr4_features during bootH. Peter Anvin1-0/+5
Restore the initialization of mmu_cr4_features during boot, which was removed without comment in checkin e5f15b45ddf3afa2bbbb10c7ea34fb32b6de0a0e x86: Cleanup highmap after brk is concluded thereby breaking resume from hibernate. This restores previous functionality in approximately the same place, and corrects the reading of %cr4 on pre-CPUID hardware (%cr4 exists if and only if CPUID is supported.) However, part of the problem is that the hibernate suspend/resume sequence should manage the save/restore of %cr4 explicitly. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <201104020154.57136.rjw@sisk.pl>
2011-03-31x86, UV: Fix kdump rebootCliff Wickman1-0/+9
After a crash dump on an SGI Altix UV system the crash kernel fails to cause a reboot. EFI mode is disabled in the kdump kernel, so only the reboot_type of BOOT_ACPI works. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: rja@sgi.com LKML-Reference: <E1Q5Iuo-00013b-UK@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-31x86, amd-nb: Rename CPU PCI id define for F4Borislav Petkov1-1/+1
With increasing number of PCI function ids, add the PCI function id in the define name instead of its symbolic name in the BKDG for more clarity. This renames function 4 define. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> LKML-Reference: <20110330183447.GA3668@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-29x86, mtrr, pat: Fix one cpu getting out of sync during resumeSuresh Siddha1-5/+15
On laptops with core i5/i7, there were reports that after resume graphics workloads were performing poorly on a specific AP, while the other cpu's were ok. This was observed on a 32bit kernel specifically. Debug showed that the PAT init was not happening on that AP during resume and hence it contributing to the poor workload performance on that cpu. On this system, resume flow looked like this: 1. BP starts the resume sequence and we reinit BP's MTRR's/PAT early on using mtrr_bp_restore() 2. Resume sequence brings all AP's online 3. Resume sequence now kicks off the MTRR reinit on all the AP's. 4. For some reason, between point 2 and 3, we moved from BP to one of the AP's. My guess is that printk() during resume sequence is contributing to this. We don't see similar behavior with the 64bit kernel but there is no guarantee that at this point the remaining resume sequence (after AP's bringup) has to happen on BP. 5. set_mtrr() was assuming that we are still on BP and skipped the MTRR/PAT init on that cpu (because of 1 above) 6. But we were on an AP and this led to not reprogramming PAT on this cpu leading to bad performance. Fix this by doing unconditional mtrr_if->set_all() in set_mtrr() during MTRR/PAT init. This might be unnecessary if we are still running on BP. But it is of no harm and will guarantee that after resume, all the cpu's will be in sync with respect to the MTRR/PAT registers. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1301438292-28370-1-git-send-email-eric@anholt.net> Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Keith Packard <keithp@keithp.com> Cc: stable@kernel.org [v2.6.32+] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-03-29x86, microcode: Unregister syscore_ops after microcode unloadedXiaotian Feng1-0/+1
Currently, microcode doesn't unregister syscore_ops after it's unloaded. So if we modprobe then rmmod microcode, the stale microcode syscore_ops info will stay on syscore_ops_list. Later when we're trying to reboot/halt/shutdown the machine, kernel will panic on syscore_shutdown(). With the patch applied, I can reboot/halt/shutdown my machine successfully. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Rafael J. Wysocki <rjw@sisk.pl> LKML-Reference: <1301387672-23661-1-git-send-email-dfeng@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-29x86: Stop including <linux/delay.h> in two asm header filesJean Delvare4-0/+4
Stop including <linux/delay.h> in x86 header files which don't need it. This will let the compiler complain when this header is not included by source files when it should, so that contributors can fix the problem before building on other architectures starts to fail. Credits go to Geert for the idea. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: James E.J. Bottomley <James.Bottomley@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> LKML-Reference: <20110325152014.297890ec@endymion.delvare> [ this also fixes an upstream build bug in drivers/media/rc/ite-cir.c ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-25Merge branch 'syscore' of ↵Linus Torvalds9-200/+116
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'syscore' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: Introduce ARCH_NO_SYSDEV_OPS config option (v2) cpufreq: Use syscore_ops for boot CPU suspend/resume (v2) KVM: Use syscore_ops instead of sysdev class and sysdev PCI / Intel IOMMU: Use syscore_ops instead of sysdev class and sysdev timekeeping: Use syscore_ops instead of sysdev class and sysdev x86: Use syscore_ops instead of sysdev classes and sysdevs
2011-03-25Merge branch 'for_linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: kdb: add usage string of 'per_cpu' command kgdb,x86_64: fix compile warning found with sparse kdb: code cleanup to use macro instead of value kgdboc,kgdbts: strlen() doesn't count the terminator
2011-03-25Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds2-3/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows perf symbols: Look at .dynsym again if .symtab not found perf build-id: Add quirk to deal with perf.data file format breakage perf session: Pass evsel in event_ops->sample() perf: Better fit max unprivileged mlock pages for tools needs perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx() perf top: Fix uninitialized 'counter' variable tracing: Fix set_ftrace_filter probe function display perf, x86: Fix Intel fixed counters base initialization
2011-03-25Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: futex: Fix WARN_ON() test for UP WARN_ON_SMP(): Allow use in if() statements on UP x86, dumpstack: Use %pB format specifier for stack trace vsprintf: Introduce %pB format specifier lockdep: Remove unused 'factor' variable from lockdep_stats_show()
2011-03-25Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2-8/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: DT: Cleanup namespace and call irq_set_irq_type() unconditional x86: DT: Fix return condition in irq_create_of_mapping() x86, mpparse: Move check_slot into CONFIG_X86_IO_APIC context
2011-03-25kgdb,x86_64: fix compile warning found with sparseJason Wessel1-2/+2
Fix sparse warning: arch/x86/kernel/kgdb.c:123:9: warning: switch with no cases Reported-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2011-03-25perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continueIngo Molnar1-2/+7
Eric Dumazet reported that hardware PMU events do not work on his system, due to the BIOS corrupting PMU state: Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only. [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c) Linus suggested that we continue in the face of such BIOS-induced CPU state corruption: http://lkml.org/lkml/2011/3/24/608 Such BIOSes will have to be fixed - Linux developers rely on a working and fully capable PMU and the BIOS interfering with the CPU's PMU state is simply not acceptable. So this patch changes perf to continue when it detects such BIOS interaction, some hardware events may be unreliable due to the BIOS writing and re-writing them - there's not much the kernel can do about that but to detect the corruption and report it. Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-24x86: DT: Cleanup namespace and call irq_set_irq_type() unconditionalThomas Gleixner1-3/+1
That call escaped the name space cleanup. Fix it up. We really want to call there. The chip might have changed since the irq was setup initially. So let the core code and the chip decide what to do. The status is just an unreliable snapshot. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2011-03-24x86: DT: Fix return condition in irq_create_of_mapping()Thomas Gleixner1-1/+1
The xlate() function returns 0 or a negative error code. Returning the error code blindly will be seen as an huge irq number by the calling function because irq_create_of_mapping() returns an unsigned value. Return 0 (NO_IRQ) as required. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2011-03-24perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflowsDon Zickus1-0/+1
The read of a proper MSR register was missed and instead of counter the configration register was tested (it has ARCH_P4_UNFLAGGED_BIT always cleared) leading to unknown NMI hitting the system. As result the user may obtain "Dazed and confused, but trying to continue" message. Fix it by reading a proper MSR register. When an NMI happens on a P4, the perf nmi handler checks the configuration register to see if the overflow bit is set or not before taking appropriate action. Unfortunately, various P4 machines had a broken overflow bit, so a backup mechanism was implemented. This mechanism checked to see if the counter rolled over or not. A previous commit that implemented this backup mechanism was broken. Instead of reading the counter register, it used the configuration register to determine if the counter rolled over or not. Reading that bit would give incorrect results. This would lead to 'Dazed and confused' messages for the end user when using the perf tool (or if the nmi watchdog is running). The fix is to read the counter register before determining if the counter rolled over or not. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <4D8BAB49.3080701@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-24Merge branch 'release' of ↵Linus Torvalds3-25/+31
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (42 commits) ACPI: minor printk format change in acpi_pad ACPI: make acpi_pad /sys output more readable ACPICA: Update version to 20110316 ACPICA: Header support for SLIC table ACPI: Make sure the FADT is at least rev 2 before using the reset register ACPI: Bug compatibility for Windows on the ACPI reboot vector ACPICA: Fix access width for reset vector ACPI battery: fribble sysfs files from a resume notifier ACPI button: remove unused procfs I/F ACPI, APEI, Add PCIe AER error information printing support PCIe, AER, use pre-generated prefix in error information printing ACPI, APEI, Add ERST record ID cache ACPI: Use syscore_ops instead of sysdev class and sysdev ACPI: Remove the unused EC sysdev class ACPI: use __cpuinit for the acpi_processor_set_pdc() call tree ACPI: use __init where possible in processor driver Thermal_Framework-Fix_crash_during_hwmon_unregister ACPICA: Update version to 20110211. ACPICA: Add mechanism to defer _REG methods for some installed handlers ACPICA: Add support for FunctionalFixedHW in acpi_ut_get_region_name ...
2011-03-24x86, dumpstack: Use %pB format specifier for stack traceNamhyung Kim1-1/+1
Improve noreturn function entries in call traces: Before: Call Trace: [<ffffffff812a8502>] panic+0x8c/0x18d [<ffffffffa000012a>] deep01+0x0/0x38 [test_panic] <--- bad [<ffffffff81104666>] proc_file_write+0x73/0x8d [<ffffffff811000b3>] proc_reg_write+0x8d/0xac [<ffffffff810c7d32>] vfs_write+0xa1/0xc5 [<ffffffff810c7e0f>] sys_write+0x45/0x6c [<ffffffff8f02943b>] system_call_fastpath+0x16/0x1b After: Call Trace: [<ffffffff812bce69>] panic+0x8c/0x18d [<ffffffffa000012a>] panic_write+0x20/0x20 [test_panic] <--- good [<ffffffff81115fab>] proc_file_write+0x73/0x8d [<ffffffff81111a5f>] proc_reg_write+0x8d/0xac [<ffffffff810d90ee>] vfs_write+0xa1/0xc5 [<ffffffff810d91cb>] sys_write+0x45/0x6c [<ffffffff812c07fb>] system_call_fastpath+0x16/0x1b Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1300934550-21394-2-git-send-email-namhyung@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-23Merge branch 'for-linus' of ↵Linus Torvalds1-0/+8
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: deal with races in /proc/*/{syscall,stack,personality} proc: enable writing to /proc/pid/mem proc: make check_mem_permission() return an mm_struct on success proc: hold cred_guard_mutex in check_mem_permission() proc: disable mem_write after exec mm: implement access_remote_vm mm: factor out main logic of access_process_vm mm: use mm_struct to resolve gate vma's in __get_user_pages mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm mm: arch: make in_gate_area take an mm_struct instead of a task_struct mm: arch: make get_gate_vma take an mm_struct instead of a task_struct x86: mark associated mm when running a task in 32 bit compatibility mode x86: add context tag to mark mm when running a task in 32-bit compatibility mode auxv: require the target to be tracable (or yourself) close race in /proc/*/environ report errors in /proc/*/*map* sanely pagemap: close races with suid execve make sessionid permissions in /proc/*/task/* match those in /proc/* fix leaks in path_lookupat() Fix up trivial conflicts in fs/proc/base.c
2011-03-23crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, ↵Olaf Hering4-28/+1
setup_elfcorehdr and saved_max_pfn The Xen PV drivers in a crashed HVM guest can not connect to the dom0 backend drivers because both frontend and backend drivers are still in connected state. To run the connection reset function only in case of a crashdump, the is_kdump_kernel() function needs to be available for the PV driver modules. Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel() usable for modules. Leave 'elfcorehdr' as early_param(). This changes powerpc from __setup() to early_param(). It adds an address range check from x86 also on ia64 and powerpc. [akpm@linux-foundation.org: additional #includes] [akpm@linux-foundation.org: remove elfcorehdr_addr export] [akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes] Signed-off-by: Olaf Hering <olaf@aepfle.de> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23x86: Use syscore_ops instead of sysdev classes and sysdevsRafael J. Wysocki9-200/+116
Some subsystems in the x86 tree need to carry out suspend/resume and shutdown operations with one CPU on-line and interrupts disabled and they define sysdev classes and sysdevs or sysdev drivers for this purpose. This leads to unnecessarily complicated code and excessive memory usage, so switch them to using struct syscore_ops objects for this purpose instead. Generally, there are three categories of subsystems that use sysdevs for implementing PM operations: (1) subsystems whose suspend/resume callbacks ignore their arguments entirely (the majority), (2) subsystems whose suspend/resume callbacks use their struct sys_device argument, but don't really need to do that, because they can be implemented differently in an arguably simpler way (io_apic.c), and (3) subsystems whose suspend/resume callbacks use their struct sys_device argument, but the value of that argument is always the same and could be ignored (microcode_core.c). In all of these cases the subsystems in question may be readily converted to using struct syscore_ops objects for power management and shutdown. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu>
2011-03-23x86: mark associated mm when running a task in 32 bit compatibility modeStephen Wilson1-0/+8
This patch simply follows the same practice as for setting the TIF_IA32 flag. In particular, an mm is marked as holding 32-bit tasks when a 32-bit binary is exec'ed. Both ELF and a.out formats are updated. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23Merge branch 'linus' into releaseLen Brown98-1508/+2548
Conflicts: arch/x86/kernel/acpi/sleep.c Signed-off-by: Len Brown <len.brown@intel.com>
2011-03-23Merge commit 'v2.6.38' into releaseLen Brown4-11/+18
2011-03-22x86: only compile 8237A if CONFIG_ISA_DMA_API is enabledDavid Rientjes1-1/+2
8237A utilizes the interface provided by CONFIG_ISA_DMA_API, specifically claim_dma_lock() and release_dma_lock(). Thus, there's a strict dependency on the config option and the module should only be loaded if the kernel supports ISA-style DMA. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22move x86 specific oops=panic to generic codeOlaf Hering1-10/+0
The oops=panic cmdline option is not x86 specific, move it to generic code. Update documentation. Signed-off-by: Olaf Hering <olaf@aepfle.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2-25/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: xen: update mask_rw_pte after kernel page tables init changes xen: set max_pfn_mapped to the last pfn mapped x86: Cleanup highmap after brk is concluded Fix up trivial onflict (added header file includes) in arch/x86/mm/init_64.c
2011-03-22x86, mpparse: Move check_slot into CONFIG_X86_IO_APIC contextRakib Mullick1-4/+4
When CONFIG_X86_MPPARSE=y and CONFIG_X86_IO_APIC=n, then we get the following warning: arch/x86/kernel/mpparse.c:723: warning: 'check_slot' defined but not used So, put check_slot into CONFIG_X86_IO_APIC context. Its only called from CONFIG_X86_IO_APIC=y context. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> LKML-Reference: <AANLkTinsUfGc=NG_GeH_B+zFVu+DXJzZbJKdQLscqfuH@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-22Merge branch 'apei-release' into releaseLen Brown1-16/+26
2011-03-21ACPI, APEI, Add ERST record ID cacheHuang Ying1-16/+26
APEI ERST firmware interface and implementation has no multiple users in mind. For example, if there is four records in storage with ID: 1, 2, 3 and 4, if two ERST readers enumerate the records via GET_NEXT_RECORD_ID as follow, reader 1 reader 2 1 2 3 4 -1 -1 where -1 signals there is no more record ID. Reader 1 has no chance to check record 2 and 4, while reader 2 has no chance to check record 1 and 3. And any other GET_NEXT_RECORD_ID will return -1, that is, other readers will has no chance to check any record even they are not cleared by anyone. This makes raw GET_NEXT_RECORD_ID not suitable for used by multiple users. To solve the issue, an in-memory ERST record ID cache is designed and implemented. When enumerating record ID, the ID returned by GET_NEXT_RECORD_ID is added into cache in addition to be returned to caller. So other readers can check the cache to get all record ID available. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2011-03-21introduce sys_syncfs to sync a single file systemSage Weil1-0/+1
It is frequently useful to sync a single file system, instead of all mounted file systems via sync(2): - On machines with many mounts, it is not at all uncommon for some of them to hang (e.g. unresponsive NFS server). sync(2) will get stuck on those and may never get to the one you do care about (e.g., /). - Some applications write lots of data to the file system and then want to make sure it is flushed to disk. Calling fsync(2) on each file introduces unnecessary ordering constraints that result in a large amount of sub-optimal writeback/flush/commit behavior by the file system. There are currently two ways (that I know of) to sync a single super_block: - BLKFLSBUF ioctl on the block device: That also invalidates the bdev mapping, which isn't usually desirable, and doesn't work for non-block file systems. - 'mount -o remount,rw' will call sync_filesystem as an artifact of the current implemention. Relying on this little-known side effect for something like data safety sounds foolish. Both of these approaches require root privileges, which some applications do not have (nor should they need?) given that sync(2) is an unprivileged operation. This patch introduces a new system call syncfs(2) that takes an fd and syncs only the file system it references. Maybe someday we can $ sync /some/path and not get sync: ignoring all arguments The syscall is motivated by comments by Al and Christoph at the last LSF. syncfs(2) seems like an appropriate name given statfs(2). A similar ioctl was also proposed a while back, see http://marc.info/?l=linux-fsdevel&m=127970513829285&w=2 Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-19x86: Cleanup highmap after brk is concludedYinghai Lu2-25/+3
Now cleanup_highmap actually is in two steps: one is early in head64.c and only clears above _end; a second one is in init_memory_mapping() and tries to clean from _brk_end to _end. It should check if those boundaries are PMD_SIZE aligned but currently does not. Also init_memory_mapping() is called several times for numa or memory hotplug, so we really should not handle initial kernel mappings there. This patch moves cleanup_highmap() down after _brk_end is settled so we can do everything in one step. Also we honor max_pfn_mapped in the implementation of cleanup_highmap. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2011-03-19perf, x86: Fix Intel fixed counters base initializationStephane Eranian1-1/+1
The following patch solves the problems introduced by Robert's commit 41bf498 and reported by Arun Sharma. This commit gets rid of the base + index notation for reading and writing PMU msrs. The problem is that for fixed counters, the new calculation for the base did not take into account the fixed counter indexes, thus all fixed counters were read/written from fixed counter 0. Although all fixed counters share the same config MSR, they each have their own counter register. Without: $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds 242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892) 2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892) 49473 baclears (0.00% scaling, ena=1000790892, run=1000790892) With: $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds 2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809) 2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809) 47863 baclears (0.00% scaling, ena=1000840809, run=1000840809) Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: ming.m.lin@intel.com Cc: robert.richter@amd.com Cc: asharma@fb.com Cc: perfmon2-devel@lists.sf.net LKML-Reference: <20110319172005.GB4978@quad> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-18Merge branch 'acpica' into releaseLen Brown4-14/+26
2011-03-18Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds30-60/+63
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Flush TLB if PGD entry is changed in i386 PAE mode x86, dumpstack: Correct stack dump info when frame pointer is available x86: Clean up csum-copy_64.S a bit x86: Fix common misspellings x86: Fix misspelling and align params x86: Use PentiumPro-optimized partial_csum() on VIA C7
2011-03-18Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds2-67/+47
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) trace, filters: Initialize the match variable in process_ops() properly trace, documentation: Fix branch profiling location in debugfs oprofile, s390: Cleanups oprofile, s390: Remove hwsampler_files.c and merge it into init.c perf: Fix tear-down of inherited group events perf: Reorder & optimize perf_event_context to remove alignment padding on 64 bit builds perf: Handle stopped state with tracepoints perf: Fix the software events state check perf, powerpc: Handle events that raise an exception without overflowing perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints perf, x86: Clean up SandyBridge PEBS events perf lock: Fix sorting by wait_min perf tools: Version incorrect with some versions of grep perf evlist: New command to list the names of events present in a perf.data file perf script: Add support for H/W and S/W events perf script: Add support for dumping symbols perf script: Support custom field selection for output perf script: Move printing of 'common' data from print_event and rename perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse perf script: Change process_event prototype ...
2011-03-18Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits) doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore Update cpuset info & webiste for cgroups dcdbas: force SMI to happen when expected arch/arm/Kconfig: remove one to many l's in the word. asm-generic/user.h: Fix spelling in comment drm: fix printk typo 'sracth' Remove one to many n's in a word Documentation/filesystems/romfs.txt: fixing link to genromfs drivers:scsi Change printk typo initate -> initiate serial, pch uart: Remove duplicate inclusion of linux/pci.h header fs/eventpoll.c: fix spelling mm: Fix out-of-date comments which refers non-existent functions drm: Fix printk typo 'failled' coh901318.c: Change initate to initiate. mbox-db5500.c Change initate to initiate. edac: correct i82975x error-info reported edac: correct i82975x mci initialisation edac: correct commented info fs: update comments to point correct document target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c ... Trivial conflict in fs/eventpoll.c (spelling vs addition)
2011-03-18x86, dumpstack: Correct stack dump info when frame pointer is availableNamhyung Kim6-25/+28
Current stack dump code scans entire stack and check each entry contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it could mark whether the pointer is valid or not based on value of the frame pointer. Invalid entries could be preceded by '?' sign. However this was not going to happen because scan start point was always higher than the frame pointer so that they could not meet. Commit 9c0729dc8062 ("x86: Eliminate bp argument from the stack tracing routines") delayed bp acquisition point, so the bp was read in lower frame, thus all of the entries were marked invalid. This patch fixes this by reverting above commit while retaining stack_frame() helper as suggested by Frederic Weisbecker. End result looks like below: before: [ 3.508329] Call Trace: [ 3.508551] [<ffffffff814f35c9>] ? panic+0x91/0x199 [ 3.508662] [<ffffffff814f3739>] ? printk+0x68/0x6a [ 3.508770] [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e [ 3.508876] [<ffffffff81a9821f>] ? mount_root+0x56/0x5a [ 3.508975] [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9 [ 3.509216] [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2 [ 3.509335] [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10 [ 3.509442] [<ffffffff814f6880>] ? restore_args+0x0/0x30 [ 3.509542] [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2 [ 3.509641] [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10 after: [ 3.522991] Call Trace: [ 3.523351] [<ffffffff814f35b9>] panic+0x91/0x199 [ 3.523468] [<ffffffff814f3729>] ? printk+0x68/0x6a [ 3.523576] [<ffffffff81a981b2>] mount_block_root+0x257/0x26e [ 3.523681] [<ffffffff81a9821f>] mount_root+0x56/0x5a [ 3.523780] [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9 [ 3.523885] [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2 [ 3.523987] [<ffffffff81003894>] kernel_thread_helper+0x4/0x10 [ 3.524228] [<ffffffff814f6880>] ? restore_args+0x0/0x30 [ 3.524345] [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2 [ 3.524445] [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10 -v5: * fix build breakage with oprofile -v4: * use 0 instead of regs->bp * separate out printk changes -v3: * apply comment from Frederic * add a couple of printk fixes Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Soren Sandmann <ssp@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-18x86: Fix common misspellingsLucas De Marchi25-35/+35
They were generated by 'codespell' and then manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: trivial@kernel.org LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-18Merge branch 'linus' into x86/urgentIngo Molnar58-1132/+2061
Merge reason: Merge upstream commits to avoid conflicts in upcoming patches. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-17Merge branch 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+1
* 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (55 commits) KVM: unbreak userspace that does not sets tss address KVM: MMU: cleanup pte write path KVM: MMU: introduce a common function to get no-dirty-logged slot KVM: fix rcu usage in init_rmode_* functions KVM: fix kvmclock regression due to missing clock update KVM: emulator: Fix permission checking in io permission bitmap KVM: emulator: Fix io permission checking for 64bit guest KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n KVM: x86: Remove useless regs_page pointer from kvm_lapic KVM: improve comment on rcu use in irqfd_deassign KVM: MMU: remove unused macros KVM: MMU: cleanup page alloc and free KVM: MMU: do not record gfn in kvm_mmu_pte_write KVM: MMU: move mmu pages calculated out of mmu lock KVM: MMU: set spte accessed bit properly KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits KVM: Start lock documentation KVM: better readability of efer_reserved_bits KVM: Clear async page fault hash after switching to real mode KVM: VMX: Initialize vm86 TSS only once. ...
2011-03-17Merge branch 'next' of ↵Linus Torvalds2-4/+1
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] pcc-cpufreq: remove duplicate statements [CPUFREQ] Remove the pm_message_t argument from driver suspend [CPUFREQ] Remove unneeded locks [CPUFREQ] Remove old, deprecated per cpu ondemand/conservative sysfs files [CPUFREQ] Remove deprecated sysfs file sampling_rate_max [CPUFREQ] powernow-k8: The table index is not worth displaying [CPUFREQ] calculate delay after dbs_check_cpu [CPUFREQ] Add documentation for sampling_down_factor [CPUFREQ] drivers/cpufreq: Remove unnecessary semicolons
2011-03-17Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds1-1/+1
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (38 commits) amd64_edac: Fix decode_syndrome types amd64_edac: Fix DCT argument type amd64_edac: Fix ranges signedness amd64_edac: Drop local variable amd64_edac: Fix PCI config addressing types amd64_edac: Fix DRAM base macros amd64_edac: Fix node id signedness amd64_edac: Drop redundant declarations amd64_edac: Enable driver on F15h amd64_edac: Adjust ECC symbol size to F15h amd64_edac: Simplify scrubrate setting PCI: Rename CPU PCI id define amd64_edac: Improve DRAM address mapping amd64_edac: Sanitize ->read_dram_ctl_register amd64_edac: Adjust sys_addr to chip select conversion routine to F15h amd64_edac: Beef up early exit reporting amd64_edac: Revamp online spare handling amd64_edac: Fix channel interleave removal amd64_edac: Correct node interleaving removal amd64_edac: Add support for interleaved region swapping ... Fix up trivial conflict in include/linux/pci_ids.h due to AMD_15H_NB_MISC being renamed as AMD_15H_NB_F3 next to the new AMD_15H_NB_LINK entry.
2011-03-17KVM guest: Fix section mismatch derived from kvm_guest_cpu_online()Sedat Dilek1-1/+1
WARNING: arch/x86/built-in.o(.text+0x1bb74): Section mismatch in reference from the function kvm_guest_cpu_online() to the function .cpuinit.text:kvm_guest_cpu_init() The function kvm_guest_cpu_online() references the function __cpuinit kvm_guest_cpu_init(). This is often because kvm_guest_cpu_online lacks a __cpuinit annotation or the annotation of kvm_guest_cpu_init is wrong. This patch fixes the warning. Tested with linux-next (next-20101231) Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17PCI: Rename CPU PCI id defineBorislav Petkov1-1/+1
With increasing number of PCI function ids, add the PCI function id in the define name instead of its symbolic name in the BKDG for more clarity. Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-16[CPUFREQ] pcc-cpufreq: remove duplicate statementsChumbalkar, Nagananda1-2/+0
Remove a couple of assigment statements that appear twice. Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Signed-off-by: Dave Jones <davej@redhat.com>
2011-03-16[CPUFREQ] powernow-k8: The table index is not worth displayingThomas Renninger1-2/+1
and it also is misleading due to another message above which makes the index look like it is the CPU. https://bugzilla.kernel.org/show_bug.cgi?id=24562 Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com> CC: cpufreq@vger.kernel.org
2011-03-16Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds4-10/+26
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, AMD: Set ARAT feature on AMD processors x86, quirk: Fix SB600 revision check x86: stop_machine_text_poke() should issue sync_core() x86, amd-nb: Misc cleanliness fixes
2011-03-16Merge branch 'x86-trampoline-for-linus' of ↵Linus Torvalds19-258/+278
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix binutils-2.21 symbol related build failures x86-64, trampoline: Remove unused variable x86, reboot: Fix the use of passed arguments in 32-bit BIOS reboot x86, reboot: Move the real-mode reboot code to an assembly file x86: Make the GDT_ENTRY() macro in <asm/segment.h> safe for assembly x86, trampoline: Use the unified trampoline setup for ACPI wakeup x86, trampoline: Common infrastructure for low memory trampolines Fix up trivial conflicts in arch/x86/kernel/Makefile
2011-03-16Merge branch 'for-linus' of ↵Linus Torvalds1-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (21 commits) PM / Hibernate: Reduce autotuned default image size PM / Core: Introduce struct syscore_ops for core subsystems PM PM QoS: Make pm_qos settings readable PM / OPP: opp_find_freq_exact() documentation fix PM: Documentation/power/states.txt: fix repetition PM: Make system-wide PM and runtime PM treat subsystems consistently PM: Simplify kernel/power/Kconfig PM: Add support for device power domains PM: Drop pm_flags that is not necessary PM: Allow pm_runtime_suspend() to succeed during system suspend PM: Clean up PM_TRACE dependencies and drop unnecessary Kconfig option PM: Remove CONFIG_PM_OPS PM: Reorder power management Kconfig options PM: Make CONFIG_PM depend on (CONFIG_PM_SLEEP || CONFIG_PM_RUNTIME) PM / ACPI: Remove references to pm_flags from bus.c PM: Do not create wakeup sysfs files for devices that cannot wake up USB / Hub: Do not call device_set_wakeup_capable() under spinlock PM: Use appropriate printk() priority level in trace.c PM / Wakeup: Don't update events_check_enabled in pm_get_wakeup_count() PM / Wakeup: Make pm_save_wakeup_count() work as documented ...