summaryrefslogtreecommitdiffstats
path: root/arch/x86
AgeCommit message (Collapse)AuthorFilesLines
2013-01-29x86, boot: Pass cmd_line_ptr with unsigned long insteadYinghai Lu2-6/+6
boot/compressed/misc.c is used for bzImage in 64bit and 32bit, and cmd_line_ptr could point to buffer that is above 4g, cmd_line_ptr should be 64bit otherwise high 32bit will be capped out. So need to change data type to unsigned long, that will be 64bit get correct address of command line buffer. And it is still ok with 32bit bzImage, because unsigned long on 32bit kernel is still 32bit. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-19-git-send-email-yinghai@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, boot: Move checking of cmd_line_ptr out of common pathYinghai Lu2-6/+16
cmdline.c::__cmdline_find_option... are shared between 16-bit setup code and 32/64 bit decompressor code. for 32/64 only path via kexec, we should not check if ptr is less 1M. as those cmdline could be put above 1M, or even 4G. Move out accessible checking out of __cmdline_find_option() So decompressor in misc.c can parse cmdline correctly. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-18-git-send-email-yinghai@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, boot: Add get_cmd_line_ptr()Yinghai Lu2-4/+19
Add an accessor function for the command line address. Later we will add support for holding a 64-bit address via ext_cmd_line_ptr. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-17-git-send-email-yinghai@kernel.org Cc: Gokul Caushik <caushik1@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joe Millenbach <jmillenbach@gmail.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86: Add get_ramdisk_image/size()Yinghai Lu1-8/+21
There are several places to find ramdisk information early for reserving and relocating. Use accessor functions to make code more readable and consistent. Later will add ext_ramdisk_image/size in those functions to support loading ramdisk above 4g. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-16-git-send-email-yinghai@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86: Merge early_reserve_initrd for 32bit and 64bitYinghai Lu3-26/+18
They are the same, could move them out from head32/64.c to setup.c. We are using memblock, and it could handle overlapping properly, so we don't need to reserve some at first to hold the location, and just need to make sure we reserve them before we are using memblock to find free mem to use. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-15-git-send-email-yinghai@kernel.org Reviewed-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, 64bit: Don't set max_pfn_mapped wrong value early on native pathYinghai Lu3-4/+11
We are not having max_pfn_mapped set correctly until init_memory_mapping. So don't print its initial value for 64bit Also need to use KERNEL_IMAGE_SIZE directly for highmap cleanup. -v2: update comments about max_pfn_mapped according to Stefano Stabellini. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-14-git-send-email-yinghai@kernel.org Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, 64bit: #PF handler set page to cover only 2M per #PFYinghai Lu1-17/+25
We only map a single 2 MiB page per #PF, even though we should be able to do this a full gigabyte at a time with no additional memory cost. This is a workaround for a broken AMD reference BIOS (and its derivatives in shipping system) which maps a large chunk of memory as WB in the MTRR system but will #MC if the processor wanders off and tries to prefetch that memory, which can happen any time the memory is mapped in the TLB. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-13-git-send-email-yinghai@kernel.org Cc: Alexander Duyck <alexander.h.duyck@intel.com> [ hpa: rewrote the patch description ] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, 64bit: Use a #PF handler to materialize early mappings on demandH. Peter Anvin7-91/+219
Linear mode (CR0.PG = 0) is mutually exclusive with 64-bit mode; all 64-bit code has to use page tables. This makes it awkward before we have first set up properly all-covering page tables to access objects that are outside the static kernel range. So far we have dealt with that simply by mapping a fixed amount of low memory, but that fails in at least two upcoming use cases: 1. We will support load and run kernel, struct boot_params, ramdisk, command line, etc. above the 4 GiB mark. 2. need to access ramdisk early to get microcode to update that as early possible. We could use early_iomap to access them too, but it will make code to messy and hard to be unified with 32 bit. Hence, set up a #PF table and use a fixed number of buffers to set up page tables on demand. If the buffers fill up then we simply flush them and start over. These buffers are all in __initdata, so it does not increase RAM usage at runtime. Thus, with the help of the #PF handler, we can set the final kernel mapping from blank, and switch to init_level4_pgt later. During the switchover in head_64.S, before #PF handler is available, we use three pages to handle kernel crossing 1G, 512G boundaries with sharing page by playing games with page aliasing: the same page is mapped twice in the higher-level tables with appropriate wraparound. The kernel region itself will be properly mapped; other mappings may be spurious. early_make_pgtable is using kernel high mapping address to access pages to set page table. -v4: Add phys_base offset to make kexec happy, and add init_mapping_kernel() - Yinghai -v5: fix compiling with xen, and add back ident level3 and level2 for xen also move back init_level4_pgt from BSS to DATA again. because we have to clear it anyway. - Yinghai -v6: switch to init_level4_pgt in init_mem_mapping. - Yinghai -v7: remove not needed clear_page for init_level4_page it is with fill 512,8,0 already in head_64.S - Yinghai -v8: we need to keep that handler alive until init_mem_mapping and don't let early_trap_init to trash that early #PF handler. So split early_trap_pf_init out and move it down. - Yinghai -v9: switchover only cover kernel space instead of 1G so could avoid touch possible mem holes. - Yinghai -v11: change far jmp back to far return to initial_code, that is needed to fix failure that is reported by Konrad on AMD systems. - Yinghai Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-12-git-send-email-yinghai@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, realmode: Separate real_mode reserve and setupYinghai Lu3-14/+25
After we switch to use #PF handler help to set page table, init_level4_pgt will only have entries set after init_mem_mapping(). We need to move copying init_level4_pgt to trampoline_pgd after that. So split reserve and setup, and move the setup after init_mem_mapping() Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-11-git-send-email-yinghai@kernel.org Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Acked-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, 64bit, realmode: Use init_level4_pgt to set trampoline_pgd directlyYinghai Lu1-2/+2
with #PF handler way to set early page table, level3_ident will go away with 64bit native path. So just use entries in init_level4_pgt to set them in trampoline_pgd. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-10-git-send-email-yinghai@kernel.org Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Acked-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, 64bit: Copy struct boot_params earlyYinghai Lu1-1/+5
We want to support struct boot_params (formerly known as the zero-page, or real-mode data) above the 4 GiB mark. We will have #PF handler to set page table for not accessible ram early, but want to limit it before x86_64_start_reservations to limit the code change to native path only. Also we will need the ramdisk info in struct boot_params to access the microcode blob in ramdisk in x86_64_start_kernel, so copy struct boot_params early makes it accessing ramdisk info simple. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-9-git-send-email-yinghai@kernel.org Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, 64bit, mm: Add generic kernel/ident mapping helperYinghai Lu2-0/+83
It is simple version for kernel_physical_mapping_init. it will work to build one page table that will be used later. Use mapping_info to control 1. alloc_pg_page method 2. if PMD is EXEC, 3. if pgd is with kernel low mapping or ident mapping. Will use to replace some local versions in kexec, hibernation and etc. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-8-git-send-email-yinghai@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, realmode: Set real_mode permissions earlyYinghai Lu1-5/+6
Trampoline code is executed by APs with kernel low mapping on 64bit. We need to set trampoline code to EXEC early before we boot APs. Found the problem after switching to #PF handler set page table, and we do not set initial kernel low mapping with EXEC anymore in arch/x86/kernel/head_64.S. Change to use early_initcall instead that will make sure trampoline will have EXEC set. -v2: Merge two comments according to Borislav Petkov <bp@alien8.de> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-7-git-send-email-yinghai@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, 64bit, mm: Make pgd next calculation consistent with pud/pmdYinghai Lu1-4/+2
Just like the way we calculate next for pud and pmd, aka round down and add size. Also, do not do boundary-checking with 'next', and just pass 'end' down to phys_pud_init() instead. Because the loop in phys_pud_init() stops at PTRS_PER_PUD and thus can handle a possibly bigger 'end' properly. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-6-git-send-email-yinghai@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86: Factor out e820_add_kernel_range()Yinghai Lu1-14/+22
Separate out the reservation of the kernel static memory areas into a separate function. Also add support for case when memmap=xxM$yyM is used without exactmap. Need to remove reserved range at first before we add E820_RAM range, otherwise added E820_RAM range will be ignored. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-5-git-send-email-yinghai@kernel.org Cc: Jacob Shin <jacob.shin@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, mm: Fix page table early allocation offset checkingYinghai Lu1-1/+12
During debugging loading kernel above 4G, found that one page is not used in pre-allocated BRK area for early page allocation. pgt_buf_top is address that can not be used, so should check if that new end is above that top, otherwise last page will not be used. Fix that checking and also add print out for allocation from pre-allocated BRK area to catch possible bugs later. But after we get back that page for pgt, it tiggers one bug in pgt allocation with xen: We need to avoid to use page as pgt to map range that is overlapping with that pgt page. Add checking about overlapping, when it happens, use memblock allocation instead. That fixes crash on Xen PV guest with 2G that Stefan found. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-2-git-send-email-yinghai@kernel.org Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29Merge remote-tracking branch 'origin/x86/boot' into x86/mm2H. Peter Anvin275-4042/+8321
Coming patches to x86/mm2 require the changes and advanced baseline in x86/boot. Resolved Conflicts: arch/x86/kernel/setup.c mm/nobootmem.c Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, boot: Sanitize boot_params if not zeroed on creationH. Peter Anvin5-0/+46
Use the new sentinel field to detect bootloaders which fail to follow protocol and don't initialize fields in struct boot_params that they do not explicitly initialize to zero. Based on an original patch and research by Yinghai Lu. Changed by hpa to be invoked both in the decompression path and in the kernel proper; the latter for the case where a bootloader takes over decompression. Originally-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-26-git-send-email-yinghai@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-27x86, boot: Define the 2.12 bzImage boot protocolH. Peter Anvin3-28/+76
Define the 2.12 bzImage boot protocol: add xloadflags and additional fields to allow the command line, initramfs and struct boot_params to live above the 4 GiB mark. The xloadflags now communicates if this is a 64-bit kernel with the legacy 64-bit entry point and which of the EFI handover entry points are supported. Avoid adding new read flags to loadflags because of claimed bootloaders testing the whole byte for == 1 to determine bzImageness at least until the issue can be researched further. This is based on patches by Yinghai Lu and David Woodhouse. Originally-by: Yinghai Lu <yinghai@kernel.org> Originally-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1359058816-7615-26-git-send-email-yinghai@kernel.org Cc: Rob Landley <rob@landley.net> Cc: Gokul Caushik <caushik1@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joe Millenbach <jmillenbach@gmail.com>
2013-01-27x86/boot: Fix minor fd leakage in tools/relocs.cCong Ding1-2/+4
The opened file should be closed. Signed-off-by: Cong Ding <dinggnu@gmail.com> Cc: Kusanagi Kouichi <slash@ac.auone-net.jp> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Matt Fleming <matt.fleming@intel.com> Link: http://lkml.kernel.org/r/1358183628-27784-1-git-send-email-dinggnu@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-22Merge tag 'perf-urgent-for-mingo' of ↵Linus Torvalds1-6/+0
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf/urgent fixes from Arnaldo Carvalho de Melo: . revert 20b279 - require exclude_guest to use PEBS - kernel side, now older binaries will continue working for things like cycles:pp without needing to pass extra modifiers, from David Ahern. . Fix building from 'make perf-*-src-pkg' tarballs, broken by UAPI, from Sebastian Andrzej Siewior [ Pulling directly, Ingo would normally pull but has been unresponsive ] * tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf tools: Fix building from 'make perf-*-src-pkg' tarballs perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel side
2013-01-22ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILLOleg Nesterov1-4/+5
putreg() assumes that the tracee is not running and pt_regs_access() can safely play with its stack. However a killed tracee can return from ptrace_stop() to the low-level asm code and do RESTORE_REST, this means that debugger can actually read/modify the kernel stack until the tracee does SAVE_REST again. set_task_blockstep() can race with SIGKILL too and in some sense this race is even worse, the very fact the tracee can be woken up breaks the logic. As Linus suggested we can clear TASK_WAKEKILL around the arch_ptrace() call, this ensures that nobody can ever wakeup the tracee while the debugger looks at it. Not only this fixes the mentioned problems, we can do some cleanups/simplifications in arch_ptrace() paths. Probably ptrace_unfreeze_traced() needs more callers, for example it makes sense to make the tracee killable for oom-killer before access_process_vm(). While at it, add the comment into may_ptrace_stop() to explain why ptrace_stop() still can't rely on SIGKILL and signal_pending_state(). Reported-by: Salman Qazi <sqazi@google.com> Reported-by: Suleiman Souhlal <suleiman@google.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-18Merge tag 'stable/for-linus-3.8-rc3-tag' of ↵Linus Torvalds2-8/+0
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen fixes from Konrad Rzeszutek Wilk: - CVE-2013-0190/XSA-40 (or stack corruption for 32-bit PV kernels) - Fix racy vma access spotted by Al Viro - Fix mmap batch ioctl potentially resulting in large O(n) page allcations. - Fix vcpu online/offline BUG:scheduling while atomic.. - Fix unbound buffer scanning for more than 32 vCPUs. - Fix grant table being incorrectly initialized - Fix incorrect check in pciback - Allow privcmd in backend domains. Fix up whitespace conflict due to ugly merge resolution in Xen tree in arch/arm/xen/enlighten.c * tag 'stable/for-linus-3.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests. Revert "xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic." xen/gntdev: remove erronous use of copy_to_user xen/gntdev: correctly unmap unlinked maps in mmu notifier xen/gntdev: fix unsafe vma access xen/privcmd: Fix mmap batch ioctl. Xen: properly bound buffer access when parsing cpu/*/availability xen/grant-table: correctly initialize grant table version 1 x86/xen : Fix the wrong check in pciback xen/privcmd: Relax access control in privcmd_ioctl_mmap
2013-01-16xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests.Andrew Cooper1-1/+0
This fixes CVE-2013-0190 / XSA-40 There has been an error on the xen_failsafe_callback path for failed iret, which causes the stack pointer to be wrong when entering the iret_exc error path. This can result in the kernel crashing. In the classic kernel case, the relevant code looked a little like: popl %eax # Error code from hypervisor jz 5f addl $16,%esp jmp iret_exc # Hypervisor said iret fault 5: addl $16,%esp # Hypervisor said segment selector fault Here, there are two identical addls on either option of a branch which appears to have been optimised by hoisting it above the jz, and converting it to an lea, which leaves the flags register unaffected. In the PVOPS case, the code looks like: popl_cfi %eax # Error from the hypervisor lea 16(%esp),%esp # Add $16 before choosing fault path CFI_ADJUST_CFA_OFFSET -16 jz 5f addl $16,%esp # Incorrectly adjust %esp again jmp iret_exc It is possible unprivileged userspace applications to cause this behaviour, for example by loading an LDT code selector, then changing the code selector to be not-present. At this point, there is a race condition where it is possible for the hypervisor to return back to userspace from an interrupt, fault on its own iret, and inject a failsafe_callback into the kernel. This bug has been present since the introduction of Xen PVOPS support in commit 5ead97c84 (xen: Core Xen implementation), in 2.6.23. Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Cc: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-16Merge branch 'x86/urgent' of ↵Linus Torvalds2-1/+81
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "This is mainly a workaround for a bug in Sandy Bridge graphics which causes corruption of certain memory pages." * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/Sandy Bridge: Sandy Bridge workaround depends on CONFIG_PCI x86/Sandy Bridge: mark arrays in __init functions as __initconst x86/Sandy Bridge: reserve pages when integrated graphics is present x86, efi: correct precedence of operators in setup_efi_pci
2013-01-15Revert "xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling ↵Konrad Rzeszutek Wilk1-7/+0
while atomic." This reverts commit 41bd956de3dfdc3a43708fe2e0c8096c69064a1e. The fix is incorrect and not appropiate for the latest kernels. In fact it _causes_ the BUG: scheduling while atomic while doing vCPU hotplug. Suggested-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-15Merge tag 'v3.7' into stable/for-linus-3.8Konrad Rzeszutek Wilk23-94/+226
Linux 3.7 * tag 'v3.7': (833 commits) Linux 3.7 Input: matrix-keymap - provide proper module license Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage ipv4: ip_check_defrag must not modify skb before unsharing Revert "mm: avoid waking kswapd for THP allocations when compaction is deferred or contended" inet_diag: validate port comparison byte code to prevent unsafe reads inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run() inet_diag: validate byte code to prevent oops in inet_diag_bc_run() inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state mm: vmscan: fix inappropriate zone congestion clearing vfs: fix O_DIRECT read past end of block device net: gro: fix possible panic in skb_gro_receive() tcp: bug fix Fast Open client retransmission tmpfs: fix shared mempolicy leak mm: vmscan: do not keep kswapd looping forever due to individual uncompactable zones mm: compaction: validate pfn range passed to isolate_freepages_block mmc: sh-mmcif: avoid oops on spurious interrupts (second try) Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts" mmc: sdhci-s3c: fix missing clock for gpio card-detect lib/Makefile: Fix oid_registry build dependency ... Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Conflicts: arch/arm/xen/enlighten.c drivers/xen/Makefile [We need to have the v3.7 base as the 'for-3.8' was based off v3.7-rc3 and there are some patches in v3.7-rc6 that we to have in our branch]
2013-01-13x86/Sandy Bridge: Sandy Bridge workaround depends on CONFIG_PCIH. Peter Anvin1-0/+2
early_pci_allowed() and read_pci_config_16() are only available if CONFIG_PCI is defined. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
2013-01-13x86/Sandy Bridge: mark arrays in __init functions as __initconstH. Peter Anvin1-2/+2
Mark static arrays as __initconst so they get removed when the init sections are flushed. Reported-by: Mathias Krause <minipli@googlemail.com> Link: http://lkml.kernel.org/r/75F4BEE6-CB0E-4426-B40B-697451677738@googlemail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-11x86/Sandy Bridge: reserve pages when integrated graphics is presentJesse Barnes1-0/+78
SNB graphics devices have a bug that prevent them from accessing certain memory ranges, namely anything below 1M and in the pages listed in the table. So reserve those at boot if set detect a SNB gfx device on the CPU to avoid GPU hangs. Stephane Marchesin had a similar patch to the page allocator awhile back, but rather than reserving pages up front, it leaked them at allocation time. [ hpa: made a number of stylistic changes, marked arrays as static const, and made less verbose; use "memblock=debug" for full verbosity. ] Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-10Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-8/+28
Pull KVM bugfixes from Marcelo Tosatti. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: use dynamic percpu allocations for shared msrs area KVM: PPC: Book3S HV: Fix compilation without CONFIG_PPC_POWERNV powerpc: Corrected include header path in kvm_para.h Add rcu user eqs exception hooks for async page fault
2013-01-10perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel sideDavid Ahern1-6/+0
This patch is brought to you by the letter 'H'. Commit 20b279 breaks compatiblity with older perf binaries when run with precise modifier (:p or :pp) by requiring the exclude_guest attribute to be set. Older binaries default exclude_guest to 0 (ie., wanting guest-based samples) unless host only profiling is requested (:H modifier). The workaround for older binaries is to add H to the modifier list (e.g., -e cycles:ppH - toggles exclude_guest to 1). This was deemed unacceptable by Linus: https://lkml.org/lkml/2012/12/12/570 Between family in town and the fresh snow in Breckenridge there is no time left to be working on the proper fix for this over the holidays. In the New Year I have more pressing problems to resolve -- like some memory leaks in perf which are proving to be elusive -- although the aforementioned snow is probably why they are proving to be elusive. Either way I do not have any spare time to work on this and from the time I have managed to spend on it the solution is more difficult than just moving to a new exclude_guest flag (does not work) or flipping the logic to include_guest (which is not as trivial as one would think). So, two options: silently force exclude_guest on as suggested by Gleb which means no impact to older perf binaries or revert the original patch which caused the breakage. This patch does the latter -- reverts the original patch that introduced the regression. The problem can be revisited in the future as time allows. Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Avi Kivity <avi@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Link: http://lkml.kernel.org/r/1356749767-17322-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-08KVM: x86: use dynamic percpu allocations for shared msrs areaMarcelo Tosatti1-6/+18
Use dynamic percpu allocations for the shared msrs structure, to avoid using the limited reserved percpu space. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-01-03X86: drivers: remove __dev* attributes.Greg Kroah-Hartman21-89/+84
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, __devinitconst, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Daniel Drake <dsd@laptop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-12-26PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)Myron Stowe1-1/+2
Commit 284f5f9 was intended to disable the "only_one_child()" optimization on Stratus ftServer systems, but its DMI check is wrong. It looks for DMI_SYS_VENDOR that contains "ftServer", when it should look for DMI_SYS_VENDOR containing "Stratus" and DMI_PRODUCT_NAME containing "ftServer". Tested on Stratus ftServer 6400. Reported-by: Fadeeva Marina <astarta@rat.ru> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=51331 Signed-off-by: Myron Stowe <myron.stowe@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org # v3.5+
2012-12-20Merge branch 'for-linus' of ↵Linus Torvalds19-118/+24
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull signal handling cleanups from Al Viro: "sigaltstack infrastructure + conversion for x86, alpha and um, COMPAT_SYSCALL_DEFINE infrastructure. Note that there are several conflicts between "unify SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline; resolution is trivial - just remove definitions of SS_ONSTACK and SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and include/uapi/linux/signal.h contains the unified variant." Fixed up conflicts as per Al. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: alpha: switch to generic sigaltstack new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those generic compat_sys_sigaltstack() introduce generic sys_sigaltstack(), switch x86 and um to it new helper: compat_user_stack_pointer() new helper: restore_altstack() unify SS_ONSTACK/SS_DISABLE definitions new helper: current_user_stack_pointer() missing user_stack_pointer() instances Bury the conditionals from kernel_thread/kernel_execve series COMPAT_SYSCALL_DEFINE: infrastructure
2012-12-20x86, efi: correct precedence of operators in setup_efi_pciSasha Levin1-1/+1
With the current code, the condition in the if() doesn't make much sense due to precedence of operators. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Link: http://lkml.kernel.org/r/1356030701-16284-25-git-send-email-sasha.levin@oracle.com Cc: Matt Fleming <matt.fleming@intel.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-12-20Merge tag 'iommu-updates-v3.8' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "A few new features this merge-window. The most important one is probably, that dma-debug now warns if a dma-handle is not checked with dma_mapping_error by the device driver. This requires minor changes to some architectures which make use of dma-debug. Most of these changes have the respective Acks by the Arch-Maintainers. Besides that there are updates to the AMD IOMMU driver for refactor the IOMMU-Groups support and to make sure it does not trigger a hardware erratum. The OMAP changes (for which I pulled in a branch from Tony Lindgren's tree) have a conflict in linux-next with the arm-soc tree. The conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is deleted in the arm-soc tree. It is safe to delete the file too so solve the conflict. Similar changes are done in the arm-soc tree in the common clock framework migration. A missing hunk from the patch in the IOMMU tree will be submitted as a seperate patch when the merge-window is closed." * tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits) ARM: dma-mapping: support debug_dma_mapping_error ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks iommu/omap: Adapt to runtime pm iommu/omap: Migrate to hwmod framework iommu/omap: Keep mmu enabled when requested iommu/omap: Remove redundant clock handling on ISR iommu/amd: Remove obsolete comment iommu/amd: Don't use 512GB pages iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch iommu/tegra: gart: Move bus_set_iommu after probe for multi arch iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all tile: dma_debug: add debug_dma_mapping_error support sh: dma_debug: add debug_dma_mapping_error support powerpc: dma_debug: add debug_dma_mapping_error support mips: dma_debug: add debug_dma_mapping_error support microblaze: dma-mapping: support debug_dma_mapping_error ia64: dma_debug: add debug_dma_mapping_error support c6x: dma_debug: add debug_dma_mapping_error support ARM64: dma_debug: add debug_dma_mapping_error support intel-iommu: Prevent devices with RMRRs from being placed into SI Domain ...
2012-12-19new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to thoseAl Viro3-25/+7
note that they are relying on access_ok() already checked by caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19generic compat_sys_sigaltstack()Al Viro8-67/+6
Again, conditional on CONFIG_GENERIC_SIGALTSTACK Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19introduce generic sys_sigaltstack(), switch x86 and um to itAl Viro10-16/+4
Conditional on CONFIG_GENERIC_SIGALTSTACK; architectures that do not select it are completely unaffected Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19new helper: compat_user_stack_pointer()Al Viro1-0/+7
Compat counterpart of current_user_stack_pointer(); for most of the biarch architectures those two are identical, but e.g. arm64 and arm use different registers for stack pointer... Note that amd64 variants of current_user_stack_pointer/compat_user_stack_pointer do *not* rely on pt_regs having been through FIXUP_TOP_OF_STACK. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19unify SS_ONSTACK/SS_DISABLE definitionsAl Viro1-6/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19missing user_stack_pointer() instancesAl Viro1-0/+1
for the architectures that have usp in pt_regs and do not have user_stack_pointer() already defined. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19Bury the conditionals from kernel_thread/kernel_execve seriesAl Viro3-5/+0
All architectures have CONFIG_GENERIC_KERNEL_THREAD CONFIG_GENERIC_KERNEL_EXECVE __ARCH_WANT_SYS_EXECVE None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers of kernel_execve() (which is a trivial wrapper for do_execve() now) left. Kill the conditionals and make both callers use do_execve(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19Merge branch 'x86/nuke386' of ↵Linus Torvalds3-52/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull one final 386 removal patch from Peter Anvin. IRQ 13 FPU error handling is gone. That was not one of the proudest moments in PC history. * 'x86/nuke386' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, 386 removal: Remove support for IRQ 13 FPU error reporting
2012-12-19Merge tag 'modules-next-for-linus' of ↵Linus Torvalds2-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module update from Rusty Russell: "Nothing all that exciting; a new module-from-fd syscall for those who want to verify the source of the module (ChromeOS) and/or use standard IMA on it or other security hooks." * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: MODSIGN: Fix kbuild output when using default extra_certificates MODSIGN: Avoid using .incbin in C source modules: don't hand 0 to vmalloc. module: Remove a extra null character at the top of module->strtab. ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants ASN.1: Define indefinite length marker constant moduleparam: use __UNIQUE_ID() __UNIQUE_ID() MODSIGN: Add modules_sign make target powerpc: add finit_module syscall. ima: support new kernel module syscall add finit_module syscall to asm-generic ARM: add finit_module syscall to ARM security: introduce kernel_module_from_file hook module: add flags arg to sys_finit_module() module: add syscall to load module from fd
2012-12-18Merge branch 'akpm' (more patches from Andrew)Linus Torvalds1-10/+57
Merge patches from Andrew Morton: "Most of the rest of MM, plus a few dribs and drabs. I still have quite a few irritating patches left around: ones with dubious testing results, lack of review, ones which should have gone via maintainer trees but the maintainers are slack, etc. I need to be more activist in getting these things wrapped up outside the merge window, but they're such a PITA." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (48 commits) mm/vmscan.c: avoid possible deadlock caused by too_many_isolated() vmscan: comment too_many_isolated() mm/kmemleak.c: remove obsolete simple_strtoul mm/memory_hotplug.c: improve comments mm/hugetlb: create hugetlb cgroup file in hugetlb_init mm/mprotect.c: coding-style cleanups Documentation: ABI: /sys/devices/system/node/ slub: drop mutex before deleting sysfs entry memcg: add comments clarifying aspects of cache attribute propagation kmem: add slab-specific documentation about the kmem controller slub: slub-specific propagation changes slab: propagate tunable values memcg: aggregate memcg cache values in slabinfo memcg/sl[au]b: shrink dead caches memcg/sl[au]b: track all the memcg children of a kmem_cache memcg: destroy memcg caches sl[au]b: allocate objects from memcg cache sl[au]b: always get the cache from its page in kmem_cache_free() memcg: skip memcg kmem allocations in specified code regions memcg: infrastructure to match an allocation to the right cache ...
2012-12-18arch/x86/platform/iris/iris.c: register a platform device and a platform driverShérab1-10/+57
This makes the iris driver use the platform API, so it is properly exposed in /sys. [akpm@linux-foundation.org: remove commented-out code, add missing space to printk, clean up code layout] Signed-off-by: Shérab <Sebastien.Hinderer@ens-lyon.org> Cc: Len Brown <lenb@kernel.org> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18Merge branch 'release' of ↵Linus Torvalds1-0/+37
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull powertool update from Len Brown: "This updates the tree w/ the latest version of turbostat, which reports temperature and - on SNB and later - Watts." Fix up semantic merge conflict as per Len. * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools: Allow tools to be installed in a user specified location tools/power: turbostat: make Makefile a bit more capable tools/power x86_energy_perf_policy: close /proc/stat in for_every_cpu() tools/power turbostat: v3.0: monitor Watts and Temperature tools/power turbostat: fix output buffering issue tools/power turbostat: prevent infinite loop on migration error path x86 power: define RAPL MSRs tools/power/x86/turbostat: share kernel MSR #defines