summaryrefslogtreecommitdiffstats
path: root/arch/x86
AgeCommit message (Collapse)AuthorFilesLines
2016-12-12Merge tag 'docs-4.10' of git://git.lwn.net/linuxLinus Torvalds1-1/+1
Pull documentation update from Jonathan Corbet: "These are the documentation changes for 4.10. It's another busy cycle for the docs tree, as the sphinx conversion continues. Highlights include: - Further work on PDF output, which remains a bit of a pain but should be more solid now. - Five more DocBook template files converted to Sphinx. Only 27 to go... Lots of plain-text files have also been converted and integrated. - Images in binary formats have been replaced with more source-friendly versions. - Various bits of organizational work, including the renaming of various files discussed at the kernel summit. - New documentation for the device_link mechanism. ... and, of course, lots of typo fixes and small updates" * tag 'docs-4.10' of git://git.lwn.net/linux: (193 commits) dma-buf: Extract dma-buf.rst Update Documentation/00-INDEX docs: 00-INDEX: document directories/files with no docs docs: 00-INDEX: remove non-existing entries docs: 00-INDEX: add missing entries for documentation files/dirs docs: 00-INDEX: consolidate process/ and admin-guide/ description scripts: add a script to check if Documentation/00-INDEX is sane Docs: change sh -> awk in REPORTING-BUGS Documentation/core-api/device_link: Add initial documentation core-api: remove an unexpected unident ppc/idle: Add documentation for powersave=off Doc: Correct typo, "Introdution" => "Introduction" Documentation/atomic_ops.txt: convert to ReST markup Documentation/local_ops.txt: convert to ReST markup Documentation/assoc_array.txt: convert to ReST markup docs-rst: parse-headers.pl: cleanup the documentation docs-rst: fix media cleandocs target docs-rst: media/Makefile: reorganize the rules docs-rst: media: build SVG from graphviz files docs-rst: replace bayer.png by a SVG image ...
2016-12-12Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-1/+25
Merge updates from Andrew Morton: - various misc bits - most of MM (quite a lot of MM material is awaiting the merge of linux-next dependencies) - kasan - printk updates - procfs updates - MAINTAINERS - /lib updates - checkpatch updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (123 commits) init: reduce rootwait polling interval time to 5ms binfmt_elf: use vmalloc() for allocation of vma_filesz checkpatch: don't emit unified-diff error for rename-only patches checkpatch: don't check c99 types like uint8_t under tools checkpatch: avoid multiple line dereferences checkpatch: don't check .pl files, improve absolute path commit log test scripts/checkpatch.pl: fix spelling checkpatch: don't try to get maintained status when --no-tree is given lib/ida: document locking requirements a bit better lib/rbtree.c: fix typo in comment of ____rb_erase_color lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM MAINTAINERS: add drm and drm/i915 irc channels MAINTAINERS: add "C:" for URI for chat where developers hang out MAINTAINERS: add drm and drm/i915 bug filing info MAINTAINERS: add "B:" for URI where to file bugs get_maintainer: look for arbitrary letter prefixes in sections printk: add Kconfig option to set default console loglevel printk/sound: handle more message headers printk/btrfs: handle more message headers printk/kdb: handle more message headers ...
2016-12-12Merge branch 'timers-core-for-linus' of ↵Linus Torvalds1-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "The time/timekeeping/timer folks deliver with this update: - Fix a reintroduced signed/unsigned issue and cleanup the whole signed/unsigned mess in the timekeeping core so this wont happen accidentaly again. - Add a new trace clock based on boot time - Prevent injection of random sleep times when PM tracing abuses the RTC for storage - Make posix timers configurable for real tiny systems - Add tracepoints for the alarm timer subsystem so timer based suspend wakeups can be instrumented - The usual pile of fixes and updates to core and drivers" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) timekeeping: Use mul_u64_u32_shr() instead of open coding it timekeeping: Get rid of pointless typecasts timekeeping: Make the conversion call chain consistently unsigned timekeeping_Force_unsigned_clocksource_to_nanoseconds_conversion alarmtimer: Add tracepoints for alarm timers trace: Update documentation for mono, mono_raw and boot clock trace: Add an option for boot clock as trace clock timekeeping: Add a fast and NMI safe boot clock timekeeping/clocksource_cyc2ns: Document intended range limitation timekeeping: Ignore the bogus sleep time if pm_trace is enabled selftests/timers: Fix spelling mistake "Asyncrhonous" -> "Asynchronous" clocksource/drivers/bcm2835_timer: Unmap region obtained by of_iomap clocksource/drivers/arm_arch_timer: Map frame with of_io_request_and_map() arm64: dts: rockchip: Arch counter doesn't tick in system suspend clocksource/drivers/arm_arch_timer: Don't assume clock runs in suspend posix-timers: Make them configurable posix_cpu_timers: Move the add_device_randomness() call to a proper place timer: Move sys_alarm from timer.c to itimer.c ptp_clock: Allow for it to be optional Kconfig: Regenerate *.c_shipped files after previous changes ...
2016-12-12Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds8-336/+186
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug updates from Thomas Gleixner: "This is the final round of converting the notifier mess to the state machine. The removal of the notifiers and the related infrastructure will happen around rc1, as there are conversions outstanding in other trees. The whole exercise removed about 2000 lines of code in total and in course of the conversion several dozen bugs got fixed. The new mechanism allows to test almost every hotplug step standalone, so usage sites can exercise all transitions extensively. There is more room for improvement, like integrating all the pointlessly different architecture mechanisms of synchronizing, setting cpus online etc into the core code" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) tracing/rb: Init the CPU mask on allocation soc/fsl/qbman: Convert to hotplug state machine soc/fsl/qbman: Convert to hotplug state machine zram: Convert to hotplug state machine KVM/PPC/Book3S HV: Convert to hotplug state machine arm64/cpuinfo: Convert to hotplug state machine arm64/cpuinfo: Make hotplug notifier symmetric mm/compaction: Convert to hotplug state machine iommu/vt-d: Convert to hotplug state machine mm/zswap: Convert pool to hotplug state machine mm/zswap: Convert dst-mem to hotplug state machine mm/zsmalloc: Convert to hotplug state machine mm/vmstat: Convert to hotplug state machine mm/vmstat: Avoid on each online CPU loops mm/vmstat: Drop get_online_cpus() from init_cpu_node_state/vmstat_cpu_dead() tracing/rb: Convert to hotplug state machine oprofile/nmi timer: Convert to hotplug state machine net/iucv: Use explicit clean up labels in iucv_init() x86/pci/amd-bus: Convert to hotplug state machine x86/oprofile/nmi: Convert to hotplug state machine ...
2016-12-12x86/ldt: use vfree_atomic() to free ldt entriesAndrey Ryabinin1-1/+1
vfree() is going to use sleeping lock. free_ldt_struct() may be called with disabled preemption, therefore we must use vfree_atomic() here. E.g. call trace: vfree() free_ldt_struct() destroy_context_ldt() __mmdrop() finish_task_switch() schedule_tail() ret_from_fork() Link: http://lkml.kernel.org/r/1479474236-4139-7-git-send-email-hch@lst.de Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Joel Fernandes <joelaf@google.com> Cc: Jisheng Zhang <jszhang@marvell.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: John Dias <joaodias@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm: remove x86-only restriction of movable_nodeReza Arbab1-0/+24
In commit c5320926e370 ("mem-hotplug: introduce movable_node boot option"), the memblock allocation direction is changed to bottom-up and then back to top-down like this: 1. memblock_set_bottom_up(true), called by cmdline_parse_movable_node(). 2. memblock_set_bottom_up(false), called by x86's numa_init(). Even though (1) occurs in generic mm code, it is wrapped by #ifdef CONFIG_MOVABLE_NODE, which depends on X86_64. This means that when we extend CONFIG_MOVABLE_NODE to non-x86 arches, things will be unbalanced. (1) will happen for them, but (2) will not. This toggle was added in the first place because x86 has a delay between adding memblocks and marking them as hotpluggable. Since other arches do this marking either immediately or not at all, they do not require the bottom-up toggle. So, resolve things by moving (1) from cmdline_parse_movable_node() to x86's setup_arch(), immediately after the movable_node parameter has been parsed. Link: http://lkml.kernel.org/r/1479160961-25840-3-git-send-email-arbab@linux.vnet.ibm.com Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alistair Popple <apopple@au1.ibm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bharata B Rao <bharata@linux.vnet.ibm.com> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds2-17/+70
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Ingo Molnar: "Two changes: - implement various VMWare guest OS improvements/fixes (Alexey Makhalov) - unexport a spurious export from the intel-mid platform driver (Lukas Wunner)" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vmware: Add paravirt sched clock x86/vmware: Add basic paravirt ops support x86/vmware: Use tsc_khz value for calibrate_cpu() x86/platform/intel-mid: Unexport intel_mid_pci_set_power_state() x86/vmware: Read tsc_khz only once at boot time
2016-12-12Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds9-940/+643
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode update from Ingo Molnar: "The biggest change (by Borislav Petkov) is a thorough rewrite of the Intel microcode loader and its interactions with the core code. The biggest conceptual change is the decoupling of the microcode loading on boot and application processors (which load the microcode in different scenarios), so that both parse the input patches with as few assumptions as possible - this also fixes various kernel address space randomization bugs. (The AP side then goes on and caches the result to improve boot performance.) Since the AMD side already did this, this change also opened up the path towards more unification/simplification of the core microcode loading infrastructure: 10 files changed, 647 insertions(+), 940 deletions(-) which speaks for itself" * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode: Bump driver version, update copyrights x86/microcode: Rework microcode loading x86/microcode/intel: Remove intel_lib.c x86/microcode/amd: Move private inlines to .c and mark local functions static x86/microcode: Collect CPU info on resume x86/microcode: Issue the debug printk on resume only on success x86/microcode/amd: Hand down the CPU family x86/microcode: Export the microcode cache linked list x86/microcode: Remove one #ifdef clause x86/microcode/intel: Simplify generic_load_microcode() x86/microcode: Move driver authors to CREDITS x86/microcode: Run the AP-loading routine only on the application processors
2016-12-12Merge branch 'x86-idle-for-linus' of ↵Linus Torvalds23-161/+70
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 idle updates from Ingo Molnar: "There were two bigger changes in this development cycle: - remove idle notifiers: 32 files changed, 74 insertions(+), 803 deletions(-) These notifiers were of questionable value and the main usecase, the i7300 driver, was essentially unmaintained and can be removed, plus modern power management concepts don't need the callback - so use this golden opportunity and get rid of this opaque and fragile callback from a latency sensitive code path. (Len Brown, Thomas Gleixner) - improve the AMD Erratum 400 workaround that used high overhead MSR polling in the idle loop (Borisla Petkov, Thomas Gleixner)" * 'x86-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove empty idle.h header x86/amd: Simplify AMD E400 aware idle routine x86/amd: Check for the C1E bug post ACPI subsystem init x86/bugs: Separate AMD E400 erratum and C1E bug x86/cpufeature: Provide helper to set bugs bits x86/idle: Remove enter_idle(), exit_idle() x86: Remove x86_test_and_clear_bit_percpu() x86/idle: Remove is_idle flag x86/idle: Remove idle_notifier i7300_idle: Remove this driver
2016-12-12Merge branch 'x86-headers-for-linus' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 header fixlet from Ingo Molnar: "Remove unnecessary module.h inclusion from core code (Paul Gortmaker)" * 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/percpu: Remove unnecessary include of module.h, add asm/desc.h
2016-12-12Merge branch 'x86-fpu-for-linus' of ↵Linus Torvalds29-471/+100
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FPU updates from Ingo Molnar: "The main changes in this cycle were: - do a large round of simplifications after all CPUs do 'eager' FPU context switching in v4.9: remove CR0 twiddling, remove leftover eager/lazy bts, etc (Andy Lutomirski) - more FPU code simplifications: remove struct fpu::counter, clarify nomenclature, remove unnecessary arguments/functions and better structure the code (Rik van Riel)" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Remove clts() x86/fpu: Remove stts() x86/fpu: Handle #NM without FPU emulation as an error x86/fpu, lguest: Remove CR0.TS support x86/fpu, kvm: Remove host CR0.TS manipulation x86/fpu: Remove irq_ts_save() and irq_ts_restore() x86/fpu: Stop saving and restoring CR0.TS in fpu__init_check_bugs() x86/fpu: Get rid of two redundant clts() calls x86/fpu: Finish excising 'eagerfpu' x86/fpu: Split old_fpu & new_fpu handling into separate functions x86/fpu: Remove 'cpu' argument from __cpu_invalidate_fpregs_state() x86/fpu: Split old & new FPU code paths x86/fpu: Remove __fpregs_(de)activate() x86/fpu: Rename lazy restore functions to "register state valid" x86/fpu, kvm: Remove KVM vcpu->fpu_counter x86/fpu: Remove struct fpu::counter x86/fpu: Remove use_eager_fpu() x86/fpu: Remove the XFEATURE_MASK_EAGER/LAZY distinction x86/fpu: Hard-disable lazy FPU mode x86/crypto, x86/fpu: Remove X86_FEATURE_EAGER_FPU #ifdef from the crc32c code
2016-12-12Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds6-106/+43
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPU updates from Ingo Molnar: "The changes in this development cycle were: - AMD CPU topology enhancements that are cleanups on current CPUs but which enable future Fam17 hardware. (Yazen Ghannam) - unify bugs.c and bugs_64.c (Borislav Petkov) - remove the show_msr= boot option (Borislav Petkov) - simplify a boot message (Borislav Petkov)" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/AMD: Clean up cpu_llc_id assignment per topology feature x86/cpu: Get rid of the show_msr= boot option x86/cpu: Merge bugs.c and bugs_64.c x86/cpu: Remove the printk format specifier in "CPU0: "
2016-12-12Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds3-8/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Two cleanups in the LDT handling code, by Dan Carpenter and Thomas Gleixner" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ldt: Make all size computations unsigned x86/ldt: Make a size argument unsigned
2016-12-12Merge branch 'x86-build-for-linus' of ↵Linus Torvalds3-32/+42
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build updates from Ingo Molnar: "The main changes in this cycle were: - Makefile improvements (Paul Bolle) - KConfig cleanups to better separate 32-bit only, 64-bit only and generic feature enablement sections (Ingo Molnar)" * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Remove three unneeded genhdr-y entries x86/build: Don't use $(LINUXINCLUDE) twice x86/kconfig: Sort the 'config X86' selects alphabetically x86/kconfig: Clean up 32-bit compat options x86/kconfig: Clean up IA32_EMULATION select x86/kconfig, x86/pkeys: Move pkeys selects to X86_INTEL_MEMORY_PROTECTION_KEYS x86/kconfig: Move 64-bit only arch Kconfig selects to 'config X86_64' x86/kconfig: Move 32-bit only arch Kconfig selects to 'config X86_32'
2016-12-12Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds3-5/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: "Misc cleanups/simplifications by Borislav Petkov, Paul Bolle and Wei Yang" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/64: Optimize fixmap page fixup x86/boot: Simplify the GDTR calculation assembly code a bit x86/boot/build: Remove always empty $(USERINCLUDE)
2016-12-12Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds32-459/+554
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "The main changes in this development cycle were: - a large number of call stack dumping/printing improvements: higher robustness, better cross-context dumping, improved output, etc. (Josh Poimboeuf) - vDSO getcpu() performance improvement for future Intel CPUs with the RDPID instruction (Andy Lutomirski) - add two new Intel AVX512 features and the CPUID support infrastructure for it: AVX512IFMA and AVX512VBMI. (Gayatri Kammela, He Chen) - more copy-user unification (Borislav Petkov) - entry code assembly macro simplifications (Alexander Kuleshov) - vDSO C/R support improvements (Dmitry Safonov) - misc fixes and cleanups (Borislav Petkov, Paul Bolle)" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) scripts/decode_stacktrace.sh: Fix address line detection on x86 x86/boot/64: Use defines for page size x86/dumpstack: Make stack name tags more comprehensible selftests/x86: Add test_vdso to test getcpu() x86/vdso: Use RDPID in preference to LSL when available x86/dumpstack: Handle NULL stack pointer in show_trace_log_lvl() x86/cpufeatures: Enable new AVX512 cpu features x86/cpuid: Provide get_scattered_cpuid_leaf() x86/cpuid: Cleanup cpuid_regs definitions x86/copy_user: Unify the code by removing the 64-bit asm _copy_*_user() variants x86/unwind: Ensure stack grows down x86/vdso: Set vDSO pointer only after success x86/prctl/uapi: Remove #ifdef for CHECKPOINT_RESTORE x86/unwind: Detect bad stack return address x86/dumpstack: Warn on stack recursion x86/unwind: Warn on bad frame pointer x86/decoder: Use stderr if insn sanity test fails x86/decoder: Use stdout if insn decoder test is successful mm/page_alloc: Remove kernel address exposure in free_reserved_area() x86/dumpstack: Remove raw stack dump ...
2016-12-12Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds7-23/+39
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Ingo Molnar: "Misc changes: - optimize (reduce) IRQ handler tracing overhead (Wanpeng Li) - clean up MSR helpers (Borislav Petkov) - fix build warning on some configs (Sebastian Andrzej Siewior)" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/msr: Cleanup/streamline MSR helpers x86/apic: Prevent tracing on apic_msr_write_eoi() x86/msr: Add wrmsr_notrace() x86/apic: Get rid of "warning: 'acpi_ioapic_lock' defined but not used"
2016-12-12Merge branch 'ras-core-for-linus' of ↵Linus Torvalds12-85/+522
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS updates from Ingo Molnar: "The main changes in this development cycle were: - more AMD northbridge support work, mostly in preparation for Fam17h CPUs (Yazen Ghannam, Borislav Petkov) - cleanups/refactorings and fixes (Borislav Petkov, Tony Luck, Yinghai Lu)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Include the PPIN in MCE records when available x86/mce/AMD: Add system physical address translation for AMD Fam17h x86/amd_nb: Add SMN and Indirect Data Fabric access for AMD Fam17h x86/amd_nb: Add Fam17h Data Fabric as "Northbridge" x86/amd_nb: Make all exports EXPORT_SYMBOL_GPL x86/amd_nb: Make amd_northbridges internal to amd_nb.c x86/mce/AMD: Reset Threshold Limit after logging error x86/mce/AMD: Fix HWID_MCATYPE calculation by grouping arguments x86/MCE: Correct TSC timestamping of error records x86/RAS: Hide SMCA bank names x86/RAS: Rename smca_bank_names to smca_names x86/RAS: Simplify SMCA HWID descriptor struct x86/RAS: Simplify SMCA bank descriptor struct x86/MCE: Dump MCE to dmesg if no consumers x86/RAS: Add TSC timestamp to the injected MCE x86/MCE: Do not look at panic_on_oops in the severity grading
2016-12-12Merge branch 'sched-core-for-linus' of ↵Linus Torvalds8-9/+324
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main scheduler changes in this cycle were: - support Intel Turbo Boost Max Technology 3.0 (TBM3) by introducig a notion of 'better cores', which the scheduler will prefer to schedule single threaded workloads on. (Tim Chen, Srinivas Pandruvada) - enhance the handling of asymmetric capacity CPUs further (Morten Rasmussen) - improve/fix load handling when moving tasks between task groups (Vincent Guittot) - simplify and clean up the cputime code (Stanislaw Gruszka) - improve mass fork()ed task spread a.k.a. hackbench speedup (Vincent Guittot) - make struct kthread kmalloc()ed and related fixes (Oleg Nesterov) - add uaccess atomicity debugging (when using access_ok() in the wrong context), under CONFIG_DEBUG_ATOMIC_SLEEP=y (Peter Zijlstra) - implement various fixes, cleanups and other enhancements (Daniel Bristot de Oliveira, Martin Schwidefsky, Rafael J. Wysocki)" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) sched/core: Use load_avg for selecting idlest group sched/core: Fix find_idlest_group() for fork kthread: Don't abuse kthread_create_on_cpu() in __kthread_create_worker() kthread: Don't use to_live_kthread() in kthread_[un]park() kthread: Don't use to_live_kthread() in kthread_stop() Revert "kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function" kthread: Make struct kthread kmalloc'ed x86/uaccess, sched/preempt: Verify access_ok() context sched/x86: Make CONFIG_SCHED_MC_PRIO=y easier to enable sched/x86: Change CONFIG_SCHED_ITMT to CONFIG_SCHED_MC_PRIO x86/sched: Use #include <linux/mutex.h> instead of #include <asm/mutex.h> cpufreq/intel_pstate: Use CPPC to get max performance acpi/bus: Set _OSC for diverse core support acpi/bus: Enable HWP CPPC objects x86/sched: Add SD_ASYM_PACKING flags to x86 ITMT CPU x86/sysctl: Add sysctl for ITMT scheduling feature x86: Enable Intel Turbo Boost Max Technology 3.0 x86/topology: Define x86's arch_update_cpu_topology sched: Extend scheduler's asym packing sched/fair: Clean up the tunable parameter definitions ...
2016-12-12Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2-3/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "This update is pretty big and almost exclusively includes tooling changes, because v4.9's LTS status forced to completion most of the pending kernel side hardware enablement work and because we tried to freeze core perf work a bit to give a time window for the fuzzing efforts. The diff is large mostly due to the JSON hardware event tables added for Intel and Power8 CPUs. This was a popular feature request from people working close to hardware and from the HPC community. Tree size is big because this added the CPU event tables for over a decade of Intel CPUs. Future changes for a CPU vendor alrady support should be much smaller, as events for new models are added. The new events are listed in 'perf list', for the CPU model the tool is running on. If you find an interesting event it can be used as-is: $ perf stat -a -e l2_lines_out.pf_clean sleep 1 Performance counter stats for 'system wide': 7,860,403 l2_lines_out.pf_clean 1.000624918 seconds time elapsed The event lists can be searched the usual 'perf list' fashion for (case insensitive) substrings as well: $ perf list l2_lines_out List of pre-defined events (to be used in -e): cache: l2_lines_out.demand_clean [Clean L2 cache lines evicted by demand] l2_lines_out.demand_dirty [Dirty L2 cache lines evicted by demand] l2_lines_out.dirty_all [Dirty L2 cache lines filling the L2] l2_lines_out.pf_clean [Clean L2 cache lines evicted by L2 prefetch] l2_lines_out.pf_dirty [Dirty L2 cache lines evicted by L2 prefetch] etc. There's a few high level categories as well that can be listed: 'cache', 'floating point', 'frontend', 'memory', 'pipeline', 'virtual memory'. Existing generic events and workflows should work as-is. The only kernel side change is a late breaking fix for an older regression, related to Intel BTS, LBR and PT feature interaction. On the tooling side there are three new tools / major features: - The new 'perf c2c' tool provides means for Shared Data C2C/HITM analysis. This allows you to track down cacheline contention. The tool is based on x86's load latency and precise store facility events provided by Intel CPUs. It was tested by Joe Mario and has proven to be useful, finding some cacheline contentions. Joe also wrote a blog about c2c tool with examples: https://joemario.github.io/blog/2016/09/01/c2c-blog/ excerpt of the content on this site: At a high level, “perf c2c” will show you: * The cachelines where false sharing was detected. * The readers and writers to those cachelines, and the offsets where those accesses occurred. * The pid, tid, instruction addr, function name, binary object name for those readers and writers. * The source file and line number for each reader and writer. * The average load latency for the loads to those cachelines. * Which numa nodes the samples a cacheline came from and which CPUs were involved. Using perf c2c is similar to using the Linux perf tool today. First collect data with “perf c2c record”, then generate a report output with “perf c2c report” There one finds extensive details on using the tool, with tips on reducing the volume of samples while still capturing enough to do its job. (Dick Fowles, Joe Mario, Don Zickus, Jiri Olsa) - The new 'perf sched timehist' tool provides tailored analysis of scheduling events. Example usage: perf sched record -- sleep 1 perf sched timehist By default it shows the individual schedule events, including the wait time (time between sched-out and next sched-in events for the task), the task scheduling delay (time between wakeup and actually running) and run time for the task: time cpu task name wait time sch delay run time [tid/pid] (msec) (msec) (msec) -------- ------ ---------------- --------- --------- -------- 1.874569 [0011] gcc[31949] 0.014 0.000 1.148 1.874591 [0010] gcc[31951] 0.000 0.000 0.024 1.874603 [0010] migration/10[59] 3.350 0.004 0.011 1.874604 [0011] <idle> 1.148 0.000 0.035 1.874723 [0005] <idle> 0.016 0.000 1.383 1.874746 [0005] gcc[31949] 0.153 0.078 0.022 ... Times are in msec.usec. (David Ahern, Namhyung Kim) - Add CPU vendor hardware event tables: Add JSON files with vendor event naming for Intel and Power8 processors, allowing users of tools like oprofile to keep using the event names they are used to, as well as people reading vendor documentation, where such naming is used. (Andi Kleen, Sukadev Bhattiprolu) You should see all the new events with 'perf list' and you should be able to search them, for example 'perf list miss' will list all the myriads of miss events. Other tooling features added were: - Cross-arch annotation support: o Improve ARM support in the annotation code, affecting 'perf annotate', 'perf report' and live annotation in 'perf top' (Kim Phillips) o Initial support for PowerPC in the annotation code (Ravi Bangoria) o Support AArch64 in the 'annotate' code, native/local and cross-arch/remote (Kim Phillips) - Allow considering just events in a given time interval, via the '--time start.s.ms,end.s.ms' command line, added to 'perf kmem', 'perf report', 'perf sched timehist' and 'perf script' (David Ahern) - Add option to stop printing a callchain at one of a given group of symbol names (David Ahern) - Track memory freed in 'perf kmem stat' (David Ahern) - Allow querying and setting .perfconfig variables (Taeung Song) - Show branch information in callchains (predicted, TSX aborts, loop iteractions, etc) (Jin Yao) - Dynamicly change verbosity level by pressing 'V' in the 'perf top/report' hists TUI browser (Alexis Berlemont) - Implement 'perf trace --delay' in the same fashion as in 'perf record --delay', to skip sampling workload initialization events (Alexis Berlemont) - Make vendor named events case insensitive in 'perf list', i.e. 'perf list LONGEST_LAT' works just the same as 'perf list longest_lat' (Andi Kleen) - Add unwinding support for jitdump (Stefano Sanfilippo) Tooling infrastructure changes: - Support linking perf with clang and LLVM libraries, initially statically, but this limitation will be lifted and shared libraries, when available, will be preferred to the static build, that should, as with other features, be enabled explicitly (Wang Nan) - Add initial support (and perf test entry) for tooling hooks, starting with 'record_start' and 'record_end', that will have as its initial user the eBPF infrastructure, where perf_ prefixed functions will be JITed and run when such hooks are called (Wang Nan) - Implement assorted libbpf improvements (Wang Nan)" ... and lots of other changes, features, cleanups and refactorings I did not list, see the shortlog and the git log for details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (220 commits) perf/x86: Fix exclusion of BTS and LBR for Goldmont perf tools: Explicitly document that --children is enabled by default perf sched timehist: Cleanup idle_max_cpu handling perf sched timehist: Handle zero sample->tid properly perf callchain: Introduce callchain_cursor__copy() perf sched: Cleanup option processing perf sched timehist: Improve error message when analyzing wrong file perf tools: Move perf build related variables under non fixdep leg perf tools: Force fixdep compilation at the start of the build perf tools: Move PERF-VERSION-FILE target into rules area perf build: Check LLVM version in feature check perf annotate: Show raw form for jump instruction with indirect target perf tools: Add non config targets perf tools: Cleanup build directory before each test perf tools: Move python/perf.so target into rules area perf tools: Move install-gtk target into rules area tools build: Move tabs to spaces where suitable tools build: Make the .cmd file more readable perf clang: Compile BPF script using builtin clang support perf clang: Support compile IR to BPF object and add testcase ...
2016-12-12Merge branch 'mm-pat-for-linus' of ↵Linus Torvalds1-5/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull mm/PAT cleanup from Ingo Molnar: "A single cleanup for a generic interface that was originally introduced for PAT" * 'mm-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pat, mm: Make track_pfn_insert() return void
2016-12-12Merge branch 'locking-core-for-linus' of ↵Linus Torvalds15-248/+97
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The tree got pretty big in this development cycle, but the net effect is pretty good: 115 files changed, 673 insertions(+), 1522 deletions(-) The main changes were: - Rework and generalize the mutex code to remove per arch mutex primitives. (Peter Zijlstra) - Add vCPU preemption support: add an interface to query the preemption status of vCPUs and use it in locking primitives - this optimizes paravirt performance. (Pan Xinhui, Juergen Gross, Christian Borntraeger) - Introduce cpu_relax_yield() and remov cpu_relax_lowlatency() to clean up and improve the s390 lock yielding machinery and its core kernel impact. (Christian Borntraeger) - Micro-optimize mutexes some more. (Waiman Long) - Reluctantly add the to-be-deprecated mutex_trylock_recursive() interface on a temporary basis, to give the DRM code more time to get rid of its locking hacks. Any other users will be NAK-ed on sight. (We turned off the deprecation warning for the time being to not pollute the build log.) (Peter Zijlstra) - Improve the rtmutex code a bit, in light of recent long lived bugs/races. (Thomas Gleixner) - Misc fixes, cleanups" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) x86/paravirt: Fix bool return type for PVOP_CALL() x86/paravirt: Fix native_patch() locking/ww_mutex: Use relaxed atomics locking/rtmutex: Explain locking rules for rt_mutex_proxy_unlock()/init_proxy_locked() locking/rtmutex: Get rid of RT_MUTEX_OWNER_MASKALL x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted() locking/mutex: Break out of expensive busy-loop on {mutex,rwsem}_spin_on_owner() when owner vCPU is preempted locking/osq: Break out of spin-wait busy waiting loop for a preempted vCPU in osq_lock() Documentation/virtual/kvm: Support the vCPU preemption check x86/xen: Support the vCPU preemption check x86/kvm: Support the vCPU preemption check x86/kvm: Support the vCPU preemption check kvm: Introduce kvm_write_guest_offset_cached() locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests locking/spinlocks, s390: Implement vcpu_is_preempted(cpu) locking/core, powerpc: Implement vcpu_is_preempted(cpu) sched/core: Introduce the vcpu_is_preempted(cpu) interface sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q locking/core: Provide common cpu_relax_yield() definition locking/mutex: Don't mark mutex_trylock_recursive() as deprecated, temporarily ...
2016-12-12Merge branch 'efi-core-for-linus' of ↵Linus Torvalds3-5/+77
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The main changes in this development cycle were: - Implement EFI dev path parser and other changes to fully support thunderbolt devices on Apple Macbooks (Lukas Wunner) - Add RNG seeding via the EFI stub, on ARM/arm64 (Ard Biesheuvel) - Expose EFI framebuffer configuration to user-space, to improve tooling (Peter Jones) - Misc fixes and cleanups (Ivan Hu, Wei Yongjun, Yisheng Xie, Dan Carpenter, Roy Franz)" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/libstub: Make efi_random_alloc() allocate below 4 GB on 32-bit thunderbolt: Compile on x86 only thunderbolt, efi: Fix Kconfig dependencies harder thunderbolt, efi: Fix Kconfig dependencies thunderbolt: Use Device ROM retrieved from EFI x86/efi: Retrieve and assign Apple device properties efi: Allow bitness-agnostic protocol calls efi: Add device path parser efi/arm*/libstub: Invoke EFI_RNG_PROTOCOL to seed the UEFI RNG table efi/libstub: Add random.c to ARM build efi: Add support for seeding the RNG from a UEFI config table MAINTAINERS: Add ARM and arm64 EFI specific files to EFI subsystem efi/libstub: Fix allocation size calculations efi/efivar_ssdt_load: Don't return success on allocation failure efifb: Show framebuffer layout as device attributes efi/efi_test: Use memdup_user() as a cleanup efi/efi_test: Fix uninitialized variable 'rv' efi/efi_test: Fix uninitialized variable 'datasize' efi/arm*: Fix efi_init() error handling efi: Remove unused include of <linux/version.h>
2016-12-12Merge branch 'core-smp-for-linus' of ↵Linus Torvalds1-8/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP bootup updates from Ingo Molnar: "Three changes to unify/standardize some of the bootup message printing in kernel/smp.c between architectures" * 'core-smp-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kernel/smp: Tell the user we're bringing up secondary CPUs kernel/smp: Make the SMP boot message common on all arches kernel/smp: Define pr_fmt() for smp.c
2016-12-11Merge branch 'linus' into sched/core, to pick up fixesIngo Molnar18-78/+80
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11x86/paravirt: Fix bool return type for PVOP_CALL()Peter Zijlstra1-1/+13
Commit: 3cded4179481 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()") introduced a paravirt op with bool return type [*] It turns out that the PVOP_CALL*() macros miscompile when rettype is bool. Code that looked like: 83 ef 01 sub $0x1,%edi ff 15 32 a0 d8 00 callq *0xd8a032(%rip) # ffffffff81e28120 <pv_lock_ops+0x20> 84 c0 test %al,%al ended up looking like so after PVOP_CALL1() was applied: 83 ef 01 sub $0x1,%edi 48 63 ff movslq %edi,%rdi ff 14 25 20 81 e2 81 callq *0xffffffff81e28120 48 85 c0 test %rax,%rax Note how it tests the whole of %rax, even though a typical bool return function only sets %al, like: 0f 95 c0 setne %al c3 retq This is because ____PVOP_CALL() does: __ret = (rettype)__eax; and while regular integer type casts truncate the result, a cast to bool tests for any !0 value. Fix this by explicitly truncating to sizeof(rettype) before casting. [*] The actual bug should've been exposed in commit: 446f3dc8cc0a ("locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests") but that didn't properly implement the paravirt call. Reported-by: kernel test robot <xiaolong.ye@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alok Kataria <akataria@vmware.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 3cded4179481 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()") Link: http://lkml.kernel.org/r/20161208154349.346057680@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11x86/paravirt: Fix native_patch()Peter Zijlstra2-0/+8
While chasing a regression I noticed we potentially patch the wrong code in native_patch(). If we do not select the native code sequence, we must use the default patcher, not fall-through the switch case. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alok Kataria <akataria@vmware.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel test robot <xiaolong.ye@intel.com> Fixes: 3cded4179481 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()") Link: http://lkml.kernel.org/r/20161208154349.270616999@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar13-40/+43
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11perf/x86: Fix exclusion of BTS and LBR for GoldmontAndi Kleen2-3/+7
An earlier patch allowed enabling PT and LBR at the same time on Goldmont. However it also allowed enabling BTS and LBR at the same time, which is still not supported. Fix this by bypassing the check only for PT. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: alexander.shishkin@intel.com Cc: kan.liang@intel.com Cc: <stable@vger.kernel.org> Fixes: ccbebba4c6bf ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it") Link: http://lkml.kernel.org/r/20161209001417.4713-1-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller8-11/+15
2016-12-10x86/ldt: Make all size computations unsignedThomas Gleixner3-7/+7
ldt->size can never be negative. The helper functions take 'unsigned int' arguments which are assigned from ldt->size. The related user space user_desc struct member entry_number is unsigned as well. But ldt->size itself and a few local variables which are related to ldt->size are type 'int' which makes no sense whatsoever and results in typecasts which make the eyes bleed. Clean it up and convert everything which is related to ldt->size to unsigned it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dan Carpenter <dan.carpenter@oracle.com>
2016-12-10x86/ldt: Make a size argument unsignedDan Carpenter1-1/+1
My static checker complains that we put an upper bound on the "size" argument but not a lower bound. The checker is not smart enough to know the possible ranges of "old_mm->context.ldt->size" from init_new_context_ldt() so it thinks maybe it could be negative. Let's make it unsigned to silence the warning and future proof the code a bit. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: kernel-janitors@vger.kernel.org Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20161208105602.GA11382@elgon.mountain Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-09x86: Remove empty idle.h headerThomas Gleixner16-21/+0
One include less is always a good thing(tm). Good riddance. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-6-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-09x86/amd: Simplify AMD E400 aware idle routineBorislav Petkov6-54/+30
Reorganize the E400 detection now that we have everything in place: switch the CPUs to broadcast mode after the LAPIC has been initialized and remove the facilities that were used previously on the idle path. Unfortunately static_cpu_has_bug() cannpt be used in the E400 idle routine because alternatives have been applied when the actual detection happens, so the static switching does not take effect and the test will stay false. Use boot_cpu_has_bug() instead which is definitely an improvement over the RDMSR and the cpumask handling. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-5-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-09x86/amd: Check for the C1E bug post ACPI subsystem initThomas Gleixner1-0/+23
AMD CPUs affected by the E400 erratum suffer from the issue that the local APIC timer stops when the CPU goes into C1E. Unfortunately there is no way to detect the affected CPUs on early boot. It's only possible to determine the range of possibly affected CPUs from the family/model range. The actual decision whether to enter C1E and thus cause the bug is done by the firmware and we need to detect that case late, after ACPI has been initialized. The current solution is to check in the idle routine whether the CPU is affected by reading the MSR_K8_INT_PENDING_MSG MSR and checking for the K8_INTP_C1E_ACTIVE_MASK bits. If one of the bits is set then the CPU is affected and the system is switched into forced broadcast mode. This is ineffective and on non-affected CPUs every entry to idle does the extra RDMSR. After doing some research it turns out that the bits are visible on the boot CPU right after the ACPI subsystem is initialized in the early boot process. So instead of polling for the bits in the idle loop, add a detection function after acpi_subsystem_init() and check for the MSR bits. If set, then the X86_BUG_AMD_APIC_C1E is set on the boot CPU and the TSC is marked unstable when X86_FEATURE_NONSTOP_TSC is not set as it will stop in C1E state as well. The switch to broadcast mode cannot be done at this point because the boot CPU still uses HPET as a clockevent device and the local APIC timer is not yet calibrated and installed. The switch to broadcast mode on the affected CPUs needs to be done when the local APIC timer is actually set up. This allows to cleanup the amd_e400_idle() function in the next step. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-4-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-09x86/bugs: Separate AMD E400 erratum and C1E bugThomas Gleixner3-9/+16
The workaround for the AMD Erratum E400 (Local APIC timer stops in C1E state) is a two step process: - Selection of the E400 aware idle routine - Detection whether the platform is affected The idle routine selection happens for possibly affected CPUs depending on family/model/stepping information. These range of CPUs is not necessarily affected as the decision whether to enable the C1E feature is made by the firmware. Unfortunately there is no way to query this at early boot. The current implementation polls a MSR in the E400 aware idle routine to detect whether the CPU is affected. This is inefficient on non affected CPUs because every idle entry has to do the MSR read. There is a better way to detect this before going idle for the first time which requires to seperate the bug flags: X86_BUG_AMD_E400 - Selects the E400 aware idle routine and enables the detection X86_BUG_AMD_APIC_C1E - Set when the platform is affected by E400 Replace the current X86_BUG_AMD_APIC_C1E usage by the new X86_BUG_AMD_E400 bug bit to select the idle routine which currently does an unconditional detection poll. X86_BUG_AMD_APIC_C1E is going to be used in later patches to remove the MSR polling and simplify the handling of this misfeature. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-3-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-09x86/cpufeature: Provide helper to set bugs bitsBorislav Petkov1-0/+1
Will be used in a later patch to set bug bits for bugs which need late detection. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-2-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-08bpf: xdp: Allow head adjustment in XDP progMartin KaFai Lau1-1/+1
This patch allows XDP prog to extend/remove the packet data at the head (like adding or removing header). It is done by adding a new XDP helper bpf_xdp_adjust_head(). It also renames bpf_helper_changes_skb_data() to bpf_helper_changes_pkt_data() to better reflect that XDP prog does not work on skb. This patch adds one "xdp_adjust_head" bit to bpf_prog for the XDP-capable driver to check if the XDP prog requires bpf_xdp_adjust_head() support. The driver can then decide to error out during XDP_SETUP_PROG. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-07Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds5-9/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: a core dumping crash fix, a guess-unwinder regression fix, plus three build warning fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/unwind: Fix guess-unwinder regression x86/build: Annotate die() with noreturn to fix build warning on clang x86/platform/olpc: Fix resume handler build warning x86/apic/uv: Silence a shift wrapping warning x86/coredump: Always use user_regs_struct for compat_elf_gregset_t
2016-12-06x86/uaccess, sched/preempt: Verify access_ok() contextPeter Zijlstra1-2/+11
I recently encountered wreckage because access_ok() was used where it should not be, add an explicit WARN when access_ok() is used wrongly. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-06perf/x86: Fix full width counter, counter overflowPeter Zijlstra (Intel)2-2/+2
Lukasz reported that perf stat counters overflow handling is broken on KNL/SLM. Both these parts have full_width_write set, and that does indeed have a problem. In order to deal with counter wrap, we must sample the counter at at least half the counter period (see also the sampling theorem) such that we can unambiguously reconstruct the count. However commit: 069e0c3c4058 ("perf/x86/intel: Support full width counting") sets the sampling interval to the full period, not half. Fixing that exposes another issue, in that we must not sign extend the delta value when we shift it right; the counter cannot have decremented after all. With both these issues fixed, counter overflow functions correctly again. Reported-by: Lukasz Odzioba <lukasz.odzioba@intel.com> Tested-by: Liang, Kan <kan.liang@intel.com> Tested-by: Odzioba, Lukasz <lukasz.odzioba@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: stable@vger.kernel.org Fixes: 069e0c3c4058 ("perf/x86/intel: Support full width counting") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-06perf/x86/intel: Enable C-state residency events for Knights MillPiotr Luc1-0/+1
The Knights Mill is enough close to Knights Landing so the path reuses C-state residency support of the latter. Signed-off-by: Piotr Luc <piotr.luc@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20161201000853.18260-1-piotr.luc@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-02Merge branch 'locking/urgent' into locking/core, to pick up dependent fixesIngo Molnar16-71/+132
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-30sched/x86: Make CONFIG_SCHED_MC_PRIO=y easier to enableIngo Molnar1-11/+13
Right now CONFIG_SCHED_MC_PRIO has X86_INTEL_PSTATE as a dependency, which is not enabled by default and which hides the CONFIG_SCHED_MC_PRIO hardware-enabling feature. Select X86_INTEL_PSTATE instead, plus its dependency (CPU_FREQ), if the user enables CONFIG_SCHED_MC_PRIO=y. (Also align the CONFIG_SCHED_MC_PRIO Kconfig help text in standard style.) Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: bp@suse.de Cc: jolsa@redhat.com Cc: linux-acpi@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: rjw@rjwysocki.net Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-30sched/x86: Change CONFIG_SCHED_ITMT to CONFIG_SCHED_MC_PRIOTim Chen3-10/+20
Rename CONFIG_SCHED_ITMT for Intel Turbo Boost Max Technology 3.0 to CONFIG_SCHED_MC_PRIO. This makes the configuration extensible in future to other architectures that wish to similarly establish CPU core priorities support in the scheduler. The description in Kconfig is updated to reflect this change with added details for better clarity. The configuration is explicitly default-y, to enable the feature on CPUs that have this feature. It has no effect on non-TBM3 CPUs. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@suse.de Cc: jolsa@redhat.com Cc: linux-acpi@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/2b2ee29d93e3f162922d72d0165a1405864fbb23.1480444902.git.tim.c.chen@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-29timekeeping: Ignore the bogus sleep time if pm_trace is enabledChen Yu1-0/+9
Power management suspend/resume tracing (ab)uses the RTC to store suspend/resume information persistently. As a consequence the RTC value is clobbered when timekeeping is resumed and tries to inject the sleep time. Commit a4f8f6667f09 ("timekeeping: Cap array access in timekeeping_debug") plugged a out of bounds array access in the timekeeping debug code which was caused by the clobbered RTC value, but we still use the clobbered RTC value for sleep time injection into kernel timekeeping, which will result in random adjustments depending on the stored "hash" value. To prevent this keep track of the RTC clobbering and ignore the invalid RTC timestamp at resume. If the system resumed successfully clear the flag, which marks the RTC as unusable, warn the user about the RTC clobber and recommend to adjust the RTC with 'ntpdate' or 'rdate'. [jstultz: Fixed up pr_warn formating, and implemented suggestions from Ingo] [ tglx: Rewrote changelog ] Originally-from: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Prarit Bhargava <prarit@redhat.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Xunlei Pang <xlpang@redhat.com> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/1480372524-15181-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-28x86/sched: Use #include <linux/mutex.h> instead of #include <asm/mutex.h>Ingo Molnar1-1/+1
asm/mutex.h is gone from the locking tree, which makes sched/core break the build. Use linux/mutex.h instead, which is the canonical method. Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: peterz@infradead.org Cc: jolsa@redhat.com Cc: rjw@rjwysocki.net Cc: bp@suse.de Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-28x86/build: Remove three unneeded genhdr-y entriesPaul Bolle1-4/+0
In x86's include/asm/Kbuild three entries are appended to the genhdr-y make variable: genhdr-y += unistd_32.h genhdr-y += unistd_64.h genhdr-y += unistd_x32.h The same entries are also appended to that variable in include/uapi/asm/Kbuild. So commit: 10b63956fce7 ("UAPI: Plumb the UAPI Kbuilds into the user header installation and checking") ... removed these three entries from include/asm/Kbuild. But, apparently, some merge conflict resolution re-added them. The net effect is, in short, that the genhdr-y make variable contains these file names twice and, as a consequence, that the corresponding headers get installed twice. And so the build prints: INSTALL usr/include/asm/ (65 files) ... while in reality only 62 files are installed in that directory. Nothing breaks because of all that, but it's a good idea to finally remove these unneeded entries nevertheless. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1480077707-2837-1-git-send-email-pebolle@tiscali.nl Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-28x86/build: Don't use $(LINUXINCLUDE) twicePaul Bolle1-1/+1
The make variable KBUILD_CFLAGS contains $(LINUXINCLUDE). But the build already picks up $(LINUXINCLUDE) from scripts/Makefile.lib. The net effect is that the (long) list of include directories is used twice. This is harmless but pointless. So stop using $(LINUXINCLUDE) twice. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1480077514-2586-1-git-send-email-pebolle@tiscali.nl Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-28x86/unwind: Fix guess-unwinder regressionJosh Poimboeuf1-3/+6
My attempt at fixing some KASAN false positive warnings was rather brain dead, and it broke the guess unwinder. With frame pointers disabled, /proc/<pid>/stack is broken: # cat /proc/1/stack [<ffffffffffffffff>] 0xffffffffffffffff Restore the code flow to more closely resemble its previous state, while still using READ_ONCE_NOCHECK() macros to silence KASAN false positives. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: c2d75e03d630 ("x86/unwind: Prevent KASAN false positive warnings in guess unwinder") Link: http://lkml.kernel.org/r/b824f92c2c22eca5ec95ac56bd2a7c84cf0b9df9.1480309971.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>