summaryrefslogtreecommitdiffstats
path: root/init/Kconfig
AgeCommit message (Collapse)AuthorFilesLines
2018-10-26psi: cgroup supportJohannes Weiner1-0/+4
On a system that executes multiple cgrouped jobs and independent workloads, we don't just care about the health of the overall system, but also that of individual jobs, so that we can ensure individual job health, fairness between jobs, or prioritize some jobs over others. This patch implements pressure stall tracking for cgroups. In kernels with CONFIG_PSI=y, cgroup2 groups will have cpu.pressure, memory.pressure, and io.pressure files that track aggregate pressure stall times for only the tasks inside the cgroup. Link: http://lkml.kernel.org/r/20180828172258.3185-10-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Drake <drake@endlessm.com> Tested-by: Suren Baghdasaryan <surenb@google.com> Cc: Christopher Lameter <cl@linux.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <jweiner@fb.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Enderborg <peter.enderborg@sony.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26psi: pressure stall information for CPU, memory, and IOJohannes Weiner1-0/+15
When systems are overcommitted and resources become contended, it's hard to tell exactly the impact this has on workload productivity, or how close the system is to lockups and OOM kills. In particular, when machines work multiple jobs concurrently, the impact of overcommit in terms of latency and throughput on the individual job can be enormous. In order to maximize hardware utilization without sacrificing individual job health or risk complete machine lockups, this patch implements a way to quantify resource pressure in the system. A kernel built with CONFIG_PSI=y creates files in /proc/pressure/ that expose the percentage of time the system is stalled on CPU, memory, or IO, respectively. Stall states are aggregate versions of the per-task delay accounting delays: cpu: some tasks are runnable but not executing on a CPU memory: tasks are reclaiming, or waiting for swapin or thrashing cache io: tasks are waiting for io completions These percentages of walltime can be thought of as pressure percentages, and they give a general sense of system health and productivity loss incurred by resource overcommit. They can also indicate when the system is approaching lockup scenarios and OOMs. To do this, psi keeps track of the task states associated with each CPU and samples the time they spend in stall states. Every 2 seconds, the samples are averaged across CPUs - weighted by the CPUs' non-idle time to eliminate artifacts from unused CPUs - and translated into percentages of walltime. A running average of those percentages is maintained over 10s, 1m, and 5m periods (similar to the loadaverage). [hannes@cmpxchg.org: doc fixlet, per Randy] Link: http://lkml.kernel.org/r/20180828205625.GA14030@cmpxchg.org [hannes@cmpxchg.org: code optimization] Link: http://lkml.kernel.org/r/20180907175015.GA8479@cmpxchg.org [hannes@cmpxchg.org: rename psi_clock() to psi_update_work(), per Peter] Link: http://lkml.kernel.org/r/20180907145404.GB11088@cmpxchg.org [hannes@cmpxchg.org: fix build] Link: http://lkml.kernel.org/r/20180913014222.GA2370@cmpxchg.org Link: http://lkml.kernel.org/r/20180828172258.3185-9-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Drake <drake@endlessm.com> Tested-by: Suren Baghdasaryan <surenb@google.com> Cc: Christopher Lameter <cl@linux.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <jweiner@fb.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Enderborg <peter.enderborg@sony.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-02sched/pelt: Fix warning and clean up IRQ PELT configVincent Guittot1-0/+5
Create a config for enabling irq load tracking in the scheduler. irq load tracking is useful only when irq or paravirtual time is accounted but it's only possible with SMP for now. Also use __maybe_unused to remove the compilation warning in update_rq_clock_task() that has been introduced by: 2e62c4743adc ("sched/fair: Remove #ifdefs from scale_rt_capacity()") Suggested-by: Ingo Molnar <mingo@redhat.com> Reported-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Reported-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: dou_liyang@163.com Fixes: 2e62c4743adc ("sched/fair: Remove #ifdefs from scale_rt_capacity()") Link: http://lkml.kernel.org/r/1537867062-27285-1-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-08-25Merge tag 'kbuild-v4.19-2' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - add build_{menu,n,g,x}config targets for compile-testing Kconfig - fix and improve recursive dependency detection in Kconfig - fix parallel building of menuconfig/nconfig - fix syntax error in clang-version.sh - suppress distracting log from syncconfig - remove obsolete "rpm" target - remove VMLINUX_SYMBOL(_STR) macro entirely - fix microblaze build with CONFIG_DYNAMIC_FTRACE - move compiler test for dead code/data elimination to Kconfig - rename well-known LDFLAGS variable to KBUILD_LDFLAGS - misc fixes and cleanups * tag 'kbuild-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: rename LDFLAGS to KBUILD_LDFLAGS kbuild: pass LDFLAGS to recordmcount.pl kbuild: test dead code/data elimination support in Kconfig initramfs: move gen_initramfs_list.sh from scripts/ to usr/ vmlinux.lds.h: remove stale <linux/export.h> include export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR() Coccinelle: remove pci_alloc_consistent semantic to detect in zalloc-simple.cocci kbuild: make sorting initramfs contents independent of locale kbuild: remove "rpm" target, which is alias of "rpm-pkg" kbuild: Fix LOADLIBES rename in Documentation/kbuild/makefiles.txt kconfig: suppress "configuration written to .config" for syncconfig kconfig: fix "Can't open ..." in parallel build kbuild: Add a space after `!` to prevent parsing as file pattern scripts: modpost: check memory allocation results kconfig: improve the recursive dependency report kconfig: report recursive dependency involving 'imply' kconfig: error out when seeing recursive dependency kconfig: add build-only configurator targets scripts/dtc: consolidate include path options in Makefile
2018-08-24kbuild: test dead code/data elimination support in KconfigMasahiro Yamada1-0/+2
This config option should be enabled only when both the compiler and the linker support necessary flags. Add proper dependencies to Kconfig. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-22init/Kconfig: remove EXPERT from CHECKPOINT_RESTOREAdrian Reber1-12/+12
The CHECKPOINT_RESTORE configuration option was introduced in 2012 and combined with EXPERT. CHECKPOINT_RESTORE is already enabled in many distribution kernels and also part of the defconfigs of various architectures. To make it easier for distributions to enable CHECKPOINT_RESTORE this removes EXPERT and moves the configuration option out of the EXPERT block. Link: http://lkml.kernel.org/r/20180712130733.11510-1-adrian@lisas.de Signed-off-by: Adrian Reber <adrian@lisas.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Acked-by: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22init/Kconfig: fix its typosRandy Dunlap1-2/+2
Correct typos of "it's" to "its. Link: http://lkml.kernel.org/r/0ac627b6-5527-55f4-0489-1631aa34fc11@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17mm: introduce CONFIG_MEMCG_KMEM as combination of CONFIG_MEMCG && !CONFIG_SLOBKirill Tkhai1-0/+5
Introduce new config option, which is used to replace repeating CONFIG_MEMCG && !CONFIG_SLOB pattern. Next patches add a little more memcg+kmem related code, so let's keep the defines more clearly. Link: http://lkml.kernel.org/r/153063053670.1818.15013136946600481138.stgit@localhost.localdomain Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Tested-by: Shakeel Butt <shakeelb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Josef Bacik <jbacik@fb.com> Cc: Li RongQing <lirongqing@baidu.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Roman Gushchin <guro@fb.com> Cc: Sahitya Tummala <stummala@codeaurora.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-15Merge tag 'kconfig-v4.19-2' of ↵Linus Torvalds1-3/+11
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kconfig consolidation from Masahiro Yamada: "Consolidation of Kconfig files by Christoph Hellwig. Move the source statements of arch-independent Kconfig files instead of duplicating the includes in every arch/$(SRCARCH)/Kconfig" * tag 'kconfig-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: add a Memory Management options" menu kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt kconfig: use a menu in arch/Kconfig to reduce clutter kconfig: include kernel/Kconfig.preempt from init/Kconfig Kconfig: consolidate the "Kernel hacking" menu kconfig: include common Kconfig files from top-level Kconfig kconfig: remove duplicate SWAP symbol defintions um: create a proper drivers Kconfig um: cleanup Kconfig files um: stop abusing KBUILD_KCONFIG
2018-08-15Merge tag 'kconfig-v4.19' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kconfig updates from Masahiro Yamada: - show clearer error messages where pkg-config is needed, but not installed - rename SYMBOL_AUTO to SYMBOL_NO_WRITE to reflect its semantics - create all necessary directories by Kconfig tool itself instead of Makefile - update the .config unconditionally when syncconfig is invoked - use 'include' directive instead of '-include' where include/config/{auto,tristate}.conf is mandatory - do not try to update the .config when running install targets - add .DELETE_ON_ERROR to delete partially updated files - misc cleanups and fixes * tag 'kconfig-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: remove P_ENV property type kconfig: remove unused sym_get_env_prop() function kconfig: fix the rule of mainmenu_stmt symbol init/Kconfig: Use short unix-style option instead of --longname Kbuild: Makefile.modbuiltin: include auto.conf and tristate.conf mandatory kbuild: remove auto.conf from prerequisite of phony targets kbuild: do not update config for 'make kernelrelease' kbuild: do not update config when running install targets kbuild: add .DELETE_ON_ERROR special target kbuild: use 'include' directive to load auto.conf from top Makefile kconfig: allow all config targets to write auto.conf if missing kconfig: make syncconfig update .config regardless of sym_change_count kconfig: create directories needed for syncconfig by itself kconfig: remove unneeded directory generation from local*config kconfig: split out useful helpers in confdata.c kconfig: rename file_write_dep and move it to confdata.c kconfig: fix typos in description of "choice" in kconfig-language.txt kconfig: handle format string before calling conf_message_callback() kconfig: rename SYMBOL_AUTO to SYMBOL_NO_WRITE kconfig: check for pkg-config on make {menu,n,g,x}config
2018-08-15Merge tag 'kbuild-v4.19' of ↵Linus Torvalds1-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - verify depmod is installed before modules_install - support build salt in case build ids must be unique between builds - allow users to specify additional host compiler flags via HOST*FLAGS, and rename internal variables to KBUILD_HOST*FLAGS - update buildtar script to drop vax support, add arm64 support - update builddeb script for better debarch support - document the pit-fall of if_changed usage - fix parallel build of UML with O= option - make 'samples' target depend on headers_install to fix build errors - remove deprecated host-progs variable - add a new coccinelle script for refcount_t vs atomic_t check - improve double-test coccinelle script - misc cleanups and fixes * tag 'kbuild-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits) coccicheck: return proper error code on fail Coccinelle: doubletest: reduce side effect false positives kbuild: remove deprecated host-progs variable kbuild: make samples really depend on headers_install um: clean up archheaders recipe kbuild: add %asm-generic to no-dot-config-targets um: fix parallel building with O= option scripts: Add Python 3 support to tracing/draw_functrace.py builddeb: Add automatic support for sh{3,4}{,eb} architectures builddeb: Add automatic support for riscv* architectures builddeb: Add automatic support for m68k architecture builddeb: Add automatic support for or1k architecture builddeb: Add automatic support for sparc64 architecture builddeb: Add automatic support for mips{,64}r6{,el} architectures builddeb: Add automatic support for mips64el architecture builddeb: Add automatic support for ppc64 and powerpcspe architectures builddeb: Introduce functions to simplify kconfig tests in set_debarch builddeb: Drop check for 32-bit s390 builddeb: Change architecture detection fallback to use dpkg-architecture builddeb: Skip architecture detection when KBUILD_DEBARCH is set ...
2018-08-13Merge branch 'for-linus' of ↵Linus Torvalds1-1/+14
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: "Since Martin is on vacation you get the s390 pull request from me: - Host large page support for KVM guests. As the patches have large impact on arch/s390/mm/ this series goes out via both the KVM and the s390 tree. - Add an option for no compression to the "Kernel compression mode" menu, this will come in handy with the rework of the early boot code. - A large rework of the early boot code that will make life easier for KASAN and KASLR. With the rework the bootable uncompressed image is not generated anymore, only the bzImage is available. For debuggung purposes the new "no compression" option is used. - Re-enable the gcc plugins as the issue with the latent entropy plugin is solved with the early boot code rework. - More spectre relates changes: + Detect the etoken facility and remove expolines automatically. + Add expolines to a few more indirect branches. - A rewrite of the common I/O layer trace points to make them consumable by 'perf stat'. - Add support for format-3 PCI function measurement blocks. - Changes for the zcrypt driver: + Add attributes to indicate the load of cards and queues. + Restructure some code for the upcoming AP device support in KVM. - Build flags improvements in various Makefiles. - A few fixes for the kdump support. - A couple of patches for gcc 8 compile warning cleanup. - Cleanup s390 specific proc handlers. - Add s390 support to the restartable sequence self tests. - Some PTR_RET vs PTR_ERR_OR_ZERO cleanup. - Lots of bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (107 commits) s390/dasd: fix hanging offline processing due to canceled worker s390/dasd: fix panic for failed online processing s390/mm: fix addressing exception after suspend/resume rseq/selftests: add s390 support s390: fix br_r1_trampoline for machines without exrl s390/lib: use expoline for all bcr instructions s390/numa: move initial setup of node_to_cpumask_map s390/kdump: Fix elfcorehdr size calculation s390/cpum_sf: save TOD clock base in SDBs for time conversion KVM: s390: Add huge page enablement control s390/mm: Add huge page gmap linking support s390/mm: hugetlb pages within a gmap can not be freed KVM: s390: Add skey emulation fault handling s390/mm: Add huge pmd storage key handling s390/mm: Clear skeys for newly mapped huge guest pmds s390/mm: Clear huge page storage keys on enable_skey s390/mm: Add huge page dirty sync support s390/mm: Add gmap pmd invalidation and clearing s390/mm: Add gmap pmd notification bit setting s390/mm: Add gmap pmd linking ...
2018-08-09init/Kconfig: Use short unix-style option instead of --longnameRob Landley1-2/+2
Avoids warning messages with the latest release of toybox, which never bothered to implement the --longopts nothing was using. Signed-off-by: Rob Landley <rob@landley.net> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-02kconfig: include kernel/Kconfig.preempt from init/KconfigChristoph Hellwig1-0/+1
Almost all architectures include it. Add a ARCH_NO_PREEMPT symbol to disable preempt support for alpha, hexagon, non-coldfire m68k and user mode Linux. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-02kconfig: include common Kconfig files from top-level KconfigChristoph Hellwig1-2/+2
Instead of duplicating the source statements in every architecture just do it once in the toplevel Kconfig file. Note that with this the inclusion of arch/$(SRCARCH/Kconfig moves out of the top-level Kconfig into arch/Kconfig so that don't violate ordering constraits while keeping a sensible menu structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-02kconfig: remove duplicate SWAP symbol defintionsChristoph Hellwig1-1/+8
microblaze and nios2 define their own always n SWAP symbols. Remove those and let the generic defintion do the right thing by adding a new symbol to disable swap entirely. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-07-18kbuild: Add build salt to the kernel and modulesLaura Abbott1-0/+9
In Fedora, the debug information is packaged separately (foo-debuginfo) and can be installed separately. There's been a long standing issue where only one version of a debuginfo info package can be installed at a time. There's been an effort for Fedora for parallel debuginfo to rectify this problem. Part of the requirement to allow parallel debuginfo to work is that build ids are unique between builds. The existing upstream rpm implementation ensures this by re-calculating the build-id using the version and release as a seed. This doesn't work 100% for the kernel because of the vDSO which is its own binary and doesn't get updated when embedded. Fix this by adding some data in an ELF note for both the kernel and modules. The data is controlled via a Kconfig option so distributions can set it to an appropriate value to ensure uniqueness between builds. Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-06-30Merge tag 'kbuild-fixes-v4.18' of ↵Linus Torvalds1-4/+3
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - introduce __diag_* macros and suppress -Wattribute-alias warnings from GCC 8 - fix stack protector test script for x86_64 - fix line number handling in Kconfig - document that '#' starts a comment in Kconfig - handle P_SYMBOL property in dump debugging of Kconfig - correct help message of LD_DEAD_CODE_DATA_ELIMINATION - fix occasional segmentation faults in Kconfig * tag 'kbuild-fixes-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: loop boundary condition fix kbuild: reword help of LD_DEAD_CODE_DATA_ELIMINATION kconfig: handle P_SYMBOL in print_symbol() kconfig: document Kconfig source file comments kconfig: fix line numbers for if-entries in menu tree stack-protector: Fix test with 32-bit userland and CONFIG_64BIT=y powerpc: Remove -Wattribute-alias pragmas disable -Wattribute-alias warning for SYSCALL_DEFINEx() kbuild: add macro for controlling warnings to linux/compiler.h
2018-06-28kbuild: reword help of LD_DEAD_CODE_DATA_ELIMINATIONMasahiro Yamada1-4/+3
Since commit 5d20ee3192a5 ("kbuild: Allow LD_DEAD_CODE_DATA_ELIMINATION to be selectable if enabled"), HAVE_LD_DEAD_CODE_DATA_ELIMINATION is supposed to be selected by architectures that are capable of this functionality. LD_DEAD_CODE_DATA_ELIMINATION is now users' selection. Update the help message. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-06-25init/Kconfig: add an option for uncompressed kernelVasily Gorbik1-1/+14
Add "None" as the kernel compression mode. This option is useful for debugging the kernel in slow simulation environments, where decompressing and moving the kernel is awfully slow. Uncompressed kernel implementation might allow early boot code to skip the decompressor and jump right at uncompressed kernel image entry point. Platforms implementing that should define HAVE_KERNEL_UNCOMPRESSED. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-06-14dma-mapping: move all DMA mapping code to kernel/dmaChristoph Hellwig1-4/+0
Currently the code is split over various files with dma- prefixes in the lib/ and drives/base directories, and the number of files keeps growing. Move them into a single directory to keep the code together and remove the file name prefixes. To match the irq infrastructure this directory is placed under the kernel/ directory. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-13Merge tag 'kbuild-v4.18-2' of ↵Linus Torvalds1-0/+15
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - fix some bugs introduced by the recent Kconfig syntax extension - add some symbols about compiler information in Kconfig, such as CC_IS_GCC, CC_IS_CLANG, GCC_VERSION, etc. - test compiler capability for the stack protector in Kconfig, and clean-up Makefile - test compiler capability for GCC-plugins in Kconfig, and clean-up Makefile - allow to enable GCC-plugins for COMPILE_TEST - test compiler capability for KCOV in Kconfig and correct dependency - remove auto-detect mode of the GCOV format, which is now more nicely handled in Kconfig - test compiler capability for mprofile-kernel on PowerPC, and clean-up Makefile - misc cleanups * tag 'kbuild-v4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: linux/linkage.h: replace VMLINUX_SYMBOL_STR() with __stringify() kconfig: fix localmodconfig sh: remove no-op macro VMLINUX_SYMBOL() powerpc/kbuild: move -mprofile-kernel check to Kconfig Documentation: kconfig: add recommended way to describe compiler support gcc-plugins: disable GCC_PLUGIN_STRUCTLEAK_BYREF_ALL for COMPILE_TEST gcc-plugins: allow to enable GCC_PLUGINS for COMPILE_TEST gcc-plugins: test plugin support in Kconfig and clean up Makefile gcc-plugins: move GCC version check for PowerPC to Kconfig kcov: test compiler capability in Kconfig and correct dependency gcov: remove CONFIG_GCOV_FORMAT_AUTODETECT arm64: move GCC version check for ARCH_SUPPORTS_INT128 to Kconfig kconfig: add CC_IS_CLANG and CLANG_VERSION kconfig: add CC_IS_GCC and GCC_VERSION stack-protector: test compiler capability in Kconfig and drop AUTO mode kbuild: fix endless syncconfig in case arch Makefile sets CROSS_COMPILE
2018-06-10Merge branch 'core-rseq-for-linus' of ↵Linus Torvalds1-0/+23
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull restartable sequence support from Thomas Gleixner: "The restartable sequences syscall (finally): After a lot of back and forth discussion and massive delays caused by the speculative distraction of maintainers, the core set of restartable sequences has finally reached a consensus. It comes with the basic non disputed core implementation along with support for arm, powerpc and x86 and a full set of selftests It was exposed to linux-next earlier this week, so it does not fully comply with the merge window requirements, but there is really no point to drag it out for yet another cycle" * 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rseq/selftests: Provide Makefile, scripts, gitignore rseq/selftests: Provide parametrized tests rseq/selftests: Provide basic percpu ops test rseq/selftests: Provide basic test rseq/selftests: Provide rseq library selftests/lib.mk: Introduce OVERRIDE_TARGETS powerpc: Wire up restartable sequences system call powerpc: Add syscall detection for restartable sequences powerpc: Add support for restartable sequences x86: Wire up restartable sequence system call x86: Add support for restartable sequences arm: Wire up restartable sequences system call arm: Add syscall detection for restartable sequences arm: Add restartable sequences support rseq: Introduce restartable sequences system call uapi/headers: Provide types_32_64.h
2018-06-08kconfig: add CC_IS_CLANG and CLANG_VERSIONMasahiro Yamada1-0/+7
This will be useful to describe the clang version dependency. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-06-08kconfig: add CC_IS_GCC and GCC_VERSIONMasahiro Yamada1-0/+8
This will be useful to specify the required compiler version, like this: config FOO bool "Use Foo" depends on GCC_VERSION >= 40800 help This feature requires GCC 4.8 or newer. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-06-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-0/+1
Pull networking updates from David Miller: 1) Add Maglev hashing scheduler to IPVS, from Inju Song. 2) Lots of new TC subsystem tests from Roman Mashak. 3) Add TCP zero copy receive and fix delayed acks and autotuning with SO_RCVLOWAT, from Eric Dumazet. 4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard Brouer. 5) Add ttl inherit support to vxlan, from Hangbin Liu. 6) Properly separate ipv6 routes into their logically independant components. fib6_info for the routing table, and fib6_nh for sets of nexthops, which thus can be shared. From David Ahern. 7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP messages from XDP programs. From Nikita V. Shirokov. 8) Lots of long overdue cleanups to the r8169 driver, from Heiner Kallweit. 9) Add BTF ("BPF Type Format"), from Martin KaFai Lau. 10) Add traffic condition monitoring to iwlwifi, from Luca Coelho. 11) Plumb extack down into fib_rules, from Roopa Prabhu. 12) Add Flower classifier offload support to igb, from Vinicius Costa Gomes. 13) Add UDP GSO support, from Willem de Bruijn. 14) Add documentation for eBPF helpers, from Quentin Monnet. 15) Add TLS tx offload to mlx5, from Ilya Lesokhin. 16) Allow applications to be given the number of bytes available to read on a socket via a control message returned from recvmsg(), from Soheil Hassas Yeganeh. 17) Add x86_32 eBPF JIT compiler, from Wang YanQing. 18) Add AF_XDP sockets, with zerocopy support infrastructure as well. From Björn Töpel. 19) Remove indirect load support from all of the BPF JITs and handle these operations in the verifier by translating them into native BPF instead. From Daniel Borkmann. 20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha. 21) Allow XDP programs to do lookups in the main kernel routing tables for forwarding. From David Ahern. 22) Allow drivers to store hardware state into an ELF section of kernel dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy. 23) Various RACK and loss detection improvements in TCP, from Yuchung Cheng. 24) Add TCP SACK compression, from Eric Dumazet. 25) Add User Mode Helper support and basic bpfilter infrastructure, from Alexei Starovoitov. 26) Support ports and protocol values in RTM_GETROUTE, from Roopa Prabhu. 27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard Brouer. 28) Add lots of forwarding selftests, from Petr Machata. 29) Add generic network device failover driver, from Sridhar Samudrala. * ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits) strparser: Add __strp_unpause and use it in ktls. rxrpc: Fix terminal retransmission connection ID to include the channel net: hns3: Optimize PF CMDQ interrupt switching process net: hns3: Fix for VF mailbox receiving unknown message net: hns3: Fix for VF mailbox cannot receiving PF response bnx2x: use the right constant Revert "net: sched: cls: Fix offloading when ingress dev is vxlan" net: dsa: b53: Fix for brcm tag issue in Cygnus SoC enic: fix UDP rss bits netdev-FAQ: clarify DaveM's position for stable backports rtnetlink: validate attributes in do_setlink() mlxsw: Add extack messages for port_{un, }split failures netdevsim: Add extack error message for devlink reload devlink: Add extack to reload and port_{un, }split operations net: metrics: add proper netlink validation ipmr: fix error path when ipmr_new_table fails ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds net: hns3: remove unused hclgevf_cfg_func_mta_filter netfilter: provide udp*_lib_lookup for nf_tproxy qed*: Utilize FW 8.37.2.0 ...
2018-06-06Merge tag 'kconfig-v4.18' of ↵Linus Torvalds1-21/+4
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kconfig updates from Masahiro Yamada: "Kconfig now supports new functionality to perform textual substitution. It has been a while since Linus suggested to move compiler option tests from makefiles to Kconfig. Finally, here it is. The implementation has been generalized into a Make-like macro language. Some built-in functions such as 'shell' are provided. Variables and user-defined functions are also supported so that 'cc-option', 'ld-option', etc. are implemented as macros. Summary: - refactor package checks for building {m,n,q,g}conf - remove unused/unmaintained localization support - remove Kbuild cache - drop CONFIG_CROSS_COMPILE support - replace 'option env=' with direct variable expansion - add built-in functions such as 'shell' - support variables and user-defined functions - add helper macros as as 'cc-option' - add unit tests and a document of the new macro language - add 'testconfig' to help - fix warnings from GCC 8.1" * tag 'kconfig-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (30 commits) kconfig: Avoid format overflow warning from GCC 8.1 kbuild: Move last word of nconfig help to the previous line kconfig: Add testconfig into make help output kconfig: add basic helper macros to scripts/Kconfig.include kconfig: show compiler version text in the top comment kconfig: test: add Kconfig macro language tests Documentation: kconfig: document a new Kconfig macro language kconfig: error out if a recursive variable references itself kconfig: add 'filename' and 'lineno' built-in variables kconfig: add 'info', 'warning-if', and 'error-if' built-in functions kconfig: expand lefthand side of assignment statement kconfig: support append assignment operator kconfig: support simply expanded variable kconfig: support user-defined function and recursively expanded variable kconfig: begin PARAM state only when seeing a command keyword kconfig: replace $(UNAME_RELEASE) with function call kconfig: add 'shell' built-in function kconfig: add built-in function support kconfig: make default prompt of mainmenu less specific kconfig: remove sym_expand_string_value() ...
2018-06-06Merge tag 'kbuild-v4.18' of ↵Linus Torvalds1-0/+27
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - improve fixdep to coalesce consecutive slashes in dep-files - fix some issues of the maintainer string generation in deb-pkg script - remove unused CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX and clean-up several tools and linker scripts - clean-up modpost - allow to enable the dead code/data elimination for PowerPC in EXPERT mode - improve two coccinelle scripts for better performance - pass endianness and machine size flags to sparse for all architecture - misc fixes * tag 'kbuild-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits) kbuild: add machine size to CHECKFLAGS kbuild: add endianness flag to CHEKCFLAGS kbuild: $(CHECK) doesnt need NOSTDINC_FLAGS twice scripts: Fixed printf format mismatch scripts/tags.sh: use `find` for $ALLSOURCE_ARCHS generation coccinelle: deref_null: improve performance coccinelle: mini_lock: improve performance powerpc: Allow LD_DEAD_CODE_DATA_ELIMINATION to be selected kbuild: Allow LD_DEAD_CODE_DATA_ELIMINATION to be selectable if enabled kbuild: LD_DEAD_CODE_DATA_ELIMINATION no -ffunction-sections/-fdata-sections for module build kbuild: Fix asm-generic/vmlinux.lds.h for LD_DEAD_CODE_DATA_ELIMINATION modpost: constify *modname function argument where possible modpost: remove redundant is_vmlinux() test modpost: use strstarts() helper more widely modpost: pass struct elf_info pointer to get_modinfo() checkpatch: remove VMLINUX_SYMBOL() check vmlinux.lds.h: remove no-op macro VMLINUX_SYMBOL() kbuild: remove CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX export.h: remove code for prefixing symbols with underscore depmod.sh: remove symbol prefix support ...
2018-06-06rseq: Introduce restartable sequences system callMathieu Desnoyers1-0/+23
Expose a new system call allowing each thread to register one userspace memory area to be used as an ABI between kernel and user-space for two purposes: user-space restartable sequences and quick access to read the current CPU number value from user-space. * Restartable sequences (per-cpu atomics) Restartables sequences allow user-space to perform update operations on per-cpu data without requiring heavy-weight atomic operations. The restartable critical sections (percpu atomics) work has been started by Paul Turner and Andrew Hunter. It lets the kernel handle restart of critical sections. [1] [2] The re-implementation proposed here brings a few simplifications to the ABI which facilitates porting to other architectures and speeds up the user-space fast path. Here are benchmarks of various rseq use-cases. Test hardware: arm32: ARMv7 Processor rev 4 (v7l) "Cubietruck", 2-core x86-64: Intel E5-2630 v3@2.40GHz, 16-core, hyperthreading The following benchmarks were all performed on a single thread. * Per-CPU statistic counter increment getcpu+atomic (ns/op) rseq (ns/op) speedup arm32: 344.0 31.4 11.0 x86-64: 15.3 2.0 7.7 * LTTng-UST: write event 32-bit header, 32-bit payload into tracer per-cpu buffer getcpu+atomic (ns/op) rseq (ns/op) speedup arm32: 2502.0 2250.0 1.1 x86-64: 117.4 98.0 1.2 * liburcu percpu: lock-unlock pair, dereference, read/compare word getcpu+atomic (ns/op) rseq (ns/op) speedup arm32: 751.0 128.5 5.8 x86-64: 53.4 28.6 1.9 * jemalloc memory allocator adapted to use rseq Using rseq with per-cpu memory pools in jemalloc at Facebook (based on rseq 2016 implementation): The production workload response-time has 1-2% gain avg. latency, and the P99 overall latency drops by 2-3%. * Reading the current CPU number Speeding up reading the current CPU number on which the caller thread is running is done by keeping the current CPU number up do date within the cpu_id field of the memory area registered by the thread. This is done by making scheduler preemption set the TIF_NOTIFY_RESUME flag on the current thread. Upon return to user-space, a notify-resume handler updates the current CPU value within the registered user-space memory area. User-space can then read the current CPU number directly from memory. Keeping the current cpu id in a memory area shared between kernel and user-space is an improvement over current mechanisms available to read the current CPU number, which has the following benefits over alternative approaches: - 35x speedup on ARM vs system call through glibc - 20x speedup on x86 compared to calling glibc, which calls vdso executing a "lsl" instruction, - 14x speedup on x86 compared to inlined "lsl" instruction, - Unlike vdso approaches, this cpu_id value can be read from an inline assembly, which makes it a useful building block for restartable sequences. - The approach of reading the cpu id through memory mapping shared between kernel and user-space is portable (e.g. ARM), which is not the case for the lsl-based x86 vdso. On x86, yet another possible approach would be to use the gs segment selector to point to user-space per-cpu data. This approach performs similarly to the cpu id cache, but it has two disadvantages: it is not portable, and it is incompatible with existing applications already using the gs segment selector for other purposes. Benchmarking various approaches for reading the current CPU number: ARMv7 Processor rev 4 (v7l) Machine model: Cubietruck - Baseline (empty loop): 8.4 ns - Read CPU from rseq cpu_id: 16.7 ns - Read CPU from rseq cpu_id (lazy register): 19.8 ns - glibc 2.19-0ubuntu6.6 getcpu: 301.8 ns - getcpu system call: 234.9 ns x86-64 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz: - Baseline (empty loop): 0.8 ns - Read CPU from rseq cpu_id: 0.8 ns - Read CPU from rseq cpu_id (lazy register): 0.8 ns - Read using gs segment selector: 0.8 ns - "lsl" inline assembly: 13.0 ns - glibc 2.19-0ubuntu6 getcpu: 16.6 ns - getcpu system call: 53.9 ns - Speed (benchmark taken on v8 of patchset) Running 10 runs of hackbench -l 100000 seems to indicate, contrary to expectations, that enabling CONFIG_RSEQ slightly accelerates the scheduler: Configuration: 2 sockets * 8-core Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz (directly on hardware, hyperthreading disabled in BIOS, energy saving disabled in BIOS, turboboost disabled in BIOS, cpuidle.off=1 kernel parameter), with a Linux v4.6 defconfig+localyesconfig, restartable sequences series applied. * CONFIG_RSEQ=n avg.: 41.37 s std.dev.: 0.36 s * CONFIG_RSEQ=y avg.: 40.46 s std.dev.: 0.33 s - Size On x86-64, between CONFIG_RSEQ=n/y, the text size increase of vmlinux is 567 bytes, and the data size increase of vmlinux is 5696 bytes. [1] https://lwn.net/Articles/650333/ [2] http://www.linuxplumbersconf.org/2013/ocw/system/presentations/1695/original/LPC%20-%20PerCpu%20Atomics.pdf Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Joel Fernandes <joelaf@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Watson <davejwatson@fb.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Chris Lameter <cl@linux.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Andrew Hunter <ahh@google.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Maurer <bmaurer@fb.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-api@vger.kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com Link: http://lkml.kernel.org/r/20150624222609.6116.86035.stgit@kitami.mtv.corp.google.com Link: https://lkml.kernel.org/r/20180602124408.8430-3-mathieu.desnoyers@efficios.com
2018-05-29kconfig: replace $(UNAME_RELEASE) with function callMasahiro Yamada1-2/+2
Now that 'shell' function is supported, this can be self-contained in Kconfig. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-05-29kconfig: reference environment variables directly and remove 'option env='Masahiro Yamada1-12/+4
To get access to environment variables, Kconfig needs to define a symbol using "option env=" syntax. It is tedious to add a symbol entry for each environment variable given that we need to define much more such as 'CC', 'AS', 'srctree' etc. to evaluate the compiler capability in Kconfig. Adding '$' for symbol references is grammatically inconsistent. Looking at the code, the symbols prefixed with 'S' are expanded by: - conf_expand_value() This is used to expand 'arch/$ARCH/defconfig' and 'defconfig_list' - sym_expand_string_value() This is used to expand strings in 'source' and 'mainmenu' All of them are fixed values independent of user configuration. So, they can be changed into the direct expansion instead of symbols. This change makes the code much cleaner. The bounce symbols 'SRCARCH', 'ARCH', 'SUBARCH', 'KERNELVERSION' are gone. sym_init() hard-coding 'UNAME_RELEASE' is also gone. 'UNAME_RELEASE' should be replaced with an environment variable. ARCH_DEFCONFIG is a normal symbol, so it should be simply referenced without '$' prefix. The new syntax is addicted by Make. The variable reference needs parentheses, like $(FOO), but you can omit them for single-letter variables, like $F. Yet, in Makefiles, people tend to use the parenthetical form for consistency / clarification. At this moment, only the environment variable is supported, but I will extend the concept of 'variable' later on. The variables are expanded in the lexer so we can simplify the token handling on the parser side. For example, the following code works. [Example code] config MY_TOOLCHAIN_LIST string default "My tools: CC=$(CC), AS=$(AS), CPP=$(CPP)" [Result] $ make -s alldefconfig && tail -n 1 .config CONFIG_MY_TOOLCHAIN_LIST="My tools: CC=gcc, AS=as, CPP=gcc -E" Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-05-29kbuild: remove CONFIG_CROSS_COMPILE supportMasahiro Yamada1-9/+0
Kbuild provides a couple of ways to specify CROSS_COMPILE: [1] Command line [2] Environment [3] arch/*/Makefile (only some architectures) [4] CONFIG_CROSS_COMPILE [4] is problematic for the compiler capability tests in Kconfig. CONFIG_CROSS_COMPILE allows users to change the compiler prefix from 'make menuconfig', etc. It means, the compiler options would have to be all re-calculated everytime CONFIG_CROSS_COMPILE is changed. To avoid complexity and performance issues, I'd like to evaluate the shell commands statically, i.e. only parsing Kconfig files. I guess the majority is [1] or [2]. Currently, there are only 5 defconfig files that specify CONFIG_CROSS_COMPILE. arch/arm/configs/lpc18xx_defconfig arch/hexagon/configs/comet_defconfig arch/nds32/configs/defconfig arch/openrisc/configs/or1ksim_defconfig arch/openrisc/configs/simple_smp_defconfig Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org>
2018-05-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
S390 bpf_jit.S is removed in net-next and had changes in 'net', since that code isn't used any more take the removal. TLS data structures split the TX and RX components in 'net-next', put the new struct members from the bug fix in 'net' into the RX part. The 'net-next' tree had some reworking of how the ERSPAN code works in the GRE tunneling code, overlapping with a one-line headroom calculation fix in 'net'. Overlapping changes in __sock_map_ctx_update_elem(), keep the bits that read the prog members via READ_ONCE() into local variables before using them. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18sched/fair: Fix documentation file pathSebastian Andrzej Siewior1-1/+1
The 'tip' prefix probably referred to the -tip tree and is not required, remove it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180515165328.24899-1-bigeasy@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-17kbuild: Allow LD_DEAD_CODE_DATA_ELIMINATION to be selectable if enabledNicholas Piggin1-0/+27
Architectures that are capable can select HAVE_LD_DEAD_CODE_DATA_ELIMINATION to enable selection of that option (as an EXPERT kernel option). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-14bpf: enable stackmap with build_id in nmi contextSong Liu1-0/+1
Currently, we cannot parse build_id in nmi context because of up_read(&current->mm->mmap_sem), this makes stackmap with build_id less useful. This patch enables parsing build_id in nmi by putting the up_read() call in irq_work. To avoid memory allocation in nmi context, we use per cpu variable for the irq_work. As a result, only one irq_work per cpu is allowed. If the irq_work is in-use, we fallback to only report ips. Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-15Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of fixes and updates for x86: - Address a swiotlb regression which was caused by the recent DMA rework and made driver fail because dma_direct_supported() returned false - Fix a signedness bug in the APIC ID validation which caused invalid APIC IDs to be detected as valid thereby bloating the CPU possible space. - Fix inconsisten config dependcy/select magic for the MFD_CS5535 driver. - Fix a corruption of the physical address space bits when encryption has reduced the address space and late cpuinfo updates overwrite the reduced bit information with the original value. - Dominiks syscall rework which consolidates the architecture specific syscall functions so all syscalls can be wrapped with the same macros. This allows to switch x86/64 to struct pt_regs based syscalls. Extend the clearing of user space controlled registers in the entry patch to the lower registers" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Fix signedness bug in APIC ID validity checks x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption x86/olpc: Fix inconsistent MFD_CS5535 configuration swiotlb: Use dma_direct_supported() for swiotlb_ops syscalls/x86: Adapt syscall_wrapper.h to the new syscall stub naming convention syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*() syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention syscalls/core, syscalls/x86: Clean up syscall stub naming convention syscalls/x86: Extend register clearing on syscall entry to lower registers syscalls/x86: Unconditionally enable 'struct pt_regs' based syscalls on x86_64 syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32 syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscalls syscalls/x86: Use 'struct pt_regs' based syscall calling convention for 64-bit syscalls syscalls/core: Introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y x86/syscalls: Don't pointlessly reload the system call number x86/mm: Fix documentation of module mapping range with 4-level paging x86/cpuid: Switch to 'static const' specifier
2018-04-05Merge tag 'gpio-v4.17-1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO updates from Linus Walleij: "This is the bulk of GPIO changes for the v4.17 kernel cycle: New drivers: - Nintendo Wii GameCube GPIO, known as "Hollywood" - Raspberry Pi mailbox service GPIO expander - Spreadtrum main SC9860 SoC and IEC GPIO controllers. Improvements: - Implemented .get_multiple() callback for most of the high-performance industrial GPIO cards for the ISA bus. - ISA GPIO drivers now select the ISA_BUS_API instead of depending on it. This is merged with the same pattern for all the ISA drivers and some other Kconfig cleanups related to this. Cleanup: - Delete the TZ1090 GPIO drivers following the deletion of this SoC from the ARM tree. - Move the documentation over to driver-api to conform with the rest of the kernel documentation build. - Continue to make the GPIO drivers include only <linux/gpio/driver.h> and not the too broad <linux/gpio.h> that we want to get rid of. - Managed to remove VLA allocation from two drivers pending more fixes in this area for the next merge window. - Misc janitorial fixes" * tag 'gpio-v4.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (77 commits) gpio: Add Spreadtrum PMIC EIC driver support gpio: Add Spreadtrum EIC driver support dt-bindings: gpio: Add Spreadtrum EIC controller documentation gpio: ath79: Fix potential NULL dereference in ath79_gpio_probe() pinctrl: qcom: Don't allow protected pins to be requested gpiolib: Support 'gpio-reserved-ranges' property gpiolib: Change bitmap allocation to kmalloc_array gpiolib: Extract mask allocation into subroutine dt-bindings: gpio: Add a gpio-reserved-ranges property gpio: mockup: fix a potential crash when creating debugfs entries gpio: pca953x: add compatibility for pcal6524 and pcal9555a gpio: dwapb: Add support for a bus clock gpio: Remove VLA from xra1403 driver gpio: Remove VLA from MAX3191X driver gpio: ws16c48: Implement get_multiple callback gpio: gpio-mm: Implement get_multiple callback gpio: 104-idi-48: Implement get_multiple callback gpio: 104-dio-48e: Implement get_multiple callback gpio: pcie-idio-24: Implement get_multiple/set_multiple callbacks gpio: pci-idio-16: Implement get_multiple callback ...
2018-04-05syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscallsDominik Brodowski1-3/+6
It may be useful for an architecture to override the definitions of the COMPAT_SYSCALL_DEFINE0() and __COMPAT_SYSCALL_DEFINEx() macros in <linux/compat.h>, in particular to use a different calling convention for syscalls. This patch provides a mechanism to do so, based on the previously introduced CONFIG_ARCH_HAS_SYSCALL_WRAPPER. If it is enabled, <asm/sycall_wrapper.h> is included in <linux/compat.h> and may be used to define the macros mentioned above. Moreover, as the syscall calling convention may be different if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is set, the compat syscall function prototypes in <linux/compat.h> are #ifndef'd out in that case. As some of the syscalls and/or compat syscalls may not be present, the COND_SYSCALL() and COND_SYSCALL_COMPAT() macros in kernel/sys_ni.c as well as the SYS_NI() and COMPAT_SYS_NI() macros in kernel/time/posix-stubs.c can be re-defined in <asm/syscall_wrapper.h> iff CONFIG_ARCH_HAS_SYSCALL_WRAPPER is enabled. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180405095307.3730-5-linux@dominikbrodowski.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-04-05syscalls/core: Introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER=yDominik Brodowski1-0/+7
It may be useful for an architecture to override the definitions of the SYSCALL_DEFINE0() and __SYSCALL_DEFINEx() macros in <linux/syscalls.h>, in particular to use a different calling convention for syscalls. This patch provides a mechanism to do so: It introduces CONFIG_ARCH_HAS_SYSCALL_WRAPPER. If it is enabled, <asm/sycall_wrapper.h> is included in <linux/syscalls.h> and may be used to define the macros mentioned above. Moreover, as the syscall calling convention may be different if CONFIG_ARCH_HAS_SYSCALL_WRAPPER is set, the syscall function prototypes in <linux/syscalls.h> are #ifndef'd out in that case. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180405095307.3730-3-linux@dominikbrodowski.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-26treewide: simplify Kconfig dependencies for removed archsArnd Bergmann1-3/+2
A lot of Kconfig symbols have architecture specific dependencies. In those cases that depend on architectures we have already removed, they can be omitted. Acked-by: Kalle Valo <kvalo@codeaurora.org> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-09mn10300: Remove the architectureDavid Howells1-1/+1
Remove the MN10300 arch as the hardware is defunct. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Howells <dhowells@redhat.com> cc: Masahiro Yamada <yamada.masahiro@socionext.com> cc: linux-am33-list@redhat.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-02-22pc104: Add EXPERT dependency for PC104 Kconfig optionWilliam Breathitt Gray1-1/+1
PC/104 device driver Kconfig options previously had an implicit EXPERT dependency by way of an explicit ISA_BUS_API dependency. Now that these driver Kconfig options select ISA_BUS_API rather than depend on it, the PC104 Kconfig option should have an explicit EXPERT dependency. The PC/104 form factor and bus architecture are common in embedded and specialized systems, but uncommon in typical desktop setups. For this reason, it is best to mask these devices and configurations via the EXPERT Kconfig option because the majority of users will never need to concern themselves with PC/104. Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-02-05membarrier: Provide core serializing command, *_SYNC_COREMathieu Desnoyers1-0/+3
Provide core serializing membarrier command to support memory reclaim by JIT. Each architecture needs to explicitly opt into that support by documenting in their architecture code how they provide the core serializing instructions required when returning from the membarrier IPI, and after the scheduler has updated the curr->mm pointer (before going back to user-space). They should then select ARCH_HAS_MEMBARRIER_SYNC_CORE to enable support for that command on their architecture. Architectures selecting this feature need to either document that they issue core serializing instructions when returning to user-space, or implement their architecture-specific sync_core_before_usermode(). Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-9-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05locking: Introduce sync_core_before_usermode()Mathieu Desnoyers1-0/+3
Introduce an architecture function that ensures the current CPU issues a core serializing instruction before returning to usermode. This is needed for the membarrier "sync_core" command. Architectures defining the sync_core_before_usermode() static inline need to select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-7-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05powerpc, membarrier: Skip memory barrier in switch_mm()Mathieu Desnoyers1-0/+3
Allow PowerPC to skip the full memory barrier in switch_mm(), and only issue the barrier when scheduling into a task belonging to a process that has registered to use expedited private. Threads targeting the same VM but which belong to different thread groups is a tricky case. It has a few consequences: It turns out that we cannot rely on get_nr_threads(p) to count the number of threads using a VM. We can use (atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1) instead to skip the synchronize_sched() for cases where the VM only has a single user, and that user only has a single thread. It also turns out that we cannot use for_each_thread() to set thread flags in all threads using a VM, as it only iterates on the thread group. Therefore, test the membarrier state variable directly rather than relying on thread flags. This means membarrier_register_private_expedited() needs to set the MEMBARRIER_STATE_PRIVATE_EXPEDITED flag, issue synchronize_sched(), and only then set MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY which allows private expedited membarrier commands to succeed. membarrier_arch_switch_mm() now tests for the MEMBARRIER_STATE_PRIVATE_EXPEDITED flag. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20180129202020.8515-3-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-12Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "A Kconfig fix, a build fix and a membarrier bug fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: membarrier: Disable preemption when calling smp_call_function_many() sched/isolation: Make CONFIG_CPU_ISOLATION=y depend on SMP or COMPILE_TEST ia64, sched/cputime: Fix build error if CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y
2018-01-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-0/+7
Daniel Borkmann says: ==================== pull-request: bpf 2018-01-09 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Prevent out-of-bounds speculation in BPF maps by masking the index after bounds checks in order to fix spectre v1, and add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for removing the BPF interpreter from the kernel in favor of JIT-only mode to make spectre v2 harder, from Alexei. 2) Remove false sharing of map refcount with max_entries which was used in spectre v1, from Daniel. 3) Add a missing NULL psock check in sockmap in order to fix a race, from John. 4) Fix test_align BPF selftest case since a recent change in verifier rejects the bit-wise arithmetic on pointers earlier but test_align update was missing, from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09bpf: introduce BPF_JIT_ALWAYS_ON configAlexei Starovoitov1-0/+7
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715. A quote from goolge project zero blog: "At this point, it would normally be necessary to locate gadgets in the host kernel code that can be used to actually leak data by reading from an attacker-controlled location, shifting and masking the result appropriately and then using the result of that as offset to an attacker-controlled address for a load. But piecing gadgets together and figuring out which ones work in a speculation context seems annoying. So instead, we decided to use the eBPF interpreter, which is built into the host kernel - while there is no legitimate way to invoke it from inside a VM, the presence of the code in the host kernel's text section is sufficient to make it usable for the attack, just like with ordinary ROP gadgets." To make attacker job harder introduce BPF_JIT_ALWAYS_ON config option that removes interpreter from the kernel in favor of JIT-only mode. So far eBPF JIT is supported by: x64, arm64, arm32, sparc64, s390, powerpc64, mips64 The start of JITed program is randomized and code page is marked as read-only. In addition "constant blinding" can be turned on with net.core.bpf_jit_harden v2->v3: - move __bpf_prog_ret0 under ifdef (Daniel) v1->v2: - fix init order, test_bpf and cBPF (Daniel's feedback) - fix offloaded bpf (Jakub's feedback) - add 'return 0' dummy in case something can invoke prog->bpf_func - retarget bpf tree. For bpf-next the patch would need one extra hunk. It will be sent when the trees are merged back to net-next Considered doing: int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT; but it seems better to land the patch as-is and in bpf-next remove bpf_jit_enable global variable from all JITs, consolidate in one place and remove this jit_init() function. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-08sched/isolation: Make CONFIG_CPU_ISOLATION=y depend on SMP or COMPILE_TESTGeert Uytterhoeven1-0/+1
On uniprocessor systems, critical and non-critical tasks cannot be isolated, as there is only a single CPU core. Hence enabling CPU isolation by default on such systems does not make much sense. Instead of changing the default for !SMP, fix this by making the feature depend on SMP, with an override for compile-testing. Note that its sole selector (NO_HZ_FULL) already depends on SMP. This decreases kernel size for a default uniprocessor kernel by ca. 1 KiB. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Nicolas Pitre <nico@linaro.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 2c43838c99d9d23f ("sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default") Link: http://lkml.kernel.org/r/1514891590-20782-1-git-send-email-geert@linux-m68k.org Signed-off-by: Ingo Molnar <mingo@kernel.org>