summaryrefslogtreecommitdiffstats
path: root/arch/x86
AgeCommit message (Collapse)AuthorFilesLines
2022-10-04Merge tag 'x86_platform_for_v6.1_rc1' of ↵Linus Torvalds2-0/+17
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform update from Borislav Petkov: "A single x86/platform improvement when the kernel is running as an ACRN guest: - Get TSC and CPU frequency from CPUID leaf 0x40000010 when the kernel is running as a guest on the ACRN hypervisor" * tag 'x86_platform_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/acrn: Set up timekeeping
2022-10-04Merge tag 'edac_updates_for_v6.1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - Add support for Skylake-S CPUs to ie31200_edac - Improve error decoding speed of the Intel drivers by avoiding the ACPI facilities but doing decoding in the driver itself - Other misc improvements to the Intel drivers - The usual cleanups and fixlets all over EDAC land * tag 'edac_updates_for_v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/i7300: Correct the i7300_exit() function name in comment x86/sb_edac: Add row column translation for Broadwell EDAC/i10nm: Print an extra register set of retry_rd_err_log EDAC/i10nm: Retrieve and print retry_rd_err_log registers for HBM EDAC/skx_common: Add ChipSelect ADXL component EDAC/ppc_4xx: Reorder symbols to get rid of a few forward declarations EDAC: Remove obsolete declarations in edac_module.h EDAC/i10nm: Add driver decoder for Ice Lake and Tremont CPUs EDAC/skx_common: Make output format similar EDAC/skx_common: Use driver decoder first EDAC/mc: Drop duplicated dimm->nr_pages debug printout EDAC/mc: Replace spaces with tabs in memtype flags definition EDAC/wq: Remove unneeded flush_workqueue() EDAC/ie31200: Add Skylake-S support
2022-10-03Merge tag 'hardening-v6.1-rc1' of ↵Linus Torvalds2-10/+20
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kernel hardening updates from Kees Cook: "Most of the collected changes here are fixes across the tree for various hardening features (details noted below). The most notable new feature here is the addition of the memcpy() overflow warning (under CONFIG_FORTIFY_SOURCE), which is the next step on the path to killing the common class of "trivially detectable" buffer overflow conditions (i.e. on arrays with sizes known at compile time) that have resulted in many exploitable vulnerabilities over the years (e.g. BleedingTooth). This feature is expected to still have some undiscovered false positives. It's been in -next for a full development cycle and all the reported false positives have been fixed in their respective trees. All the known-bad code patterns we could find with Coccinelle are also either fixed in their respective trees or in flight. The commit message in commit 54d9469bc515 ("fortify: Add run-time WARN for cross-field memcpy()") for the feature has extensive details, but I'll repeat here that this is a warning _only_, and is not intended to actually block overflows (yet). The many patches fixing array sizes and struct members have been landing for several years now, and we're finally able to turn this on to find any remaining stragglers. Summary: Various fixes across several hardening areas: - loadpin: Fix verity target enforcement (Matthias Kaehlcke). - zero-call-used-regs: Add missing clobbers in paravirt (Bill Wendling). - CFI: clean up sparc function pointer type mismatches (Bart Van Assche). - Clang: Adjust compiler flag detection for various Clang changes (Sami Tolvanen, Kees Cook). - fortify: Fix warnings in arch-specific code in sh, ARM, and xen. Improvements to existing features: - testing: improve overflow KUnit test, introduce fortify KUnit test, add more coverage to LKDTM tests (Bart Van Assche, Kees Cook). - overflow: Relax overflow type checking for wider utility. New features: - string: Introduce strtomem() and strtomem_pad() to fill a gap in strncpy() replacement needs. - um: Enable FORTIFY_SOURCE support. - fortify: Enable run-time struct member memcpy() overflow warning" * tag 'hardening-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (27 commits) Makefile.extrawarn: Move -Wcast-function-type-strict to W=1 hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero sparc: Unbreak the build x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled x86/paravirt: clean up typos and grammaros fortify: Convert to struct vs member helpers fortify: Explicitly check bounds are compile-time constants x86/entry: Work around Clang __bdos() bug ARM: decompressor: Include .data.rel.ro.local fortify: Adjust KUnit test for modular build sh: machvec: Use char[] for section boundaries kunit/memcpy: Avoid pathological compile-time string size lib: Improve the is_signed_type() kunit test LoadPin: Require file with verity root digests to have a header dm: verity-loadpin: Only trust verity targets with enforcement LoadPin: Fix Kconfig doc about format of file with verity digests um: Enable FORTIFY_SOURCE lkdtm: Update tests for memcpy() run-time warnings fortify: Add run-time WARN for cross-field memcpy() fortify: Use SIZE_MAX instead of (size_t)-1 ...
2022-10-03Merge tag 'kcfi-v6.1-rc1' of ↵Linus Torvalds11-5/+139
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kcfi updates from Kees Cook: "This replaces the prior support for Clang's standard Control Flow Integrity (CFI) instrumentation, which has required a lot of special conditions (e.g. LTO) and work-arounds. The new implementation ("Kernel CFI") is specific to C, directly designed for the Linux kernel, and takes advantage of architectural features like x86's IBT. This series retains arm64 support and adds x86 support. GCC support is expected in the future[1], and additional "generic" architectural support is expected soon[2]. Summary: - treewide: Remove old CFI support details - arm64: Replace Clang CFI support with Clang KCFI support - x86: Introduce Clang KCFI support" Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048 [1] Link: https://github.com/samitolvanen/llvm-project/commits/kcfi_generic [2] * tag 'kcfi-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits) x86: Add support for CONFIG_CFI_CLANG x86/purgatory: Disable CFI x86: Add types to indirectly called assembly functions x86/tools/relocs: Ignore __kcfi_typeid_ relocations kallsyms: Drop CONFIG_CFI_CLANG workarounds objtool: Disable CFI warnings objtool: Preserve special st_shndx indexes in elf_update_symbol treewide: Drop __cficanonical treewide: Drop WARN_ON_FUNCTION_MISMATCH treewide: Drop function_nocfi init: Drop __nocfi from __init arm64: Drop unneeded __nocfi attributes arm64: Add CFI error handling arm64: Add types to indirect called assembly functions psci: Fix the function type for psci_initcall_t lkdtm: Emit an indirect call for CFI tests cfi: Add type helper macros cfi: Switch to -fsanitize=kcfi cfi: Drop __CFI_ADDRESSABLE cfi: Remove CONFIG_CFI_CLANG_SHADOW ...
2022-10-03Merge tag 'rust-v6.1-rc1' of https://github.com/Rust-for-Linux/linuxLinus Torvalds2-0/+11
Pull Rust introductory support from Kees Cook: "The tree has a recent base, but has fundamentally been in linux-next for a year and a half[1]. It's been updated based on feedback from the Kernel Maintainer's Summit, and to gain recent Reviewed-by: tags. Miguel is the primary maintainer, with me helping where needed/wanted. Our plan is for the tree to switch to the standard non-rebasing practice once this initial infrastructure series lands. The contents are the absolute minimum to get Rust code building in the kernel, with many more interfaces[2] (and drivers - NVMe[3], 9p[4], M1 GPU[5]) on the way. The initial support of Rust-for-Linux comes in roughly 4 areas: - Kernel internals (kallsyms expansion for Rust symbols, %pA format) - Kbuild infrastructure (Rust build rules and support scripts) - Rust crates and bindings for initial minimum viable build - Rust kernel documentation and samples Rust support has been in linux-next for a year and a half now, and the short log doesn't do justice to the number of people who have contributed both to the Linux kernel side but also to the upstream Rust side to support the kernel's needs. Thanks to these 173 people, and many more, who have been involved in all kinds of ways: Miguel Ojeda, Wedson Almeida Filho, Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Andreas Hindborg, Adam Bratschi-Kaye, Benno Lossin, Maciej Falkowski, Finn Behrens, Sven Van Asbroeck, Asahi Lina, FUJITA Tomonori, John Baublitz, Wei Liu, Geoffrey Thomas, Philip Herron, Arthur Cohen, David Faust, Antoni Boucher, Philip Li, Yujie Liu, Jonathan Corbet, Greg Kroah-Hartman, Paul E. McKenney, Josh Triplett, Kent Overstreet, David Gow, Alice Ryhl, Robin Randhawa, Kees Cook, Nick Desaulniers, Matthew Wilcox, Linus Walleij, Joe Perches, Michael Ellerman, Petr Mladek, Masahiro Yamada, Arnaldo Carvalho de Melo, Andrii Nakryiko, Konstantin Shelekhin, Rasmus Villemoes, Konstantin Ryabitsev, Stephen Rothwell, Andy Shevchenko, Sergey Senozhatsky, John Paul Adrian Glaubitz, David Laight, Nathan Chancellor, Jonathan Cameron, Daniel Latypov, Shuah Khan, Brendan Higgins, Julia Lawall, Laurent Pinchart, Geert Uytterhoeven, Akira Yokosawa, Pavel Machek, David S. Miller, John Hawley, James Bottomley, Arnd Bergmann, Christian Brauner, Dan Robertson, Nicholas Piggin, Zhouyi Zhou, Elena Zannoni, Jose E. Marchesi, Leon Romanovsky, Will Deacon, Richard Weinberger, Randy Dunlap, Paolo Bonzini, Roland Dreier, Mark Brown, Sasha Levin, Ted Ts'o, Steven Rostedt, Jarkko Sakkinen, Michal Kubecek, Marco Elver, Al Viro, Keith Busch, Johannes Berg, Jan Kara, David Sterba, Connor Kuehl, Andy Lutomirski, Andrew Lunn, Alexandre Belloni, Peter Zijlstra, Russell King, Eric W. Biederman, Willy Tarreau, Christoph Hellwig, Emilio Cobos Álvarez, Christian Poveda, Mark Rousskov, John Ericson, TennyZhuang, Xuanwo, Daniel Paoliello, Manish Goregaokar, comex, Josh Stone, Stephan Sokolow, Philipp Krones, Guillaume Gomez, Joshua Nelson, Mats Larsen, Marc Poulhiès, Samantha Miller, Esteban Blanc, Martin Schmidt, Martin Rodriguez Reboredo, Daniel Xu, Viresh Kumar, Bartosz Golaszewski, Vegard Nossum, Milan Landaverde, Dariusz Sosnowski, Yuki Okushi, Matthew Bakhtiari, Wu XiangCheng, Tiago Lam, Boris-Chengbiao Zhou, Sumera Priyadarsini, Viktor Garske, Niklas Mohrin, Nándor István Krácser, Morgan Bartlett, Miguel Cano, Léo Lanteri Thauvin, Julian Merkle, Andreas Reindl, Jiapeng Chong, Fox Chen, Douglas Su, Antonio Terceiro, SeongJae Park, Sergio González Collado, Ngo Iok Ui (Wu Yu Wei), Joshua Abraham, Milan, Daniel Kolsoi, ahomescu, Manas, Luis Gerhorst, Li Hongyu, Philipp Gesang, Russell Currey, Jalil David Salamé Messina, Jon Olson, Raghvender, Angelos, Kaviraj Kanagaraj, Paul Römer, Sladyn Nunes, Mauro Baladés, Hsiang-Cheng Yang, Abhik Jain, Hongyu Li, Sean Nash, Yuheng Su, Peng Hao, Anhad Singh, Roel Kluin, Sara Saa, Geert Stappers, Garrett LeSage, IFo Hancroft, and Linus Torvalds" Link: https://lwn.net/Articles/849849/ [1] Link: https://github.com/Rust-for-Linux/linux/commits/rust [2] Link: https://github.com/metaspace/rust-linux/commit/d88c3744d6cbdf11767e08bad56cbfb67c4c96d0 [3] Link: https://github.com/wedsonaf/linux/commit/9367032607f7670de0ba1537cf09ab0f4365a338 [4] Link: https://github.com/AsahiLinux/linux/commits/gpu/rust-wip [5] * tag 'rust-v6.1-rc1' of https://github.com/Rust-for-Linux/linux: (27 commits) MAINTAINERS: Rust samples: add first Rust examples x86: enable initial Rust support docs: add Rust documentation Kbuild: add Rust support rust: add `.rustfmt.toml` scripts: add `is_rust_module.sh` scripts: add `rust_is_available.sh` scripts: add `generate_rust_target.rs` scripts: add `generate_rust_analyzer.py` scripts: decode_stacktrace: demangle Rust symbols scripts: checkpatch: enable language-independent checks for Rust scripts: checkpatch: diagnose uses of `%pA` in the C side as errors vsprintf: add new `%pA` format specifier rust: export generated symbols rust: add `kernel` crate rust: add `bindings` crate rust: add `macros` crate rust: add `compiler_builtins` crate rust: adapt `alloc` crate to the kernel ...
2022-10-03x86: kmsan: handle CPU entry areaAlexander Potapenko3-0/+55
Among other data, CPU entry area holds exception stacks, so addresses from this area can be passed to kmsan_get_metadata(). This previously led to kmsan_get_metadata() returning NULL, which in turn resulted in a warning that triggered further attempts to call kmsan_get_metadata() in the exception context, which quickly exhausted the exception stack. This patch allocates shadow and origin for the CPU entry area on x86 and introduces arch_kmsan_get_meta_or_null(), which performs arch-specific metadata mapping. Link: https://lkml.kernel.org/r/20220928123219.1101883-1-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Fixes: 21d723a7c1409 ("kmsan: add KMSAN runtime core") Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: kmsan: enable KMSAN builds for x86Alexander Potapenko2-0/+56
Make KMSAN usable by adding the necessary Kconfig bits. Also declare x86-specific functions checking address validity in arch/x86/include/asm/kmsan.h. Link: https://lkml.kernel.org/r/20220915150417.722975-44-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: kmsan: don't instrument stack walking functionsAlexander Potapenko2-0/+17
Upon function exit, KMSAN marks local variables as uninitialized. Further function calls may result in the compiler creating the stack frame where these local variables resided. This results in frame pointers being marked as uninitialized data, which is normally correct, because they are not stack-allocated. However stack unwinding functions are supposed to read and dereference the frame pointers, in which case KMSAN might be reporting uses of uninitialized values. To work around that, we mark update_stack_state(), unwind_next_frame() and show_trace_log_lvl() with __no_kmsan_checks, preventing all KMSAN reports inside those functions and making them return initialized values. Link: https://lkml.kernel.org/r/20220915150417.722975-40-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: fs: kmsan: disable CONFIG_DCACHE_WORD_ACCESSAlexander Potapenko1-1/+3
dentry_string_cmp() calls read_word_at_a_time(), which might read uninitialized bytes to optimize string comparisons. Disabling CONFIG_DCACHE_WORD_ACCESS should prohibit this optimization, as well as (probably) similar ones. Link: https://lkml.kernel.org/r/20220915150417.722975-39-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Suggested-by: Andrey Konovalov <andreyknvl@gmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: kasan: kmsan: support CONFIG_GENERIC_CSUM on x86, enable it for KASAN/KMSANAlexander Potapenko3-6/+16
This is needed to allow memory tools like KASAN and KMSAN see the memory accesses from the checksum code. Without CONFIG_GENERIC_CSUM the tools can't see memory accesses originating from handwritten assembly code. For KASAN it's a question of detecting more bugs, for KMSAN using the C implementation also helps avoid false positives originating from seemingly uninitialized checksum values. Link: https://lkml.kernel.org/r/20220915150417.722975-38-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: kmsan: sync metadata pages on page faultAlexander Potapenko1-1/+22
KMSAN assumes shadow and origin pages for every allocated page are accessible. For pages between [VMALLOC_START, VMALLOC_END] those metadata pages start at KMSAN_VMALLOC_SHADOW_START and KMSAN_VMALLOC_ORIGIN_START, therefore we must sync a bigger memory region. Link: https://lkml.kernel.org/r/20220915150417.722975-37-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: kmsan: use __msan_ string functions where possible.Alexander Potapenko1-2/+21
Unless stated otherwise (by explicitly calling __memcpy(), __memset() or __memmove()) we want all string functions to call their __msan_ versions (e.g. __msan_memcpy() instead of memcpy()), so that shadow and origin values are updated accordingly. Bootloader must still use the default string functions to avoid crashes. Link: https://lkml.kernel.org/r/20220915150417.722975-36-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: kmsan: handle open-coded assembly in lib/iomem.cAlexander Potapenko1-0/+5
KMSAN cannot intercept memory accesses within asm() statements. That's why we add kmsan_unpoison_memory() and kmsan_check_memory() to hint it how to handle memory copied from/to I/O memory. Link: https://lkml.kernel.org/r/20220915150417.722975-35-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: kmsan: skip shadow checks in __switch_to()Alexander Potapenko1-0/+1
When instrumenting functions, KMSAN obtains the per-task state (mostly pointers to metadata for function arguments and return values) once per function at its beginning, using the `current` pointer. Every time the instrumented function calls another function, this state (`struct kmsan_context_state`) is updated with shadow/origin data of the passed and returned values. When `current` changes in the low-level arch code, instrumented code can not notice that, and will still refer to the old state, possibly corrupting it or using stale data. This may result in false positive reports. To deal with that, we need to apply __no_kmsan_checks to the functions performing context switching - this will result in skipping all KMSAN shadow checks and marking newly created values as initialized, preventing all false positive reports in those functions. False negatives are still possible, but we expect them to be rare and impersistent. Link: https://lkml.kernel.org/r/20220915150417.722975-34-glider@google.com Suggested-by: Marco Elver <elver@google.com> Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: kmsan: disable instrumentation of unsupported codeAlexander Potapenko7-0/+11
Instrumenting some files with KMSAN will result in kernel being unable to link, boot or crashing at runtime for various reasons (e.g. infinite recursion caused by instrumentation hooks calling instrumented code again). Completely omit KMSAN instrumentation in the following places: - arch/x86/boot and arch/x86/realmode/rm, as KMSAN doesn't work for i386; - arch/x86/entry/vdso, which isn't linked with KMSAN runtime; - three files in arch/x86/kernel - boot problems; - arch/x86/mm/cpu_entry_area.c - recursion. Link: https://lkml.kernel.org/r/20220915150417.722975-33-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03mm: kmsan: maintain KMSAN metadata for page operationsAlexander Potapenko2-0/+10
Insert KMSAN hooks that make the necessary bookkeeping changes: - poison page shadow and origins in alloc_pages()/free_page(); - clear page shadow and origins in clear_page(), copy_user_highpage(); - copy page metadata in copy_highpage(), wp_page_copy(); - handle vmap()/vunmap()/iounmap(); Link: https://lkml.kernel.org/r/20220915150417.722975-15-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: kmsan: pgtable: reduce vmalloc spaceAlexander Potapenko2-2/+47
KMSAN is going to use 3/4 of existing vmalloc space to hold the metadata, therefore we lower VMALLOC_END to make sure vmalloc() doesn't allocate past the first 1/4. Link: https://lkml.kernel.org/r/20220915150417.722975-10-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: asm: instrument usercopy in get_user() and put_user()Alexander Potapenko1-7/+15
Use hooks from instrumented.h to notify bug detection tools about usercopy events in variations of get_user() and put_user(). Link: https://lkml.kernel.org/r/20220915150417.722975-5-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86: add missing include to sparsemem.hDmitry Vyukov1-0/+2
Patch series "Add KernelMemorySanitizer infrastructure", v7. KernelMemorySanitizer (KMSAN) is a detector of errors related to uses of uninitialized memory. It relies on compile-time Clang instrumentation (similar to MSan in the userspace [1]) and tracks the state of every bit of kernel memory, being able to report an error if uninitialized value is used in a condition, dereferenced, or escapes to userspace, USB or DMA. KMSAN has reported more than 300 bugs in the past few years (recently fixed bugs: [2]), most of them with the help of syzkaller. Such bugs keep getting introduced into the kernel despite new compiler warnings and other analyses (the 6.0 cycle already resulted in several KMSAN-reported bugs, e.g. [3]). Mitigations like total stack and heap initialization are unfortunately very far from being deployable. The proposed patchset contains KMSAN runtime implementation together with small changes to other subsystems needed to make KMSAN work. The latter changes fall into several categories: 1. Changes and refactorings of existing code required to add KMSAN: - [01/43] x86: add missing include to sparsemem.h - [02/43] stackdepot: reserve 5 extra bits in depot_stack_handle_t - [03/43] instrumented.h: allow instrumenting both sides of copy_from_user() - [04/43] x86: asm: instrument usercopy in get_user() and __put_user_size() - [05/43] asm-generic: instrument usercopy in cacheflush.h - [10/43] libnvdimm/pfn_dev: increase MAX_STRUCT_PAGE_SIZE 2. KMSAN-related declarations in generic code, KMSAN runtime library, docs and configs: - [06/43] kmsan: add ReST documentation - [07/43] kmsan: introduce __no_sanitize_memory and __no_kmsan_checks - [09/43] x86: kmsan: pgtable: reduce vmalloc space - [11/43] kmsan: add KMSAN runtime core - [13/43] MAINTAINERS: add entry for KMSAN - [24/43] kmsan: add tests for KMSAN - [31/43] objtool: kmsan: list KMSAN API functions as uaccess-safe - [35/43] x86: kmsan: use __msan_ string functions where possible - [43/43] x86: kmsan: enable KMSAN builds for x86 3. Adding hooks from different subsystems to notify KMSAN about memory state changes: - [14/43] mm: kmsan: maintain KMSAN metadata for page - [15/43] mm: kmsan: call KMSAN hooks from SLUB code - [16/43] kmsan: handle task creation and exiting - [17/43] init: kmsan: call KMSAN initialization routines - [18/43] instrumented.h: add KMSAN support - [19/43] kmsan: add iomap support - [20/43] Input: libps2: mark data received in __ps2_command() as initialized - [21/43] dma: kmsan: unpoison DMA mappings - [34/43] x86: kmsan: handle open-coded assembly in lib/iomem.c - [36/43] x86: kmsan: sync metadata pages on page fault 4. Changes that prevent false reports by explicitly initializing memory, disabling optimized code that may trick KMSAN, selectively skipping instrumentation: - [08/43] kmsan: mark noinstr as __no_sanitize_memory - [12/43] kmsan: disable instrumentation of unsupported common kernel code - [22/43] virtio: kmsan: check/unpoison scatterlist in vring_map_one_sg() - [23/43] kmsan: handle memory sent to/from USB - [25/43] kmsan: disable strscpy() optimization under KMSAN - [26/43] crypto: kmsan: disable accelerated configs under KMSAN - [27/43] kmsan: disable physical page merging in biovec - [28/43] block: kmsan: skip bio block merging logic for KMSAN - [29/43] kcov: kmsan: unpoison area->list in kcov_remote_area_put() - [30/43] security: kmsan: fix interoperability with auto-initialization - [32/43] x86: kmsan: disable instrumentation of unsupported code - [33/43] x86: kmsan: skip shadow checks in __switch_to() - [37/43] x86: kasan: kmsan: support CONFIG_GENERIC_CSUM on x86, enable it for KASAN/KMSAN - [38/43] x86: fs: kmsan: disable CONFIG_DCACHE_WORD_ACCESS - [39/43] x86: kmsan: don't instrument stack walking functions - [40/43] entry: kmsan: introduce kmsan_unpoison_entry_regs() 5. Fixes for bugs detected with CONFIG_KMSAN_CHECK_PARAM_RETVAL: - [41/43] bpf: kmsan: initialize BPF registers with zeroes - [42/43] mm: fs: initialize fsdata passed to write_begin/write_end interface This patchset allows one to boot and run a defconfig+KMSAN kernel on a QEMU without known false positives. It however doesn't guarantee there are no false positives in drivers of certain devices or less tested subsystems, although KMSAN is actively tested on syzbot with a large config. By default, KMSAN enforces conservative checks of most kernel function parameters passed by value (via CONFIG_KMSAN_CHECK_PARAM_RETVAL, which maps to the -fsanitize-memory-param-retval compiler flag). As discussed in [4] and [5], passing uninitialized values as function parameters is considered undefined behavior, therefore KMSAN now reports such cases as errors. Several newly added patches fix known manifestations of these errors. This patch (of 43): Including sparsemem.h from other files (e.g. transitively via asm/pgtable_64_types.h) results in compilation errors due to unknown types: sparsemem.h:34:32: error: unknown type name 'phys_addr_t' extern int phys_to_target_node(phys_addr_t start); ^ sparsemem.h:36:39: error: unknown type name 'u64' extern int memory_add_physaddr_to_nid(u64 start); ^ Fix these errors by including linux/types.h from sparsemem.h This is required for the upcoming KMSAN patches. Link: https://lkml.kernel.org/r/20220915150417.722975-1-glider@google.com Link: https://lkml.kernel.org/r/20220915150417.722975-2-glider@google.com Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Eric Biggers <ebiggers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03x86/mm: Disable W^X detection and enforcement on 32-bitDave Hansen1-0/+8
The 32-bit code is in a weird spot. Some 32-bit builds (non-PAE) do not even have NX support. Even PAE builds that support NX have to contend with things like EFI data and code mixed in the same pages where W+X is unavoidable. The folks still running X86_32=y kernels are unlikely to care much about NX. That combined with the fundamental inability fix _all_ of the W+X things means this code had little value on X86_32=y. Disable the checks. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: linux-efi@vger.kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/all/CAMj1kXHcF_iK_g0OZSkSv56Wmr=eQGQwNstcNjLEfS=mm7a06w@mail.gmail.com/
2022-10-03Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski2-31/+68
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-10-03 We've added 143 non-merge commits during the last 27 day(s) which contain a total of 151 files changed, 8321 insertions(+), 1402 deletions(-). The main changes are: 1) Add kfuncs for PKCS#7 signature verification from BPF programs, from Roberto Sassu. 2) Add support for struct-based arguments for trampoline based BPF programs, from Yonghong Song. 3) Fix entry IP for kprobe-multi and trampoline probes under IBT enabled, from Jiri Olsa. 4) Batch of improvements to veristat selftest tool in particular to add CSV output, a comparison mode for CSV outputs and filtering, from Andrii Nakryiko. 5) Add preparatory changes needed for the BPF core for upcoming BPF HID support, from Benjamin Tissoires. 6) Support for direct writes to nf_conn's mark field from tc and XDP BPF program types, from Daniel Xu. 7) Initial batch of documentation improvements for BPF insn set spec, from Dave Thaler. 8) Add a new BPF_MAP_TYPE_USER_RINGBUF map which provides single-user-space-producer / single-kernel-consumer semantics for BPF ring buffer, from David Vernet. 9) Follow-up fixes to BPF allocator under RT to always use raw spinlock for the BPF hashtab's bucket lock, from Hou Tao. 10) Allow creating an iterator that loops through only the resources of one task/thread instead of all, from Kui-Feng Lee. 11) Add support for kptrs in the per-CPU arraymap, from Kumar Kartikeya Dwivedi. 12) Add a new kfunc helper for nf to set src/dst NAT IP/port in a newly allocated CT entry which is not yet inserted, from Lorenzo Bianconi. 13) Remove invalid recursion check for struct_ops for TCP congestion control BPF programs, from Martin KaFai Lau. 14) Fix W^X issue with BPF trampoline and BPF dispatcher, from Song Liu. 15) Fix percpu_counter leakage in BPF hashtab allocation error path, from Tetsuo Handa. 16) Various cleanups in BPF selftests to use preferred ASSERT_* macros, from Wang Yufen. 17) Add invocation for cgroup/connect{4,6} BPF programs for ICMP pings, from YiFei Zhu. 18) Lift blinding decision under bpf_jit_harden = 1 to bpf_capable(), from Yauheni Kaliuta. 19) Various libbpf fixes and cleanups including a libbpf NULL pointer deref, from Xin Liu. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (143 commits) net: netfilter: move bpf_ct_set_nat_info kfunc in nf_nat_bpf.c Documentation: bpf: Add implementation notes documentations to table of contents bpf, docs: Delete misformatted table. selftests/xsk: Fix double free bpftool: Fix error message of strerror libbpf: Fix overrun in netlink attribute iteration selftests/bpf: Fix spelling mistake "unpriviledged" -> "unprivileged" samples/bpf: Fix typo in xdp_router_ipv4 sample bpftool: Remove unused struct event_ring_info bpftool: Remove unused struct btf_attach_point bpf, docs: Add TOC and fix formatting. bpf, docs: Add Clang note about BPF_ALU bpf, docs: Move Clang notes to a separate file bpf, docs: Linux byteswap note bpf, docs: Move legacy packet instructions to a separate file selftests/bpf: Check -EBUSY for the recurred bpf_setsockopt(TCP_CONGESTION) bpf: tcp: Stop bpf_setsockopt(TCP_CONGESTION) in init ops to recur itself bpf: Refactor bpf_setsockopt(TCP_CONGESTION) handling into another function bpf: Move the "cdg" tcp-cc check to the common sol_tcp_sockopt() bpf: Add __bpf_prog_{enter,exit}_struct_ops for struct_ops trampoline ... ==================== Link: https://lore.kernel.org/r/20221003194915.11847-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03Merge tag 'kvm-riscv-6.1-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini3-10/+3
KVM/riscv changes for 6.1 - Improved instruction encoding infrastructure for instructions not yet supported by binutils - Svinval support for both KVM Host and KVM Guest - Zihintpause support for KVM Guest - Zicbom support for KVM Guest - Record number of signal exits as a VCPU stat - Use generic guest entry infrastructure
2022-10-03Merge tag 'kvmarm-6.1' of ↵Paolo Bonzini1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for v6.1 - Fixes for single-stepping in the presence of an async exception as well as the preservation of PSTATE.SS - Better handling of AArch32 ID registers on AArch64-only systems - Fixes for the dirty-ring API, allowing it to work on architectures with relaxed memory ordering - Advertise the new kvmarm mailing list - Various minor cleanups and spelling fixes
2022-10-03x86/hyperv: Replace kmap() with kmap_local_page()Zhao Liu1-2/+2
kmap() is being deprecated in favor of kmap_local_page()[1]. There are two main problems with kmap(): (1) It comes with an overhead as mapping space is restricted and protected by a global lock for synchronization and (2) it also requires global TLB invalidation when the kmap's pool wraps and it might block when the mapping space is fully utilized until a slot becomes available. With kmap_local_page() the mappings are per thread, CPU local, can take page faults, and can be called from any context (including interrupts). It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore, the tasks can be preempted and, when they are scheduled to run again, the kernel virtual addresses are restored and are still valid. In the fuction hyperv_init() of hyperv/hv_init.c, the mapping is used in a single thread and is short live. So, in this case, it's safe to simply use kmap_local_page() to create mapping, and this avoids the wasted cost of kmap() for global synchronization. In addtion, the fuction hyperv_init() checks if kmap() fails by BUG_ON(). From the original discussion[2], the BUG_ON() here is just used to explicitly panic NULL pointer. So still keep the BUG_ON() in place to check if kmap_local_page() fails. Based on this consideration, memcpy_to_page() is not selected here but only kmap_local_page() is used. Therefore, replace kmap() with kmap_local_page() in hyperv/hv_init.c. [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com [2]: https://lore.kernel.org/lkml/20200915103710.cqmdvzh5lys4wsqo@liuwe-devbox-debian-v2/ Suggested-by: Dave Hansen <dave.hansen@intel.com> Suggested-by: Ira Weiny <ira.weiny@intel.com> Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20220928095640.626350-1-zhao1.liu@linux.intel.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-10-02Merge tag 'perf-urgent-2022-10-02' of ↵Linus Torvalds3-3/+48
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc perf fixes from Ingo Molnar: - Fix a PMU enumeration/initialization bug on Intel Alder Lake CPUs - Fix KVM guest PEBS register handling - Fix race/reentry bug in perf_output_read_group() reading of PMU counters * tag 'perf-urgent-2022-10-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix reentry problem in perf_output_read_group() perf/x86/core: Completely disable guest PEBS via guest's global_ctrl perf/x86/intel: Fix unchecked MSR access error for Alder Lake N
2022-10-02Merge tag 'x86_urgent_for_v6.0' of ↵Linus Torvalds2-32/+38
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add the respective UP last level cache mask accessors in order not to cause segfaults when lscpu accesses their representation in sysfs - Fix for a race in the alternatives batch patching machinery when kprobes are set * tag 'x86_urgent_for_v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cacheinfo: Add a cpu_llc_shared_mask() UP variant x86/alternative: Fix race in try_get_desc()
2022-10-02kbuild: remove head-y syntaxMasahiro Yamada1-5/+0
Kbuild puts the objects listed in head-y at the head of vmlinux. Conventionally, we do this for head*.S, which contains the kernel entry point. A counter approach is to control the section order by the linker script. Actually, the code marked as __HEAD goes into the ".head.text" section, which is placed before the normal ".text" section. I do not know if both of them are needed. From the build system perspective, head-y is not mandatory. If you can achieve the proper code placement by the linker script only, it would be cleaner. I collected the current head-y objects into head-object-list.txt. It is a whitelist. My hope is it will be reduced in the long run. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-10-02kbuild: use obj-y instead extra-y for objects placed at the headMasahiro Yamada1-5/+5
The objects placed at the head of vmlinux need special treatments: - arch/$(SRCARCH)/Makefile adds them to head-y in order to place them before other archives in the linker command line. - arch/$(SRCARCH)/kernel/Makefile adds them to extra-y instead of obj-y to avoid them going into built-in.a. This commit gets rid of the latter. Create vmlinux.a to collect all the objects that are unconditionally linked to vmlinux. The objects listed in head-y are moved to the head of vmlinux.a by using 'ar m'. With this, arch/$(SRCARCH)/kernel/Makefile can consistently use obj-y for builtin objects. There is no *.o that is directly linked to vmlinux. Drop unneeded code in scripts/clang-tools/gen_compile_commands.py. $(AR) mPi needs 'T' to workaround the llvm-ar bug. The fix was suggested by Nathan Chancellor [1]. [1]: https://lore.kernel.org/llvm/YyjjT5gQ2hGMH0ni@dev-arch.thelio-3990X/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-09-30Merge tag 'for-linus-6.0' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-2/+0
Pull kvm fixes from Paolo Bonzini: "A small fix to the reported set of supported CPUID bits, and selftests fixes: - Skip tests that require EPT when it is not available - Do not hang when a test fails with an empty stack trace - avoid spurious failure when running access_tracking_perf_test in a KVM guest - work around GCC's tendency to optimize loops into mem*() functions, which breaks because the guest code in selftests cannot call into PLTs - fix -Warray-bounds error in fix_hypercall_test" * tag 'for-linus-6.0' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: selftests: Compare insn opcodes directly in fix_hypercall_test KVM: selftests: Implement memcmp(), memcpy(), and memset() for guest use KVM: x86: Hide IA32_PLATFORM_DCA_CAP[31:0] from the guest KVM: selftests: Gracefully handle empty stack traces KVM: selftests: replace assertion with warning in access_tracking_perf_test KVM: selftests: Skip tests that require EPT when it is not available
2022-09-30kvm: vmx: keep constant definition format consistentPeng Hao1-1/+1
Keep all constants using lowercase "x". Signed-off-by: Peng Hao <flyingpeng@tencent.com> Message-Id: <CAPm50aKnctFL_7fZ-eqrz-QGnjW2+DTyDDrhxi7UZVO3HjD8UA@mail.gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-30kvm: mmu: fix typos in struct kvm_archPeng Hao1-6/+6
No 'kvmp_mmu_pages', it should be 'kvm_mmu_page'. And struct kvm_mmu_pages and struct kvm_mmu_page are different structures, here should be kvm_mmu_page. kvm_mmu_pages is defined in arch/x86/kvm/mmu/mmu.c. Suggested-by: David Matlack <dmatlack@google.com> Signed-off-by: Peng Hao <flyingpeng@tencent.com> Reviewed-by: David Matlack <dmatlack@google.com> Message-Id: <CAPm50aL=0smbohhjAcK=ciUwcQJ=uAQP1xNQi52YsE7U8NFpEw@mail.gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-30Merge tag 'kvm-x86-6.1-2' of https://github.com/sean-jc/linux into HEADPaolo Bonzini31-959/+1490
KVM x86 updates for 6.1, batch #2: - Misc PMU fixes and cleanups. - Fixes for Hyper-V hypercall selftest
2022-09-30KVM: x86: Hide IA32_PLATFORM_DCA_CAP[31:0] from the guestJim Mattson1-2/+0
The only thing reported by CPUID.9 is the value of IA32_PLATFORM_DCA_CAP[31:0] in EAX. This MSR doesn't even exist in the guest, since CPUID.1:ECX.DCA[bit 18] is clear in the guest. Clear CPUID.9 in KVM_GET_SUPPORTED_CPUID. Fixes: 24c82e576b78 ("KVM: Sanitize cpuid") Signed-off-by: Jim Mattson <jmattson@google.com> Message-Id: <20220922231854.249383-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-29bpf: Add __bpf_prog_{enter,exit}_struct_ops for struct_ops trampolineMartin KaFai Lau1-0/+3
The struct_ops prog is to allow using bpf to implement the functions in a struct (eg. kernel module). The current usage is to implement the tcp_congestion. The kernel does not call the tcp-cc's ops (ie. the bpf prog) in a recursive way. The struct_ops is sharing the tracing-trampoline's enter/exit function which tracks prog->active to avoid recursion. It is needed for tracing prog. However, it turns out the struct_ops bpf prog will hit this prog->active and unnecessarily skipped running the struct_ops prog. eg. The '.ssthresh' may run in_task() and then interrupted by softirq that runs the same '.ssthresh'. Skip running the '.ssthresh' will end up returning random value to the caller. The patch adds __bpf_prog_{enter,exit}_struct_ops for the struct_ops trampoline. They do not track the prog->active to detect recursion. One exception is when the tcp_congestion's '.init' ops is doing bpf_setsockopt(TCP_CONGESTION) and then recurs to the same '.init' ops. This will be addressed in the following patches. Fixes: ca06f55b9002 ("bpf: Add per-program recursion prevention mechanism") Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220929070407.965581-2-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-29x86/mm: Add prot_sethuge() helper to abstract out _PAGE_PSE handlingLinus Torvalds1-9/+10
We still have some historic cases of direct fiddling of page attributes with (dangerous & fragile) type casting and address shifting. Add the prot_sethuge() helper instead that gets the types right and doesn't have to transform addresses. ( Also add a debug check to make sure this doesn't get applied to _PAGE_BIT_PAT/_PAGE_BIT_PAT_LARGE pages. ) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dave Hansen <dave.hansen@intel.com>
2022-09-29perf/x86/amd/lbr: Adjust LBR regardless of filteringStephane Eranian1-2/+6
In case of fused compare and taken branch instructions, the AMD LBR points to the compare instruction instead of the branch. Users of LBR usually expects the from address to point to a branch instruction. The kernel has code to adjust the from address via get_branch_type_fused(). However this correction is only applied when a branch filter is applied. That means that if no filter is present, the quality of the data is lower. Fix the problem by applying the adjustment regardless of the filter setting, bringing the AMD LBR to the same level as other LBR implementations. Fixes: 245268c19f70 ("perf/x86/amd/lbr: Use fusion-aware branch classifier") Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sandipan Das <sandipan.das@amd.com> Link: https://lore.kernel.org/r/20220928184043.408364-3-eranian@google.com
2022-09-29perf/x86/utils: Fix uninitialized var in get_branch_type()Stephane Eranian1-0/+4
offset is passed as a pointer and on certain call path is not set by the function. If the caller does not re-initialize offset between calls, value could be inherited between calls. Prevent this by initializing offset on each call. This impacts the code in amd_pmu_lbr_filter() which does: for(i=0; ...) { ret = get_branch_type_fused(..., &offset); if (offset) lbr_entries[i].from += offset; } Fixes: df3e9612f758 ("perf/x86: Make branch classifier fusion-aware") Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sandipan Das <sandipan.das@amd.com> Link: https://lore.kernel.org/r/20220928184043.408364-2-eranian@google.com
2022-09-29perf/x86/amd: Support PERF_SAMPLE_PHY_ADDRRavi Bangoria1-1/+7
IBS_DC_PHYSADDR provides the physical data address for the tagged load/ store operation. Populate perf sample physical address using it. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220928095805.596-7-ravi.bangoria@amd.com
2022-09-29perf/x86/amd: Support PERF_SAMPLE_ADDRRavi Bangoria1-1/+7
IBS_DC_LINADDR provides the linear data address for the tagged load/ store operation. Populate perf sample address using it. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220928095805.596-6-ravi.bangoria@amd.com
2022-09-29perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}Ravi Bangoria1-1/+16
IbsDcMissLat indicates the number of clock cycles from when a miss is detected in the data cache to when the data was delivered to the core. Similarly, IbsTagToRetCtr provides number of cycles from when the op was tagged to when the op was retired. Consider these fields for sample->weight. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220928095805.596-5-ravi.bangoria@amd.com
2022-09-29perf/x86/amd: Support PERF_SAMPLE_DATA_SRCRavi Bangoria1-6/+312
struct perf_mem_data_src is used to pass arch specific memory access details into generic form. These details gets consumed by tools like perf mem and c2c. IBS tagged load/store sample provides most of the information needed for these tools. Add a logic to convert IBS specific raw data into perf_mem_data_src. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220928095805.596-4-ravi.bangoria@amd.com
2022-09-29perf/x86/amd: Add IBS OP_DATA2 DataSrc bit definitionsRavi Bangoria1-0/+16
IBS_OP_DATA2 DataSrc provides detail about location of the data being accessed from by load ops. Define macros for legacy and extended DataSrc values. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220928095805.596-3-ravi.bangoria@amd.com
2022-09-29perf/x86/uncore: Add new Raptor Lake S supportKan Liang1-0/+1
From the perspective of the uncore PMU, the new Raptor Lake S is the same as the other hybrid {ALDER,RAPTOP}LAKE. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220928153331.3757388-4-kan.liang@linux.intel.com
2022-09-29perf/x86/cstate: Add new Raptor Lake S supportKan Liang1-0/+1
From the perspective of Intel cstate residency counters, the new Raptor Lake S is the same as the other hybrid {ALDER,RAPTOP}LAKE. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220928153331.3757388-3-kan.liang@linux.intel.com
2022-09-29perf/x86/msr: Add new Raptor Lake S supportKan Liang1-0/+1
The same as the other hybrid {ALDER,RAPTOP}LAKE, the new Raptor Lake S also support PPERF and SMI_COUNT MSRs. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220928153331.3757388-2-kan.liang@linux.intel.com
2022-09-29perf/x86: Add new Raptor Lake S supportKan Liang1-0/+1
From PMU's perspective, the new Raptor Lake S is the same as the other of hybrid {ALDER,RAPTOP}LAKE. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220928153331.3757388-1-kan.liang@linux.intel.com
2022-09-29Merge branch 'v6.0-rc7'Peter Zijlstra30-170/+333
Merge upstream to get RAPTORLAKE_S Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-09-29KVM: x86: Select CONFIG_HAVE_KVM_DIRTY_RING_ACQ_RELMarc Zyngier1-0/+1
Since x86 is TSO (give or take), allow it to advertise the new ACQ_REL version of the dirty ring capability. No other change is required for it. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20220926145120.27974-4-maz@kernel.org
2022-09-29KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config optionMarc Zyngier1-1/+1
In order to differenciate between architectures that require no extra synchronisation when accessing the dirty ring and those who do, add a new capability (KVM_CAP_DIRTY_LOG_RING_ACQ_REL) that identify the latter sort. TSO architectures can obviously advertise both, while relaxed architectures must only advertise the ACQ_REL version. This requires some configuration symbol rejigging, with HAVE_KVM_DIRTY_RING being only indirectly selected by two top-level config symbols: - HAVE_KVM_DIRTY_RING_TSO for strongly ordered architectures (x86) - HAVE_KVM_DIRTY_RING_ACQ_REL for weakly ordered architectures (arm64) Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20220926145120.27974-3-maz@kernel.org
2022-09-28KVM: x86/svm/pmu: Rewrite get_gp_pmc_amd() for more counters scalabilityLike Xu1-68/+20
If the number of AMD gp counters continues to grow, the code will be very clumsy and the switch-case design of inline get_gp_pmc_amd() will also bloat the kernel text size. The target code is taught to manage two groups of MSRs, each representing a different version of the AMD PMU counter MSRs. The MSR addresses of each group are contiguous, with no holes, and there is no intersection between two sets of addresses, but they are discrete in functionality by design like this: [Group A : All counter MSRs are tightly bound to all event select MSRs ] MSR_K7_EVNTSEL0 0xc0010000 MSR_K7_EVNTSELi 0xc0010000 + i ... MSR_K7_EVNTSEL3 0xc0010003 MSR_K7_PERFCTR0 0xc0010004 MSR_K7_PERFCTRi 0xc0010004 + i ... MSR_K7_PERFCTR3 0xc0010007 [Group B : The counter MSRs are interleaved with the event select MSRs ] MSR_F15H_PERF_CTL0 0xc0010200 MSR_F15H_PERF_CTR0 (0xc0010200 + 1) ... MSR_F15H_PERF_CTLi (0xc0010200 + 2 * i) MSR_F15H_PERF_CTRi (0xc0010200 + 2 * i + 1) ... MSR_F15H_PERF_CTL5 (0xc0010200 + 2 * 5) MSR_F15H_PERF_CTR5 (0xc0010200 + 2 * 5 + 1) Rewrite get_gp_pmc_amd() in this way: first determine which group of registers is accessed, then determine if it matches its requested type, applying different scaling ratios respectively, and finally get pmc_idx to pass into amd_pmc_idx_to_pmc(). Signed-off-by: Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/r/20220831085328.45489-8-likexu@tencent.com Signed-off-by: Sean Christopherson <seanjc@google.com>