path: root/arch/arm64/kernel/process.c
Age | Commit message | Author | Files | Lines
2022-12-12 | Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random | Linus Torvalds | 1 | -1/+1

Pull random number generator updates from Jason Donenfeld:

 - Replace prandom_u32_max() and various open-coded variants of it;
   there is now a new family of functions that uses fast rejection
   sampling to choose properly uniformly random numbers within an
   interval:

       get_random_u32_below(ceil) - [0, ceil)
       get_random_u32_above(floor) - (floor, U32_MAX]
       get_random_u32_inclusive(floor, ceil) - [floor, ceil]

   Coccinelle was used to convert all current users of
   prandom_u32_max(), as well as many open-coded patterns, resulting in
   improvements throughout the tree.

   I'll have a "late" 6.1-rc1 pull for you that removes the now unused
   prandom_u32_max() function, just in case any other trees add a new
   use case of it that needs to be converted. According to linux-next,
   there may be two trivial cases of prandom_u32_max() reintroductions
   that are fixable with a 's/.../.../'. So I'll have for you a final
   conversion patch doing that alongside the removal patch during the
   second week.

   This is a treewide change that touches many files throughout.

 - More consistent use of get_random_canary().

 - Updates to comments, documentation, tests, headers, and
   simplification in configuration.

 - The arch_get_random*_early() abstraction was only used by arm64 and
   wasn't entirely useful, so this has been replaced by code that works
   in all relevant contexts.

 - The kernel will use and manage random seeds in non-volatile EFI
   variables, refreshing a variable with a fresh seed when the RNG is
   initialized. The RNG GUID namespace is then hidden from efivarfs to
   prevent accidental leakage. These changes are split into random.c
   infrastructure code used in the EFI subsystem, in this pull request,
   and related support inside of EFISTUB, in Ard's EFI tree. These are
   co-dependent for full functionality, but the order of merging
   doesn't matter.

 - Part of the infrastructure added for the EFI support is also used
   for an improvement to the way vsprintf initializes its siphash key,
   replacing a sleep loop wart.

 - The hardware RNG framework now always calls its correct random.c
   input function, add_hwgenerator_randomness(), rather than sometimes
   going through helpers better suited for other cases.

 - The add_latent_entropy() function has long been called from the fork
   handler, but is a no-op when the latent entropy gcc plugin isn't
   used, which is fine for the purposes of latent entropy. But it was
   missing out on the cycle counter that was also being mixed in beside
   the latent entropy variable. So now, if the latent entropy gcc
   plugin isn't enabled, add_latent_entropy() will expand to a call to
   add_device_randomness(NULL, 0), which adds a cycle counter, without
   the absent latent entropy variable.

 - The RNG is now reseeded from a delayed worker, rather than on demand
   when used. Always running from a worker allows it to make use of the
   CPU RNG on platforms like S390x, whose instructions are too slow to
   do so from interrupts. It also has the effect of adding in new
   inputs more frequently with more regularity, amounting to a long
   term transcript of random values. Plus, it helps a bit with the
   upcoming vDSO implementation (which isn't yet ready for 6.2).

 - The jitter entropy algorithm now tries to execute on many different
   CPUs, round-robining, in hopes of hitting even more memory latencies
   and other unpredictable effects. It also will mix in a cycle counter
   when the entropy timer fires, in addition to being mixed in from the
   main loop, to account more explicitly for fluctuations in that timer
   firing. And the state it touches is now kept within the same cache
   line, so that it's assured that the different execution contexts
   will cause latencies.

* tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits)
  random: include <linux/once.h> in the right header
  random: align entropy_timer_state to cache line
  random: mix in cycle counter when jitter timer fires
  random: spread out jitter callback to different CPUs
  random: remove extraneous period and add a missing one in comments
  efi: random: refresh non-volatile random seed when RNG is initialized
  vsprintf: initialize siphash key using notifier
  random: add back async readiness notifier
  random: reseed in delayed work rather than on-demand
  random: always mix cycle counter in add_latent_entropy()
  hw_random: use add_hwgenerator_randomness() for early entropy
  random: modernize documentation comment on get_random_bytes()
  random: adjust comment to account for removed function
  random: remove early archrandom abstraction
  random: use random.trust_{bootloader,cpu} command line option only
  stackprotector: actually use get_random_canary()
  stackprotector: move get_random_canary() into stackprotector.h
  treewide: use get_random_u32_inclusive() when possible
  treewide: use get_random_u32_{above,below}() instead of manual loop
  treewide: use get_random_u32_below() instead of deprecated function
  ...
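For illustration, a minimal kernel-side sketch of the three interval helpers named above (the function and its pr_info() output are invented for this example; the interval comments restate the documented bounds):

	#include <linux/printk.h>
	#include <linux/random.h>

	static void example_bounded_random(void)
	{
		u32 below = get_random_u32_below(10);       /* uniform in [0, 10) */
		u32 above = get_random_u32_above(100);      /* uniform in (100, U32_MAX] */
		u32 incl  = get_random_u32_inclusive(1, 6); /* uniform in [1, 6], like a die roll */

		pr_info("below=%u above=%u incl=%u\n", below, above, incl);
	}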
2022-11-29 | arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE | Mark Brown | 1 | -0/+2

When we save the state for the floating point registers, this can be done in the form visible through either the FPSIMD V registers or the SVE Z and P registers. At present we track which format is currently used based on TIF_SVE and the SME streaming mode state, but particularly in the SVE case this limits our options for optimising things, especially around syscalls. Introduce a new enum, which we place together with the saved floating point state in both thread_struct and the KVM guest state, that explicitly states which format is active, and keep it up to date when we change it. At present we do not use this state except to verify that it has the expected value when loading the state; future patches will introduce functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221115094640.112848-3-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18 | treewide: use get_random_u32_below() instead of deprecated function | Jason A. Donenfeld | 1 | -1/+1

This is a simple mechanical transformation done by:

	@@
	expression E;
	@@
	- prandom_u32_max
	+ get_random_u32_below
	  (E)

Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Reviewed-by: SeongJae Park <sj@kernel.org> # for damon Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
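Given the one-line diffstat above, the change to this file plausibly lands in its stack randomization helper; a sketch, assuming the call site is arch_align_stack() (shown diff-style, illustrative only):

	unsigned long arch_align_stack(unsigned long sp)
	{
		if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
	-		sp -= prandom_u32_max(PAGE_SIZE);
	+		sp -= get_random_u32_below(PAGE_SIZE);
		return sp & ~0xf;
	}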
2022-10-16 | Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random | Linus Torvalds | 1 | -1/+1

Pull more random number generator updates from Jason Donenfeld:
 "This time with some large scale treewide cleanups.

  The intent of this pull is to clean up the way callers fetch random
  integers. The current rules for doing this right are:

   - If you want a secure or an insecure random u64, use get_random_u64()

   - If you want a secure or an insecure random u32, use get_random_u32()

     The old function prandom_u32() has been deprecated for a while now
     and is just a wrapper around get_random_u32(). Same for
     get_random_int().

   - If you want a secure or an insecure random u16, use get_random_u16()

   - If you want a secure or an insecure random u8, use get_random_u8()

   - If you want secure or insecure random bytes, use get_random_bytes().

     The old function prandom_bytes() has been deprecated for a while
     now and has long been a wrapper around get_random_bytes().

   - If you want a non-uniform random u32, u16, or u8 bounded by a
     certain open interval maximum, use prandom_u32_max()

     I say "non-uniform", because it doesn't do any rejection sampling
     or divisions. Hence, it stays within the prandom_*() namespace, not
     the get_random_*() namespace.

     I'm currently investigating a "uniform" function for 6.2. We'll see
     what comes of that.

  By applying these rules uniformly, we get several benefits:

   - By using prandom_u32_max() with an upper-bound that the compiler
     can prove at compile-time is ≤65536 or ≤256, internally
     get_random_u16() or get_random_u8() is used, which wastes fewer
     batched random bytes, and hence has higher throughput.

   - By using prandom_u32_max() instead of %, when the upper-bound is
     not a constant, division is still avoided, because
     prandom_u32_max() uses a faster multiplication-based trick instead.

   - By using get_random_u16() or get_random_u8() in cases where the
     return value is intended to indeed be a u16 or a u8, we waste fewer
     batched random bytes, and hence have higher throughput.

  This series was originally done by hand while I was on an airplane
  without Internet. Later, Kees and I worked on retroactively figuring
  out what could be done with Coccinelle and what had to be done
  manually, and then we split things up based on that. So while this
  touches a lot of files, the actual amount of code that's hand fiddled
  is comfortably small"

* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  prandom: remove unused functions
  treewide: use get_random_bytes() when possible
  treewide: use get_random_u32() when possible
  treewide: use get_random_{u8,u16}() when possible, part 2
  treewide: use get_random_{u8,u16}() when possible, part 1
  treewide: use prandom_u32_max() when possible, part 2
  treewide: use prandom_u32_max() when possible, part 1
2022-10-11 | treewide: use prandom_u32_max() when possible, part 1 | Jason A. Donenfeld | 1 | -1/+1

Rather than incurring a division or requesting too many random bytes for the given range, use the prandom_u32_max() function, which only takes the minimum required bytes from the RNG and avoids divisions. This was done mechanically with this coccinelle script:

	@basic@
	expression E;
	type T;
	identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
	typedef u64;
	@@
	(
	- ((T)get_random_u32() % (E))
	+ prandom_u32_max(E)
	|
	- ((T)get_random_u32() & ((E) - 1))
	+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
	|
	- ((u64)(E) * get_random_u32() >> 32)
	+ prandom_u32_max(E)
	|
	- ((T)get_random_u32() & ~PAGE_MASK)
	+ prandom_u32_max(PAGE_SIZE)
	)

	@multi_line@
	identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
	identifier RAND;
	expression E;
	@@
	-       RAND = get_random_u32();
	        ... when != RAND
	-       RAND %= (E);
	+       RAND = prandom_u32_max(E);

	// Find a potential literal
	@literal_mask@
	expression LITERAL;
	type T;
	identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
	position p;
	@@
	        ((T)get_random_u32()@p & (LITERAL))

	// Add one to the literal.
	@script:python add_one@
	literal << literal_mask.LITERAL;
	RESULT;
	@@
	value = None
	if literal.startswith('0x'):
	        value = int(literal, 16)
	elif literal[0] in '123456789':
	        value = int(literal, 10)
	if value is None:
	        print("I don't know how to handle %s" % (literal))
	        cocci.include_match(False)
	elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
	        print("Skipping 0x%x for cleanup elsewhere" % (value))
	        cocci.include_match(False)
	elif value & (value + 1) != 0:
	        print("Skipping 0x%x because it's not a power of two minus one" % (value))
	        cocci.include_match(False)
	elif literal.startswith('0x'):
	        coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
	else:
	        coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))

	// Replace the literal mask with the calculated result.
	@plus_one@
	expression literal_mask.LITERAL;
	position literal_mask.p;
	expression add_one.RESULT;
	identifier FUNC;
	@@
	-       (FUNC()@p & (LITERAL))
	+       prandom_u32_max(RESULT)

	@collapse_ret@
	type T;
	identifier VAR;
	expression E;
	@@
	{
	-       T VAR;
	-       VAR = (E);
	-       return VAR;
	+       return E;
	}

	@drop_var@
	type T;
	identifier VAR;
	@@
	{
	-       T VAR;
	        ... when != VAR
	}

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: KP Singh <kpsingh@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
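For context, the division-free bounding that prandom_u32_max() performs (and that the third pattern in the script above matches) is the classic multiply-and-take-high-word trick; a sketch, with the helper name invented here:

	/* Map a 32-bit random value into [0, ep_ro) without a division:
	 * multiply by the (exclusive) upper bound and keep the top 32 bits.
	 * Note this is slightly non-uniform unless ep_ro divides 2^32. */
	static inline u32 bounded_u32(u32 ep_ro)
	{
		return (u32)(((u64)get_random_u32() * ep_ro) >> 32);
	}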
2022-09-11 | kernel: exit: cleanup release_thread() | Kefeng Wang | 1 | -4/+0

Only x86 has its own release_thread(); introduce a new weak release_thread() function to clean up the empty definitions in other ARCHs. Link: https://lkml.kernel.org/r/20220819014406.32266-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Brian Cain <bcain@quicinc.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Huacai Chen <chenhuacai@kernel.org> [LoongArch] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> [csky] Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jonas Bonn <jonas@southpole.se> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Xuerui Wang <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
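The weak-symbol pattern this describes amounts to roughly the following generic fallback (a sketch; an architecture such as x86 keeps a strong definition that overrides it):

	#include <linux/sched.h>

	/* Generic no-op: freed-task cleanup that most architectures don't need.
	 * A strong, arch-specific definition takes precedence at link time. */
	void __weak release_thread(struct task_struct *dead_task)
	{
	}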
2022-06-03 | Merge tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace | Linus Torvalds | 1 | -5/+7

Pull kthread updates from Eric Biederman:
 "This updates init and user mode helper tasks to be ordinary user mode
  tasks.

  Commit 40966e316f86 ("kthread: Ensure struct kthread is present for
  all kthreads") caused init and the user mode helper threads that call
  kernel_execve to have struct kthread allocated for them. This struct
  kthread going away during execve in turn made a use after free of
  struct kthread possible.

  Here, commit 343f4c49f243 ("kthread: Don't allocate kthread_struct
  for init and umh") is enough to fix the use after free and is simple
  enough to be backportable. The rest of the changes pass struct
  kernel_clone_args to clean things up and cause the code to make
  sense.

  In making init and the user mode helper tasks purely user mode tasks
  I ran into two complications. The function task_tick_numa was
  detecting tasks without an mm by testing for the presence of
  PF_KTHREAD. The initramfs code in populate_initrd_image was using
  flush_delayed_fput to ensure the closing of all its file descriptors
  was complete, and flush_delayed_fput does not work in a userspace
  thread.

  I have looked and looked for more complications, and in my code
  review I have not found any, and neither has anyone else with the
  code sitting in linux-next"

* tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  sched: Update task_tick_numa to ignore tasks without an mm
  fork: Stop allowing kthreads to call execve
  fork: Explicitly set PF_KTHREAD
  init: Deal with the init process being a user mode process
  fork: Generalize PF_IO_WORKER handling
  fork: Explicity test for idle tasks in copy_thread
  fork: Pass struct kernel_clone_args into copy_thread
  kthread: Don't allocate kthread_struct for init and umh
2022-05-25 | Merge back reboot/poweroff notifiers rework for 5.19-rc1. | Rafael J. Wysocki | 1 | -2/+1
2022-05-19 | arm64: Use do_kernel_power_off() | Dmitry Osipenko | 1 | -2/+1

The kernel now supports chained power-off handlers. Use do_kernel_power_off(), which invokes the chained power-off handlers. It also invokes the legacy pm_power_off() for now, which will be removed once all drivers are converted to the new sys-off API. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-07 | fork: Generalize PF_IO_WORKER handling | Eric W. Biederman | 1 | -4/+3

Add fn and fn_arg members into struct kernel_clone_args and test for them in copy_thread (instead of testing for PF_KTHREAD | PF_IO_WORKER). This allows any task that wants to be a user space task that only runs in kernel mode to use this functionality. The code on x86 is an exception and still retains a PF_KTHREAD test, because x86, unlike everything else, handles kthreads slightly differently than user space tasks that start with a function. The functions that created tasks that start with a function have been updated to set ".fn" and ".fn_arg" instead of ".stack" and ".stack_size". These functions are fork_idle(), create_io_thread(), kernel_thread(), and user_mode_thread(). Link: https://lkml.kernel.org/r/20220506141512.516114-4-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
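On arm64, the resulting copy_thread() shape is plausibly something like the sketch below (the x19/x20 convention reflects how arm64 has historically passed a kthread's start function and argument; treat the details as assumptions):

	int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
	{
		if (args->fn) {
			/* kthread, or user task starting life in kernel mode:
			 * ret_from_fork will call args->fn(args->fn_arg). */
			p->thread.cpu_context.x19 = (unsigned long)args->fn;
			p->thread.cpu_context.x20 = (unsigned long)args->fn_arg;
		} else {
			/* ordinary fork(): child resumes from its saved user regs */
		}
		return 0;
	}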
2022-05-07 | fork: Pass struct kernel_clone_args into copy_thread | Eric W. Biederman | 1 | -2/+5
With io_uring we have started supporting tasks that are for most purposes user space tasks that exclusively run code in kernel mode. The kernel task that exec's init and tasks that exec user mode helpers are also user mode tasks that just run kernel code until they call kernel execve. Pass kernel_clone_args into copy_thread so these oddball tasks can be supported more cleanly and easily. v2: Fix spelling of kenrel_clone_args on h8300 Link: https://lkml.kernel.org/r/20220506141512.516114-2-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-04-29 | arm64/sme: Fix NULL check after kzalloc | Wan Jiabing | 1 | -1/+1

Fix the following coccicheck error: ./arch/arm64/kernel/process.c:322:2-23: alloc with no test, possible model on line 326 The pointer tested here should be dst->thread.sve_state. Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME") Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220426113054.630983-1-wanjiabing@vivo.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
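The corrected shape is presumably the usual allocate-then-test idiom, with the test on the pointer that was just assigned; a sketch:

	dst->thread.sve_state = kzalloc(sve_state_size(src), GFP_KERNEL);
	if (!dst->thread.sve_state)
		return -ENOMEM;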
2022-04-22 | arm64/sme: Implement traps and syscall handling for SME | Mark Brown | 1 | -3/+27

By default all SME operations in userspace will trap. When this happens we allocate storage space for the SME register state, set up the SVE registers and disable traps. We do not need to initialize ZA since the architecture guarantees that it will be zeroed when enabled, and when we trap, ZA is disabled. On syscall we exit streaming mode if we were previously in it and ensure that all but the lower 128 bits of the registers are zeroed while preserving the state of ZA. This follows the aarch64 PCS for SME: ZA state is preserved over a function call and streaming mode is exited. Since the traps for SME do not distinguish between streaming mode SVE and ZA usage, if ZA is in use then rather than reenabling traps we instead zero the parts of the SVE registers not shared with FPSIMD and leave SME enabled; this simplifies handling SME traps. If ZA is not in use then we reenable SME traps and fall through to normal handling of SVE. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-17-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 | arm64/sme: Implement SVCR context switching | Mark Brown | 1 | -0/+2

In SME the use of both streaming SVE mode and ZA are tracked through PSTATE.SM and PSTATE.ZA, visible through the system register SVCR. In order to context switch the floating point state for SME we need to context switch the contents of this register as part of context switching the floating point state. Since changing the vector length exits streaming SVE mode and disables ZA, we also make sure we update SVCR appropriately when setting vector length, and similarly ensure that new threads have streaming SVE mode and ZA disabled. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-14-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 | arm64/sme: Implement support for TPIDR2 | Mark Brown | 1 | -2/+12

The Scalable Matrix Extension introduces support for a new thread-specific data register, TPIDR2, intended for use by libc. The kernel must save the value of TPIDR2 on context switch and should ensure that all new threads start off with a default value of 0. Add a field to the thread_struct to store TPIDR2 and context switch it with the other thread-specific data. In case there are future extensions which also use TPIDR2, we introduce system_supports_tpidr2() and use that rather than system_supports_sme() for TPIDR2 handling. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-13-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
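A sketch of what the context-switch hook described above might look like (the register accessors are the standard arm64 sysreg macros; the thread_struct field name is an assumption based on the commit text):

	static void tpidr2_thread_switch(struct task_struct *next)
	{
		if (system_supports_tpidr2()) {
			/* Save the outgoing task's TPIDR2, install the incoming one. */
			current->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
			write_sysreg_s(next->thread.tpidr2_el0, SYS_TPIDR2_EL0);
		}
	}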
2022-03-09 | arm64/mte: Remove asymmetric mode from the prctl() interface | Mark Brown | 1 | -2/+0

As pointed out by Evgenii Stepanov, one potential issue with the new ABI for enabling asymmetric mode is that if there are multiple places where MTE is configured in a process, some of which were compiled with the old prctl.h and some of which were compiled with the new prctl.h, there may be problems keeping track of which MTE modes are requested. For example, some code may disable only the sync and async modes, leaving asymmetric mode enabled, when it intended to fully disable MTE. In order to avoid such mishaps, remove asymmetric mode from the prctl(), instead implicitly allowing it if both sync and async modes are requested. This should not disrupt userspace, since a process requesting both may already see a mix of sync and async modes due to differing defaults between CPUs or changes in the default while the process is running, but it does mean that userspace is unable to explicitly request asymmetric mode without changing the system default for CPUs. Reported-by: Evgenii Stepanov <eugenis@google.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Evgenii Stepanov <eugenis@google.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Branislav Rankov <branislav.rankov@arm.com> Link: https://lore.kernel.org/r/20220309131200.112637-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
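From userspace, the implicit behaviour described above can be exercised by requesting both remaining modes and letting the kernel choose; a hedged sketch (error handling elided):

	#include <sys/prctl.h>
	#include <linux/prctl.h>

	/* Request the tagged-address ABI plus both sync and async tag
	 * checking; on hardware that supports it, the kernel may now pick
	 * asymmetric mode implicitly. */
	static int enable_mte_checks(void)
	{
		return prctl(PR_SET_TAGGED_ADDR_CTRL,
			     PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | PR_MTE_TCF_ASYNC,
			     0, 0, 0);
	}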
2022-02-25 | arm64/mte: Add userspace interface for enabling asymmetric mode | Mark Brown | 1 | -1/+4

The architecture provides an asymmetric mode for MTE where tag mismatches are checked asynchronously for stores but synchronously for loads. Allow userspace processes to select this and make it available as a default mode via the existing per-CPU sysfs interface. Since the PR_MTE_TCF_ values are a bitmask (allowing the kernel to choose between the multiple modes) and there are no free bits adjacent to the existing PR_MTE_TCF_ bits, the set of bits used to specify the mode becomes disjoint. Programs using the new interface should be aware of this, and programs that do not use it will not see any change in behaviour. When userspace requests two possible modes but the system default for the CPU is the third mode (eg, the default is synchronous but userspace requests either asynchronous or asymmetric), the preference order is: ASYMM > ASYNC > SYNC. This situation is not currently possible since there are only two modes and it is mandatory to have a system default, so there could be no ambiguity and there is no ABI change. The chosen order is basically arbitrary, as we do not have a clear metric for what is better here. If userspace specifically requests asymmetric mode via the prctl() and the system does not support it, then we will return an error; this mirrors how we handle the case where userspace enables MTE on a system that does not support MTE at all, and the behaviour that will be seen if running on an older kernel that does not support userspace use of asymmetric mode. Attempts to set asymmetric mode as the default mode will result in an error if the system does not support it. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com> Tested-by: Branislav Rankov <branislav.rankov@arm.com> Link: https://lore.kernel.org/r/20220216173224.2342152-5-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-01-05 | Merge branches 'for-next/misc', 'for-next/cache-ops-dzp', 'for-next/stacktrace', 'for-next/xor-neon', 'for-next/kasan', 'for-next/armv8_7-fp', 'for-next/atomics', 'for-next/bti', 'for-next/sve', 'for-next/kselftest' and 'for-next/kcsan', remote-tracking branch 'arm64/for-next/perf' into for-next/core | Catalin Marinas | 1 | -41/+43

* arm64/for-next/perf: (32 commits)
  arm64: perf: Don't register user access sysctl handler multiple times
  drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check
  perf/smmuv3: Fix unused variable warning when CONFIG_OF=n
  arm64: perf: Support new DT compatibles
  arm64: perf: Simplify registration boilerplate
  arm64: perf: Support Denver and Carmel PMUs
  drivers/perf: hisi: Add driver for HiSilicon PCIe PMU
  docs: perf: Add description for HiSilicon PCIe PMU driver
  dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings
  drivers: perf: Add LLC-TAD perf counter support
  perf/smmuv3: Synthesize IIDR from CoreSight ID registers
  perf/smmuv3: Add devicetree support
  dt-bindings: Add Arm SMMUv3 PMCG binding
  perf/arm-cmn: Add debugfs topology info
  perf/arm-cmn: Add CI-700 Support
  dt-bindings: perf: arm-cmn: Add CI-700
  perf/arm-cmn: Support new IP features
  perf/arm-cmn: Demarcate CMN-600 specifics
  perf/arm-cmn: Move group validation data off-stack
  perf/arm-cmn: Optimise DTC counter accesses
  ...

* for-next/misc:
  : Miscellaneous patches
  arm64: Use correct method to calculate nomap region boundaries
  arm64: Drop outdated links in comments
  arm64: errata: Fix exec handling in erratum 1418040 workaround
  arm64: Unhash early pointer print plus improve comment
  asm-generic: introduce io_stop_wc() and add implementation for ARM64
  arm64: remove __dma_*_area() aliases
  docs/arm64: delete a space from tagged-address-abi
  arm64/fp: Add comments documenting the usage of state restore functions
  arm64: mm: Use asid feature macro for cheanup
  arm64: mm: Rename asid2idx() to ctxid2asid()
  arm64: kexec: reduce calls to page_address()
  arm64: extable: remove unused ex_handler_t definition
  arm64: entry: Use SDEI event constants
  arm64: Simplify checking for populated DT
  arm64/kvm: Fix bitrotted comment for SVE handling in handle_exit.c

* for-next/cache-ops-dzp:
  : Avoid DC instructions when DCZID_EL0.DZP == 1
  arm64: mte: DC {GVA,GZVA} shouldn't be used when DCZID_EL0.DZP == 1
  arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1

* for-next/stacktrace:
  : Unify the arm64 unwind code
  arm64: Make some stacktrace functions private
  arm64: Make dump_backtrace() use arch_stack_walk()
  arm64: Make profile_pc() use arch_stack_walk()
  arm64: Make return_address() use arch_stack_walk()
  arm64: Make __get_wchan() use arch_stack_walk()
  arm64: Make perf_callchain_kernel() use arch_stack_walk()
  arm64: Mark __switch_to() as __sched
  arm64: Add comment for stack_info::kr_cur
  arch: Make ARCH_STACKWALK independent of STACKTRACE

* for-next/xor-neon:
  : Use SHA3 instructions to speed up XOR
  arm64/xor: use EOR3 instructions when available

* for-next/kasan:
  : Log potential KASAN shadow aliases
  arm64: mm: log potential KASAN shadow alias
  arm64: mm: use die_kernel_fault() in do_mem_abort()

* for-next/armv8_7-fp:
  : Add HWCAPS for ARMv8.7 FEAT_AFP and FEAT_RPRES
  arm64: cpufeature: add HWCAP for FEAT_RPRES
  arm64: add ID_AA64ISAR2_EL1 sys register
  arm64: cpufeature: add HWCAP for FEAT_AFP

* for-next/atomics:
  : arm64 atomics clean-ups and codegen improvements
  arm64: atomics: lse: define RETURN ops in terms of FETCH ops
  arm64: atomics: lse: improve constraints for simple ops
  arm64: atomics: lse: define ANDs in terms of ANDNOTs
  arm64: atomics lse: define SUBs in terms of ADDs
  arm64: atomics: format whitespace consistently

* for-next/bti:
  : BTI clean-ups
  arm64: Ensure that the 'bti' macro is defined where linkage.h is included
  arm64: Use BTI C directly and unconditionally
  arm64: Unconditionally override SYM_FUNC macros
  arm64: Add macro version of the BTI instruction
  arm64: ftrace: add missing BTIs
  arm64: kexec: use __pa_symbol(empty_zero_page)
  arm64: update PAC description for kernel

* for-next/sve:
  : SVE code clean-ups and refactoring in preparation of Scalable Matrix Extensions
  arm64/sve: Minor clarification of ABI documentation
  arm64/sve: Generalise vector length configuration prctl() for SME
  arm64/sve: Make sysctl interface for SVE reusable by SME

* for-next/kselftest:
  : arm64 kselftest additions
  kselftest/arm64: Add pidbench for floating point syscall cases
  kselftest/arm64: Add a test program to exercise the syscall ABI
  kselftest/arm64: Allow signal tests to trigger from a function
  kselftest/arm64: Parameterise ptrace vector length information

* for-next/kcsan:
  : Enable KCSAN for arm64
  arm64: Enable KCSAN
2021-12-22 | arm64: errata: Fix exec handling in erratum 1418040 workaround | D Scott Phillips | 1 | -23/+16
The erratum 1418040 workaround enables CNTVCT_EL1 access trapping in EL0 when executing compat threads. The workaround is applied when switching between tasks, but the need for the workaround could also change at an exec(), when a non-compat task execs a compat binary or vice versa. Apply the workaround in arch_setup_new_exec(). This leaves a small window of time between SET_PERSONALITY and arch_setup_new_exec where preemption could occur and confuse the old workaround logic that compares TIF_32BIT between prev and next. Instead, we can just read cntkctl to make sure it's in the state that the next task needs. I measured cntkctl read time to be about the same as a mov from a general-purpose register on N1. Update the workaround logic to examine the current value of cntkctl instead of the previous task's compat state. Fixes: d49f7d7376d0 ("arm64: Move handling of erratum 1418040 into C code") Cc: <stable@vger.kernel.org> # 5.9.x Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211220234114.3926-1-scott@os.amperecomputing.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
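A sketch of the reworked switch-time check (helper and constant names are borrowed from the pre-existing workaround code and may differ in detail): sysreg_clear_set() reads the current CNTKCTL_EL1 value and only writes it if a change is needed, so no per-task history is required.

	static void erratum_1418040_thread_switch(struct task_struct *next)
	{
		if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) ||
		    !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
			return;

		/* Trap EL0 CNTVCT_EL1 access for compat tasks, allow it otherwise. */
		if (is_compat_thread(task_thread_info(next)))
			sysreg_clear_set(cntkctl_el1, ARCH_TIMER_USR_VCT_ACCESS_EN, 0);
		else
			sysreg_clear_set(cntkctl_el1, 0, ARCH_TIMER_USR_VCT_ACCESS_EN);
	}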
2021-12-10 | arm64: Make __get_wchan() use arch_stack_walk() | Madhavan T. Venkataraman | 1 | -17/+25

To enable RELIABLE_STACKTRACE and LIVEPATCH on arm64, we need to substantially rework arm64's unwinding code. As part of this, we want to minimize the set of unwind interfaces we expose, and avoid open-coding of unwind logic outside of stacktrace.c. Currently, __get_wchan() walks the stack of a blocked task by calling start_backtrace() with the task's saved PC and FP values, and iterating unwind steps using unwind_frame(). The initialization is functionally equivalent to calling arch_stack_walk() with the blocked task, which will start with the task's saved PC and FP values. Currently __get_wchan() always performs an initial unwind step, which will skip __switch_to(), but as this is now marked as a __sched function, this no longer needs special handling and will be skipped in the same way as other sched functions. Make __get_wchan() use arch_stack_walk(). This simplifies __get_wchan(), and in future will allow us to make unwind_frame() private to stacktrace.c. At the same time, we can simplify the try_get_task_stack() check and avoid the unnecessary `stack_page` variable. The change to the skipping logic means we may terminate one frame earlier than previously where there are an excessive number of sched functions in the trace, but this isn't seen in practice, and wchan is best-effort anyway, so this should not be a problem. Other than the above, there should be no functional change as a result of this patch. Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> [Mark: rebase atop wchan changes, elaborate commit message, fix includes] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211129142849.3056714-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
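The resulting implementation is plausibly along these lines (a sketch reconstructed from the description; the walk bound of 16 frames is an assumption):

	struct wchan_info {
		unsigned long pc;
		int count;
	};

	static bool get_wchan_cb(void *arg, unsigned long pc)
	{
		struct wchan_info *wchan_info = arg;

		if (!in_sched_functions(pc)) {
			wchan_info->pc = pc;	/* first non-scheduler caller: done */
			return false;
		}
		return wchan_info->count++ < 16;	/* wchan is best-effort; bound the walk */
	}

	unsigned long __get_wchan(struct task_struct *p)
	{
		struct wchan_info wchan_info = { .pc = 0, .count = 0 };

		if (!try_get_task_stack(p))
			return 0;

		arch_stack_walk(get_wchan_cb, &wchan_info, p, NULL);

		put_task_stack(p);
		return wchan_info.pc;
	}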
2021-12-10 | arm64: Mark __switch_to() as __sched | Mark Rutland | 1 | -1/+2

Unlike most architectures (and only in keeping with powerpc), arm64 has a non-__sched function on the path to our cpu_switch_to() assembly function. It is expected that for a blocked task, in_sched_functions() can be used to skip all functions between the raw context switch assembly and the scheduler functions that call into __switch_to(). This is the behaviour expected by stack_trace_consume_entry_nosched(), and the behaviour we'd like to have such that we can simplify arm64's __get_wchan() implementation to use arch_stack_walk(). This patch marks arm64's __switch_to() as __sched. This *will not* change the behaviour of arm64's current __get_wchan() implementation, which always performs an initial unwind step that skips __switch_to(). This *will* change the behaviour of stack_trace_consume_entry_nosched() and stack_trace_save_tsk() to match their expected behaviour on blocked tasks, skipping all scheduler-internal functions including __switch_to(). Other than the above, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211129142849.3056714-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-10-15 | sched: Add wrapper for get_wchan() to keep task blocked | Kees Cook | 1 | -3/+1

Having a stable wchan means the process must be blocked, and must stay that way while the stack unwinding is performed. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm] Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Link: https://lkml.kernel.org/r/20211008111626.332092234@infradead.org
2021-09-16 | arm64: Mark __stack_chk_guard as __ro_after_init | Dan Li | 1 | -1/+1

__stack_chk_guard is set up once during the init stage and never changed after that. Although modifying this variable at runtime will usually cause the kernel to crash (and so would an attacker), it should be marked as __ro_after_init, and it should not affect performance if it is placed in the ro_after_init section. Signed-off-by: Dan Li <ashimida@linux.alibaba.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1631612642-102881-1-git-send-email-ashimida@linux.alibaba.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
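The change itself is a one-attribute annotation, roughly:

	/* Set once during early init, read-only thereafter. */
	unsigned long __stack_chk_guard __ro_after_init;
	EXPORT_SYMBOL(__stack_chk_guard);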
2021-09-16 | arm64/kernel: remove duplicate include in process.c | Lv Ruyi | 1 | -1/+0
Remove all but the first include of linux/sched.h from process.c Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210902011126.29828-1-lv.ruyi@zte.com.cn Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-03 | Merge tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild | Linus Torvalds | 1 | -3/+0

Pull Kbuild updates from Masahiro Yamada:

 - Add -s option (strict mode) to merge_config.sh to make it fail when
   any symbol is redefined.

 - Show a warning if a different compiler is used for building external
   modules.

 - Infer --target from ARCH for CC=clang to let you cross-compile the
   kernel without CROSS_COMPILE.

 - Make the integrated assembler default (LLVM_IAS=1) for CC=clang.

 - Add <linux/stdarg.h> to the kernel source instead of borrowing
   <stdarg.h> from the compiler.

 - Add Nick Desaulniers as a Kbuild reviewer.

 - Drop stale cc-option tests.

 - Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
   to handle symbols in inline assembly.

 - Show a warning if 'FORCE' is missing for if_changed rules.

 - Various cleanups

* tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
  kbuild: redo fake deps at include/ksym/*.h
  kbuild: clean up objtool_args slightly
  modpost: get the *.mod file path more simply
  checkkconfigsymbols.py: Fix the '--ignore' option
  kbuild: merge vmlinux_link() between ARCH=um and other architectures
  kbuild: do not remove 'linux' link in scripts/link-vmlinux.sh
  kbuild: merge vmlinux_link() between the ordinary link and Clang LTO
  kbuild: remove stale *.symversions
  kbuild: remove unused quiet_cmd_update_lto_symversions
  gen_compile_commands: extract compiler command from a series of commands
  x86: remove cc-option-yn test for -mtune=
  arc: replace cc-option-yn uses with cc-option
  s390: replace cc-option-yn uses with cc-option
  ia64: move core-y in arch/ia64/Makefile to arch/ia64/Kbuild
  sparc: move the install rule to arch/sparc/Makefile
  security: remove unneeded subdir-$(CONFIG_...)
  kbuild: sh: remove unused install script
  kbuild: Fix 'no symbols' warning when CONFIG_TRIM_UNUSD_KSYMS=y
  kbuild: Switch to 'f' variants of integrated assembler flag
  kbuild: Shuffle blank line to improve comment meaning
  ...
2021-08-31 | Merge remote-tracking branch 'tip/sched/arm64' into for-next/core | Catalin Marinas | 1 | -11/+36

* tip/sched/arm64: (785 commits)
  Documentation: arm64: describe asymmetric 32-bit support
  arm64: Remove logic to kill 32-bit tasks on 64-bit-only cores
  arm64: Hook up cmdline parameter to allow mismatched 32-bit EL0
  arm64: Advertise CPUs capable of running 32-bit applications in sysfs
  arm64: Prevent offlining first CPU with 32-bit EL0 on mismatched system
  arm64: exec: Adjust affinity for compat tasks with mismatched 32-bit EL0
  arm64: Implement task_cpu_possible_mask()
  sched: Introduce dl_task_check_affinity() to check proposed affinity
  sched: Allow task CPU affinity to be restricted on asymmetric systems
  sched: Split the guts of sched_setaffinity() into a helper function
  sched: Introduce task_struct::user_cpus_ptr to track requested affinity
  sched: Reject CPU affinity changes based on task_cpu_possible_mask()
  cpuset: Cleanup cpuset_cpus_allowed_fallback() use in select_fallback_rq()
  cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()
  cpuset: Don't use the cpu_possible_mask as a last resort for cgroup v1
  sched: Introduce task_cpu_possible_mask() to limit fallback rq selection
  sched: Cgroup SCHED_IDLE support
  sched/topology: Skip updating masks for non-online nodes
  Linux 5.14-rc6
  lib: use PFN_PHYS() in devmem_is_allowed()
  ...
2021-08-26 | Merge branches 'for-next/mte', 'for-next/misc' and 'for-next/kselftest', remote-tracking branch 'arm64/for-next/perf' into for-next/core | Catalin Marinas | 1 | -17/+14

* arm64/for-next/perf:
  arm64/perf: Replace '0xf' instances with ID_AA64DFR0_PMUVER_IMP_DEF

* for-next/mte:
  : Miscellaneous MTE improvements.
  arm64/cpufeature: Optionally disable MTE via command-line
  arm64: kasan: mte: remove redundant mte_report_once logic
  arm64: kasan: mte: use a constant kernel GCR_EL1 value
  arm64: avoid double ISB on kernel entry
  arm64: mte: optimize GCR_EL1 modification on kernel entry/exit
  Documentation: document the preferred tag checking mode feature
  arm64: mte: introduce a per-CPU tag checking mode preference
  arm64: move preemption disablement to prctl handlers
  arm64: mte: change ASYNC and SYNC TCF settings into bitfields
  arm64: mte: rename gcr_user_excl to mte_ctrl
  arm64: mte: avoid TFSRE0_EL1 related operations unless in async mode

* for-next/misc:
  : Miscellaneous updates.
  arm64: Do not trap PMSNEVFR_EL1
  arm64: mm: fix comment typo of pud_offset_phys()
  arm64: signal32: Drop pointless call to sigdelsetmask()
  arm64/sve: Better handle failure to allocate SVE register storage
  arm64: Document the requirement for SCR_EL3.HCE
  arm64: head: avoid over-mapping in map_memory
  arm64/sve: Add a comment documenting the binutils needed for SVE asm
  arm64/sve: Add some comments for sve_save/load_state()
  arm64: replace in_irq() with in_hardirq()
  arm64: mm: Fix TLBI vs ASID rollover
  arm64: entry: Add SYM_CODE annotation for __bad_stack
  arm64: fix typo in a comment
  arm64: move the (z)install rules to arch/arm64/Makefile
  arm64/sve: Make fpsimd_bind_task_to_cpu() static
  arm64: unnecessary end 'return;' in void functions
  arm64/sme: Document boot requirements for SME
  arm64: use __func__ to get function name in pr_err
  arm64: SSBS/DIT: print SSBS and DIT bit when printing PSTATE
  arm64: cpufeature: Use defined macro instead of magic numbers
  arm64/kexec: Test page size support with new TGRAN range values

* for-next/kselftest:
  : Kselftest additions for arm64.
  kselftest/arm64: signal: Add a TODO list for signal handling tests
  kselftest/arm64: signal: Add test case for SVE register state in signals
  kselftest/arm64: signal: Verify that signals can't change the SVE vector length
  kselftest/arm64: signal: Check SVE signal frame shows expected vector length
  kselftest/arm64: signal: Support signal frames with SVE register data
  kselftest/arm64: signal: Add SVE to the set of features we can check for
  kselftest/arm64: pac: Fix skipping of tests on systems without PAC
  kselftest/arm64: mte: Fix misleading output when skipping tests
  kselftest/arm64: Add a TODO list for floating point tests
  kselftest/arm64: Add tests for SVE vector configuration
  kselftest/arm64: Validate vector lengths are set in sve-probe-vls
  kselftest/arm64: Provide a helper binary and "library" for SVE RDVL
  kselftest/arm64: Ignore check_gcr_el1_cswitch binary
2021-08-20 | arm64: Remove logic to kill 32-bit tasks on 64-bit-only cores | Will Deacon | 1 | -13/+1
The scheduler now knows enough about these braindead systems to place 32-bit tasks accordingly, so throw out the safety checks and allow the ret-to-user path to avoid do_notify_resume() if there is nothing to do. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210730112443.23245-16-will@kernel.org
2021-08-20 | arm64: exec: Adjust affinity for compat tasks with mismatched 32-bit EL0 | Will Deacon | 1 | -1/+38
When exec'ing a 32-bit task on a system with mismatched support for 32-bit EL0, try to ensure that it starts life on a CPU that can actually run it. Similarly, when exec'ing a 64-bit task on such a system, try to restore the old affinity mask if it was previously restricted. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Reviewed-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20210730112443.23245-12-will@kernel.org
2021-08-19 | isystem: trim/fixup stdarg.h and other headers | Alexey Dobriyan | 1 | -3/+0

Delete/fix up a few includes in anticipation of the removal of the global -isystem compile option. Note: crypto/aegis128-neon-inner.c keeps <stddef.h> due to a uintptr_t redefinition error (one definition comes from <stddef.h>, another from <linux/types.h>). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-07-30 | arm64: SSBS/DIT: print SSBS and DIT bit when printing PSTATE | Lingyan Huang | 1 | -3/+7

The current code to print PSTATE when generating backtraces does not include the SSBS and DIT bits, so add this information. Cc: Vladimir Murzin <vladimir.murzin@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Lingyan Huang <huanglingyan2@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/1626920436-54816-1-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-07-28 | arm64: move preemption disablement to prctl handlers | Peter Collingbourne | 1 | -14/+7
In the next patch, we will start reading sctlr_user from mte_update_sctlr_user and subsequently writing a new value based on the task's TCF setting and potentially the per-CPU TCF preference. This means that we need to be careful to disable preemption around any code sequences that read from sctlr_user and subsequently write to sctlr_user and/or SCTLR_EL1, so that we don't end up writing a stale value (based on the previous CPU's TCF preference) to either of them. We currently have four such sequences, in the prctl handlers for PR_SET_TAGGED_ADDR_CTRL and PR_PAC_SET_ENABLED_KEYS, as well as in the task initialization code that resets the prctl settings. Change the prctl handlers to disable preemption in the handlers themselves rather than the functions that they call, and change the task initialization code to call the respective prctl handlers instead of setting sctlr_user directly. As a result of this change, we no longer need the helper function set_task_sctlr_el1, nor does its behavior make sense any more, so remove it. Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://linux-review.googlesource.com/id/Ic0e8a0c00bb47d786c1e8011df0b7fe99bee4bb5 Link: https://lore.kernel.org/r/20210727205300.2554659-4-pcc@google.com Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
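The pattern being consolidated into the prctl handlers looks roughly like this (update_sctlr_el1() stands in for whatever helper applies the new value; treat the names as assumptions):

	/* The read-modify-write of sctlr_user and SCTLR_EL1 must not be
	 * preempted, or a stale per-CPU TCF preference could be applied. */
	preempt_disable();
	task->thread.sctlr_user = new_sctlr;
	if (task == current)
		update_sctlr_el1(new_sctlr);	/* hypothetical helper */
	preempt_enable();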
2021-07-06 | Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm | Linus Torvalds | 1 | -6/+1

Pull ARM development updates from Russell King:

 - Make it clear __swp_entry_to_pte() uses PTE_TYPE_FAULT

 - Updates for setting vmalloc size via command line to resolve an
   issue with the 8MiB hole not properly being accounted for, and clean
   up the code.

 - ftrace support for module PLTs

 - Spelling fixes

 - kbuild updates for removing generated files and pattern rules for
   generating files

 - Clang/llvm updates

 - Change the way the kernel is mapped, placing it in vmalloc space
   instead.

 - Remove arm_pm_restart from arm and aarch64.

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (29 commits)
  ARM: 9098/1: ftrace: MODULE_PLT: Fix build problem without DYNAMIC_FTRACE
  ARM: 9097/1: mmu: Declare section start/end correctly
  ARM: 9096/1: Remove arm_pm_restart()
  ARM: 9095/1: ARM64: Remove arm_pm_restart()
  ARM: 9094/1: Register with kernel restart handler
  ARM: 9093/1: drivers: firmware: psci: Register with kernel restart handler
  ARM: 9092/1: xen: Register with kernel restart handler
  ARM: 9091/1: Revert "mm: qsd8x50: Fix incorrect permission faults"
  ARM: 9090/1: Map the lowmem and kernel separately
  ARM: 9089/1: Define kernel physical section start and end
  ARM: 9088/1: Split KERNEL_OFFSET from PAGE_OFFSET
  ARM: 9087/1: kprobes: test-thumb: fix for LLVM_IAS=1
  ARM: 9086/1: syscalls: use pattern rules to generate syscall headers
  ARM: 9085/1: remove unneeded abi parameter to syscallnr.sh
  ARM: 9084/1: simplify the build rule of mach-types.h
  ARM: 9083/1: uncompress: atags_to_fdt: Spelling s/REturn/Return/
  ARM: 9082/1: [v2] mark prepare_page_table as __init
  ARM: 9079/1: ftrace: Add MODULE_PLTS support
  ARM: 9078/1: Add warn suppress parameter to arm_gen_branch_link()
  ARM: 9077/1: PLT: Move struct plt_entries definition to header
  ...
2021-06-28 | Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux | Linus Torvalds | 1 | -76/+23

Pull arm64 updates from Will Deacon:
 "There's a reasonable amount here and the juicy details are all below.

  It's worth noting that the MTE/KASAN changes strayed outside of our
  usual directories due to core mm changes and some associated changes
  to some other architectures; Andrew asked for us to carry these [1]
  rather than take them via the -mm tree.

  Summary:

   - Optimise SVE switching for CPUs with 128-bit implementations.

   - Fix output format from SVE selftest.

   - Add support for versions v1.2 and 1.3 of the SMC calling
     convention.

   - Allow Pointer Authentication to be configured independently for
     kernel and userspace.

   - PMU driver cleanups for managing IRQ affinity and exposing event
     attributes via sysfs.

   - KASAN optimisations for both hardware tagging (MTE) and out-of-line
     software tagging implementations.

   - Relax frame record alignment requirements to facilitate 8-byte
     alignment with KASAN and Clang.

   - Cleanup of page-table definitions and removal of unused memory
     types.

   - Reduction of ARCH_DMA_MINALIGN back to 64 bytes.

   - Refactoring of our instruction decoding routines and addition of
     some missing encodings.

   - Entry code moved into C and hardened against harmful compiler
     instrumentation.

   - Update booting requirements for the FEAT_HCX feature, added to v8.7
     of the architecture.

   - Fix resume from idle when pNMI is being used.

   - Additional CPU sanity checks for MTE and preparatory changes for
     systems where not all of the CPUs support 32-bit EL0.

   - Update our kernel string routines to the latest Cortex Strings
     implementation.

   - Big cleanup of our cache maintenance routines, which were
     confusingly named and inconsistent in their implementations.

   - Tweak linker flags so that GDB can understand vmlinux when using
     RELR relocations.

   - Boot path cleanups to enable early initialisation of per-cpu
     operations needed by KCSAN.

   - Non-critical fixes and miscellaneous cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (150 commits)
  arm64: tlb: fix the TTL value of tlb_get_level
  arm64: Restrict undef hook for cpufeature registers
  arm64/mm: Rename ARM64_SWAPPER_USES_SECTION_MAPS
  arm64: insn: avoid circular include dependency
  arm64: smp: Bump debugging information print down to KERN_DEBUG
  drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe()
  perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number
  arm64: suspend: Use cpuidle context helpers in cpu_suspend()
  PSCI: Use cpuidle context helpers in psci_cpu_suspend_enter()
  arm64: Convert cpu_do_idle() to using cpuidle context helpers
  arm64: Add cpuidle context save/restore helpers
  arm64: head: fix code comments in set_cpu_boot_mode_flag
  arm64: mm: drop unused __pa(__idmap_text_start)
  arm64: mm: fix the count comments in compute_indices
  arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan
  arm64: mm: Pass original fault address to handle_mm_fault()
  arm64/mm: Drop SECTION_[SHIFT|SIZE|MASK]
  arm64/mm: Use CONT_PMD_SHIFT for ARM64_MEMSTART_SHIFT
  arm64/mm: Drop SWAPPER_INIT_MAP_SIZE
  arm64: Conditionally configure PTR_AUTH key of the kernel.
  ...
2021-06-24 | Merge branch 'for-next/entry' into for-next/core | Will Deacon | 1 | -52/+0

The never-ending entry.S refactoring continues, putting us in a much better place wrt compiler instrumentation whilst moving more of the code into C.

* for-next/entry:
  arm64: idle: don't instrument idle code with KCOV
  arm64: entry: don't instrument entry code with KCOV
  arm64: entry: make NMI entry/exit functions static
  arm64: entry: split SDEI entry
  arm64: entry: split bad stack entry
  arm64: entry: fold el1_inv() into el1h_64_sync_handler()
  arm64: entry: handle all vectors with C
  arm64: entry: template the entry asm functions
  arm64: entry: improve bad_mode()
  arm64: entry: move bad_mode() to entry-common.c
  arm64: entry: consolidate EL1 exception returns
  arm64: entry: organise entry vectors consistently
  arm64: entry: organise entry handlers consistently
  arm64: entry: convert IRQ+FIQ handlers to C
  arm64: entry: add a call_on_irq_stack helper
  arm64: entry: move NMI preempt logic to C
  arm64: entry: move arm64_preempt_schedule_irq to entry-common.c
  arm64: entry: convert SError handlers to C
  arm64: entry: unmask IRQ+FIQ after EL0 handling
  arm64: remove redundant local_daif_mask() in bad_mode()
2021-06-24 | Merge branch 'for-next/cpuidle' into for-next/core | Will Deacon | 1 | -32/+9

Fix resume from idle when pNMI is being used.

* for-next/cpuidle:
  arm64: suspend: Use cpuidle context helpers in cpu_suspend()
  PSCI: Use cpuidle context helpers in psci_cpu_suspend_enter()
  arm64: Convert cpu_do_idle() to using cpuidle context helpers
  arm64: Add cpuidle context save/restore helpers
2021-06-24 | Merge branch 'for-next/cpufeature' into for-next/core | Will Deacon | 1 | -1/+18

Additional CPU sanity checks for MTE and preparatory changes for systems where not all of the CPUs support 32-bit EL0.

* for-next/cpufeature:
  arm64: Restrict undef hook for cpufeature registers
  arm64: Kill 32-bit applications scheduled on 64-bit-only CPUs
  KVM: arm64: Kill 32-bit vCPUs on systems with mismatched EL0 support
  arm64: Allow mismatched 32-bit EL0 support
  arm64: cpuinfo: Split AArch32 registers out into a separate struct
  arm64: Check if GMID_EL1.BS is the same on all CPUs
  arm64: Change the cpuinfo_arm64 member type for some sysregs to u64
2021-06-18 | sched: Introduce task_is_running() | Peter Zijlstra | 1 | -1/+1
Replace a bunch of 'p->state == TASK_RUNNING' with a new helper: task_is_running(p). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210611082838.222401495@infradead.org
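The helper introduced by this series is essentially a one-line predicate (sketched here; the state field was renamed __state later in the same cycle):

	/* True when the task is neither blocked nor exiting. */
	#define task_is_running(task)	(READ_ONCE((task)->state) == TASK_RUNNING)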
2021-06-17 | arm64: Convert cpu_do_idle() to using cpuidle context helpers | Marc Zyngier | 1 | -32/+9
Now that we have helpers that are aware of the pseudo-NMI feature, introduce them to cpu_do_idle(). This allows for some nice cleanup. No functional change intended. Tested-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210615111227.2454465-3-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-06-13 | ARM: 9095/1: ARM64: Remove arm_pm_restart() | Guenter Roeck | 1 | -6/+1
All users of arm_pm_restart() have been converted to use the kernel restart handler. Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-06-11 | arm64: Kill 32-bit applications scheduled on 64-bit-only CPUs | Will Deacon | 1 | -1/+18
Scheduling a 32-bit application on a 64-bit-only CPU is a bad idea. Ensure that 32-bit applications always take the slow-path when returning to userspace on a system with mismatched support at EL0, so that we can avoid trying to run on a 64-bit-only CPU and force a SIGKILL instead. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210608180313.11502-5-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
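A heavily hedged sketch of the slow-path check this describes (the cpumask helper comes from the same series; placement and the warning message are illustrative only):

	/* On the return-to-user slow path: a 32-bit task that was scheduled
	 * onto a 64-bit-only CPU cannot make progress, so kill it. */
	if (is_compat_task() &&
	    !cpumask_test_cpu(raw_smp_processor_id(), system_32bit_el0_cpumask())) {
		pr_warn_ratelimited("Killing 32-bit task on 64-bit-only CPU\n");
		force_sig(SIGKILL);
	}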
2021-06-07 | arm64: idle: don't instrument idle code with KCOV | Mark Rutland | 1 | -57/+0

The low-level idle code in arch_cpu_idle() and its callees runs at a time when portions of the kernel environment aren't available. For example, RCU may not be watching, and lockdep state may be out-of-sync with the hardware. Due to this, it is not sound to instrument this code. We generally avoid instrumentation by marking the entry functions as `noinstr`, but currently this doesn't inhibit KCOV instrumentation. Prevent this by factoring these functions into a new idle.c so that we can disable KCOV for the entire compilation unit, as is done for the core idle code in kernel/sched/idle.c. We'd like to keep instrumentation of the rest of process.c, and for the existing code in cpuidle.c, so a new compilation unit is preferable. The arch_cpu_idle_dead() function in process.c is a cpu hotplug function that is safe to instrument, so it is left as-is in process.c. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-21-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: move arm64_preempt_schedule_irq to entry-common.cMark Rutland1-17/+0
Subsequent patches will pull more of the IRQ entry handling into C. To keep this in one place, let's move arm64_preempt_schedule_irq() into entry-common.c along with the other entry management functions. We no longer need to include <linux/lockdep.h> in process.c, so the include directive is removed. There should be no functional change as a result of this patch.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Implement stack trace termination recordMadhavan T. Venkataraman1-0/+5
Reliable stacktracing requires that we identify when a stacktrace is terminated early. We can do this by ensuring all tasks have a final frame record at a known location on their task stack, and checking that this is the final frame record in the chain.

We'd like to use task_pt_regs(task)->stackframe as the final frame record, as this is already set up upon exception entry from EL0. For kernel tasks we need to consistently reserve the pt_regs and point x29 at this, which we can do with small changes to __primary_switched, __secondary_switched, and copy_process().

Since the final frame record must be at a specific location, we must create the final frame record in __primary_switched and __secondary_switched rather than leaving this to start_kernel and secondary_start_kernel. Thus, __primary_switched and __secondary_switched will now show up in stacktraces for the idle tasks.

Since the final frame record is now identified by its location rather than by its contents, we identify it at the start of unwind_frame(), before we read any values from it.

External debuggers may terminate the stack trace when FP == 0. In the pt_regs->stackframe, the PC is 0 as well. So, stack traces taken in the debugger may print an extra record 0x0 at the end. While this is not pretty, it does no harm, and is a small price to pay for having reliable stack trace termination in the kernel. That said, gdb does not show the extra record, probably because it uses DWARF rather than frame pointers for stack traces.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
[Mark: rebase, use ASM_BUG(), update comments, update commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210510110026.18061-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
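With the terminal record at a fixed location, the unwinder can stop by comparing addresses instead of inspecting frame contents; the check at the top of unwind_frame() is essentially:

    /* Final frame; nothing to unwind */
    if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
        return -ENOENT;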
2021-05-07Merge tag 'arm64-fixes' of ↵Linus Torvalds1-6/+3
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 updates from Catalin Marinas:
"A mix of fixes and clean-ups that turned up too late for the first pull request:

 - Restore terminal stack frame records. Their previous removal caused traces which cross secondary_start_kernel to terminate one entry too late, with a spurious "0" entry.

 - Fix boot warning with pseudo-NMI due to the way we manipulate the PMR register.

 - ACPI fixes: avoid corruption of interrupt mappings on watchdog probe failure (GTDT), prevent unregistering of GIC SGIs.

 - Force SPARSEMEM_VMEMMAP as the only memory model; it saves having to test all the other combinations.

 - Documentation fixes and updates: tagged address ABI exceptions on brk/mmap/mremap(), event stream frequency, update booting requirements on the configuration of traps"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kernel: Update the stale comment
  arm64: Fix the documented event stream frequency
  arm64: entry: always set GIC_PRIO_PSR_I_SET during entry
  arm64: Explicitly document boot requirements for SVE
  arm64: Explicitly require that FPSIMD instructions do not trap
  arm64: Relax booting requirements for configuration of traps
  arm64: cpufeatures: use min and max
  arm64: stacktrace: restore terminal records
  arm64/vdso: Discard .note.gnu.property sections in vDSO
  arm64: doc: Add brk/mmap/mremap() to the Tagged Address ABI Exceptions
  psci: Remove unneeded semicolon
  ACPI: irq: Prevent unregistering of GIC SGIs
  ACPI: GTDT: Don't corrupt interrupt mappings on watchdog probe failure
  arm64: Show three registers per line
  arm64: remove HAVE_DEBUG_BUGVERBOSE
  arm64: alternative: simplify passing alt_region
  arm64: Force SPARSEMEM_VMEMMAP as the only memory management model
  arm64: vdso32: drop -no-integrated-as flag
2021-04-26Merge tag 'arm64-upstream' of ↵Linus Torvalds1-3/+32
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - MTE asynchronous support for KASan. Previously only synchronous (slower) mode was supported. Asynchronous is faster but does not allow precise identification of the illegal access.

 - Run kernel mode SIMD with softirqs disabled. This allows using NEON in softirq context for crypto performance improvements. The conditional yield support is modified to take softirqs into account and reduce the latency.

 - Preparatory patches for Apple M1: handle CPUs that only have the VHE mode available (host kernel running at EL2), add FIQ support.

 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers, new functions for the HiSilicon HHA and L3C PMU, cleanups.

 - Re-introduce support for execute-only user permissions but only when the EPAN (Enhanced Privileged Access Never) architecture feature is available.

 - Disable fine-grained traps at boot and improve the documented boot requirements.

 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).

 - Add hierarchical eXecute Never permissions for all page tables.

 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs to control which PAC keys are enabled in a particular task.

 - arm64 kselftests for BTI and some improvements to the MTE tests.

 - Minor improvements to the compat vdso and sigpage.

 - Miscellaneous cleanups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (86 commits)
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
  kasan, arm64: tests supports for HW_TAGS async mode
  arm64: mte: Report async tag faults before suspend
  arm64: mte: Enable async tag check fault
  arm64: mte: Conditionally compile mte_enable_kernel_*()
  arm64: mte: Enable TCO in functions that can read beyond buffer limits
  kasan: Add report for async mode
  arm64: mte: Drop arch_enable_tagging()
  kasan: Add KASAN mode kernel parameter
  arm64: mte: Add asynchronous mode support
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  ...
2021-04-23arm64: Show three registers per lineMatthew Wilcox (Oracle)1-6/+3
Displaying two registers per line takes 15 lines. That improves to just 10 lines if we display three registers per line, which reduces the amount of information lost when oopses are cut off. It stays within 80 columns and matches x86-64.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210420172245.3679077-1-willy@infradead.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
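A sketch of the resulting print loop in __show_regs(), walking from x29 down to x0 three registers at a time (close to, though not necessarily verbatim, the upstream change):

    while (i >= 0) {
        printk("x%-2d: %016llx", i, regs->regs[i]);

        while (i-- % 3)
            pr_cont(" x%-2d: %016llx", i, regs->regs[i]);

        pr_cont("\n");
    }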
2021-04-15Merge branch 'for-next/pac-set-get-enabled-keys' into for-next/coreCatalin Marinas1-2/+31
* for-next/pac-set-get-enabled-keys:
  : Introduce arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS).
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
2021-04-13arm64: pac: Optimize kernel entry/exit key installation code pathsPeter Collingbourne1-0/+1
The kernel does not use any keys besides IA, so we don't need to install IB/DA/DB/GA on kernel exit if we arrange to install them on task switch instead, which we can expect to happen an order of magnitude less often. Furthermore, we can avoid installing the user IA in the case where the user task has IA disabled and just leave the kernel IA installed. This also lets us avoid needing to install IA on kernel entry.

On an Apple M1 under a hypervisor, the overhead of kernel entry/exit has been measured to be reduced by 15.6ns in the case where IA is enabled, and 31.9ns in the case where IA is disabled.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/Ieddf6b580d23c9e0bed45a822dabe72d2ffc9a8e
Link: https://lore.kernel.org/r/2d653d055f38f779937f2b92f8ddd5cf9e4af4f4.1616123271.git.pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
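Given the one-line diffstat in process.c, the change boils down to installing the user keys from the context-switch path; a sketch, assuming the arm64 pointer-auth helper naming:

    /*
     * In __switch_to(): install the incoming task's user keys once per
     * context switch rather than on every kernel exit.
     */
    ptrauth_thread_switch_user(next);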
2021-04-13arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)Peter Collingbourne1-3/+7
This change introduces a prctl that allows the user program to control which PAC keys are enabled in a particular task. The main reason why this is useful is to enable a userspace ABI that uses PAC to sign and authenticate function pointers and other pointers exposed outside of the function, while still allowing binaries conforming to the ABI to interoperate with legacy binaries that do not sign or authenticate pointers. The idea is that a dynamic loader or early startup code would issue this prctl very early, after establishing that a process may load legacy binaries, but before executing any PAC instructions.

This change adds a small amount of overhead to kernel entry and exit due to additional required instruction sequences. On a DragonBoard 845c (Cortex-A75) with the powersave governor, the overhead of similar instruction sequences was measured as 4.9ns when simulating the common case where IA is left enabled, or 43.7ns when simulating the uncommon case where IA is disabled. These numbers can be seen as a worst case, since a more realistic setup would use a better-performing governor and a newer chip that, unlike Cortex-A75, supports PAC and would be expected to be faster.

On an Apple M1 under a hypervisor, the overhead of the entry/exit instruction sequences introduced by this patch was measured as 0.3ns in the case where IA is left enabled, and 33.0ns in the case where IA is disabled.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://linux-review.googlesource.com/id/Ibc41a5e6a76b275efbaa126b31119dc197b927a5
Link: https://lore.kernel.org/r/d6609065f8f40397a4124654eb68c9f490b4d477.1616123271.git.pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
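From userspace, the prctl takes a bitmask of keys to affect and the subset of those keys to enable; a dynamic loader that wants only IA enabled might do roughly:

    #include <sys/prctl.h>
    #include <linux/prctl.h>  /* PR_PAC_* constants (recent uapi headers) */

    /* Affect all four keys; enable only IA, leaving IB/DA/DB disabled. */
    prctl(PR_PAC_SET_ENABLED_KEYS,
          PR_PAC_APIAKEY | PR_PAC_APIBKEY | PR_PAC_APDAKEY | PR_PAC_APDBKEY,
          PR_PAC_APIAKEY, 0, 0);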