88 files changed, 2192 insertions, 700 deletions
diff --git a/Documentation/arm64/amu.rst b/Documentation/arm64/amu.rst new file mode 100644 index 000000000000..5057b11100ed --- /dev/null +++ b/Documentation/arm64/amu.rst @@ -0,0 +1,112 @@ +======================================================= +Activity Monitors Unit (AMU) extension in AArch64 Linux +======================================================= + +Author: Ionela Voinescu <ionela.voinescu@arm.com> + +Date: 2019-09-10 + +This document briefly describes the provision of Activity Monitors Unit +support in AArch64 Linux. + + +Architecture overview +--------------------- + +The activity monitors extension is an optional extension introduced by the +ARMv8.4 CPU architecture. + +The activity monitors unit, implemented in each CPU, provides performance +counters intended for system management use. The AMU extension provides a +system register interface to the counter registers and also supports an +optional external memory-mapped interface. + +Version 1 of the Activity Monitors architecture implements a counter group +of four fixed and architecturally defined 64-bit event counters. + - CPU cycle counter: increments at the frequency of the CPU. + - Constant counter: increments at the fixed frequency of the system + clock. + - Instructions retired: increments with every architecturally executed + instruction. + - Memory stall cycles: counts instruction dispatch stall cycles caused by + misses in the last level cache within the clock domain. + +When in WFI or WFE, these counters do not increment. + +The Activity Monitors architecture provides space for up to 16 architected +event counters. Future versions of the architecture may use this space to +implement additional architected event counters. + +Additionally, version 1 implements a counter group of up to 16 auxiliary +64-bit event counters. + +On cold reset all counters reset to 0. + + +Basic support +------------- + +The kernel can safely run a mix of CPUs with and without support for the +activity monitors extension. Therefore, when CONFIG_ARM64_AMU_EXTN is +selected we unconditionally enable the capability to allow any late CPU +(secondary or hotplugged) to detect and use the feature. + +When the feature is detected on a CPU, we flag the availability of the +feature but this does not guarantee the correct functionality of the +counters, only the presence of the extension. + +Firmware (code running at higher exception levels, e.g. arm-tf) support is +needed to: + - Enable access for lower exception levels (EL2 and EL1) to the AMU + registers. + - Enable the counters. If not enabled these will read as 0. + - Save/restore the counters before/after the CPU is put into/brought up + from the 'off' power state. + +When using kernels that have this feature enabled but boot with broken +firmware, the user may experience panics or lockups when accessing the +counter registers. Even if these symptoms are not observed, the values +returned by the register reads might not correctly reflect reality. Most +commonly, the counters will read as 0, indicating that they are not +enabled. + +If proper support is not provided in firmware, it's best to disable +CONFIG_ARM64_AMU_EXTN. Note that, for security reasons, this does not +bypass the setting of AMUSERENR_EL0 to trap accesses from EL0 (userspace) to +EL1 (kernel). Therefore, firmware should still ensure accesses to AMU registers +are not trapped in EL2/EL3. 
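As a brief illustrative sketch (not part of this patch; it assumes a CPU on which the feature has been detected and working firmware support), kernel code could sample one of the fixed counters with the standard read_sysreg_s() accessor and the cpu_has_amu_feat() helper introduced by this series, using the system register definitions listed below::

    #include <linux/smp.h>
    #include <asm/cpufeature.h>
    #include <asm/sysreg.h>

    /* Read the fixed CPU cycle counter (AMEVCNTR0[0]) on the local CPU. */
    static u64 amu_read_core_cycles(void)
    {
            /* Only meaningful on CPUs flagged as having a usable AMU. */
            if (!cpu_has_amu_feat(smp_processor_id()))
                    return 0;

            return read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0);
    }

Since the system register interface only exposes the local CPU's counters, such a read must run on the CPU being measured (e.g. with preemption disabled).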
+ +The fixed counters of AMUv1 are accessible through the following system +register definitions: + - SYS_AMEVCNTR0_CORE_EL0 + - SYS_AMEVCNTR0_CONST_EL0 + - SYS_AMEVCNTR0_INST_RET_EL0 + - SYS_AMEVCNTR0_MEM_STALL_EL0 + +Auxiliary platform-specific counters can be accessed using +SYS_AMEVCNTR1_EL0(n), where n is a value between 0 and 15. + +Details can be found in: arch/arm64/include/asm/sysreg.h. + + +Userspace access +---------------- + +Currently, access from userspace to the AMU registers is disabled due to: + - Security reasons: they might expose information about code executed in + secure mode. + - Purpose: AMU counters are intended for system management use. + +Also, the presence of the feature is not visible to userspace. + + +Virtualization +-------------- + +Currently, access from userspace (EL0) and kernelspace (EL1) on the KVM +guest side is disabled due to: + - Security reasons: they might expose information about code executed + by other guests or the host. + +Any attempt to access the AMU registers will result in an UNDEFINED +exception being injected into the guest. diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst index 5d78a6f5b0ae..a3f1a47b6f1c 100644 --- a/Documentation/arm64/booting.rst +++ b/Documentation/arm64/booting.rst @@ -248,6 +248,20 @@ Before jumping into the kernel, the following conditions must be met: - HCR_EL2.APK (bit 40) must be initialised to 0b1 - HCR_EL2.API (bit 41) must be initialised to 0b1 + For CPUs with Activity Monitors Unit v1 (AMUv1) extension present: + - If EL3 is present: + CPTR_EL3.TAM (bit 30) must be initialised to 0b0 + CPTR_EL2.TAM (bit 30) must be initialised to 0b0 + AMCNTENSET0_EL0 must be initialised to 0b1111 + AMCNTENSET1_EL0 must be initialised to a platform specific value + having 0b1 set for the corresponding bit for each of the auxiliary + counters present. + - If the kernel is entered at EL1: + AMCNTENSET0_EL0 must be initialised to 0b1111 + AMCNTENSET1_EL0 must be initialised to a platform specific value + having 0b1 set for the corresponding bit for each of the auxiliary + counters present. + The requirements described above for CPU mode, caches, MMUs, architected timers, coherency and system registers apply to all CPUs. All CPUs must enter the kernel in the same exception level. diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst index 5c0c69dc58aa..09cbb4ed2237 100644 --- a/Documentation/arm64/index.rst +++ b/Documentation/arm64/index.rst @@ -6,6 +6,7 @@ ARM64 Architecture :maxdepth: 1 acpi_object_usage + amu arm-acpi booting cpu-feature-registers diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index c6c32fb7f546..6e41c4b62607 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -117,6 +117,7 @@ config ARM64 select HAVE_ALIGNED_STRUCT_PAGE if SLUB select HAVE_ARCH_AUDITSYSCALL select HAVE_ARCH_BITREVERSE + select HAVE_ARCH_COMPILER_H select HAVE_ARCH_HUGE_VMAP select HAVE_ARCH_JUMP_LABEL select HAVE_ARCH_JUMP_LABEL_RELATIVE @@ -280,6 +281,9 @@ config ZONE_DMA32 config ARCH_ENABLE_MEMORY_HOTPLUG def_bool y +config ARCH_ENABLE_MEMORY_HOTREMOVE + def_bool y + config SMP def_bool y @@ -951,11 +955,11 @@ config HOTPLUG_CPU # Common NUMA Features config NUMA - bool "Numa Memory Allocation and Scheduler Support" + bool "NUMA Memory Allocation and Scheduler Support" select ACPI_NUMA if ACPI select OF_NUMA help - Enable NUMA (Non Uniform Memory Access) support. + Enable NUMA (Non-Uniform Memory Access) support. 
The kernel will try to allocate memory used by a CPU on the local memory of the CPU and add some more @@ -1497,6 +1501,9 @@ config ARM64_PTR_AUTH bool "Enable support for pointer authentication" default y depends on !KVM || ARM64_VHE + depends on (CC_HAS_SIGN_RETURN_ADDRESS || CC_HAS_BRANCH_PROT_PAC_RET) && AS_HAS_PAC + depends on CC_IS_GCC || (CC_IS_CLANG && AS_HAS_CFI_NEGATE_RA_STATE) + depends on (!FUNCTION_GRAPH_TRACER || DYNAMIC_FTRACE_WITH_REGS) help Pointer authentication (part of the ARMv8.3 Extensions) provides instructions for signing and authenticating pointers against secret @@ -1504,16 +1511,72 @@ and other attacks. This option enables these instructions at EL0 (i.e. for userspace). - Choosing this option will cause the kernel to initialise secret keys for each process at exec() time, with these keys being context-switched along with the process. + If the compiler supports the -mbranch-protection or + -msign-return-address flag (e.g. GCC 7 or later), then this option + will also cause the kernel itself to be compiled with return address + protection. In this case, and if the target hardware is known to + support pointer authentication, then CONFIG_STACKPROTECTOR can be + disabled with minimal loss of protection. + The feature is detected at runtime. If the feature is not present in hardware, it will not be advertised to userspace/KVM guests, nor will it be enabled. However, KVM guests also require VHE mode, and hence the CONFIG_ARM64_VHE=y option, to use this feature. + If the feature is present on the boot CPU but not on a late CPU, then + the late CPU will be parked. Also, if the boot CPU does not have + address auth and the late CPU has it, then the late CPU will still boot + but with the feature disabled. On such a system, this option should + not be selected. + + This feature works with the FUNCTION_GRAPH_TRACER option only if + DYNAMIC_FTRACE_WITH_REGS is enabled. + +config CC_HAS_BRANCH_PROT_PAC_RET + # GCC 9 or later, clang 8 or later + def_bool $(cc-option,-mbranch-protection=pac-ret+leaf) + +config CC_HAS_SIGN_RETURN_ADDRESS + # GCC 7, 8 + def_bool $(cc-option,-msign-return-address=all) + +config AS_HAS_PAC + def_bool $(as-option,-Wa$(comma)-march=armv8.3-a) + +config AS_HAS_CFI_NEGATE_RA_STATE + def_bool $(as-instr,.cfi_startproc\n.cfi_negate_ra_state\n.cfi_endproc\n) + +endmenu + +menu "ARMv8.4 architectural features" + +config ARM64_AMU_EXTN + bool "Enable support for the Activity Monitors Unit CPU extension" + default y + help + The activity monitors extension is an optional extension introduced + by the ARMv8.4 CPU architecture. This enables support for version 1 + of the activity monitors architecture, AMUv1. + + To enable the use of this extension on CPUs that implement it, say Y. + + Note that for architectural reasons, firmware _must_ implement AMU + support when running on CPUs that present the activity monitors + extension. The required support is present in: + * Version 1.5 and later of the ARM Trusted Firmware + + For kernels that have this configuration enabled but boot with broken + firmware, you may need to say N here until the firmware is fixed. + Otherwise you may experience firmware panics or lockups when + accessing the counter registers. Even if you are not observing these + symptoms, the values returned by the register reads might not + correctly reflect reality. Most commonly, the value read will be 0, + indicating that the counter is not enabled. 
+ endmenu menu "ARMv8.5 architectural features" diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile index dca1a97751ab..f15f92ba53e6 100644 --- a/arch/arm64/Makefile +++ b/arch/arm64/Makefile @@ -65,6 +65,17 @@ stack_protector_prepare: prepare0 include/generated/asm-offsets.h)) endif +ifeq ($(CONFIG_ARM64_PTR_AUTH),y) +branch-prot-flags-$(CONFIG_CC_HAS_SIGN_RETURN_ADDRESS) := -msign-return-address=all +branch-prot-flags-$(CONFIG_CC_HAS_BRANCH_PROT_PAC_RET) := -mbranch-protection=pac-ret+leaf +# -march=armv8.3-a enables the non-NOP instructions for PAC. To avoid the +# compiler generating them, and consequently breaking the single-image contract, +# we pass it only to the assembler. This option is only needed for +# non-integrated assemblers. +branch-prot-flags-$(CONFIG_AS_HAS_PAC) += -Wa,-march=armv8.3-a +KBUILD_CFLAGS += $(branch-prot-flags-y) +endif + ifeq ($(CONFIG_CPU_BIG_ENDIAN), y) KBUILD_CPPFLAGS += -mbig-endian CHECKFLAGS += -D__AARCH64EB__ diff --git a/arch/arm64/crypto/aes-ce.S b/arch/arm64/crypto/aes-ce.S index 45062553467f..1dc5bbbfeed2 100644 --- a/arch/arm64/crypto/aes-ce.S +++ b/arch/arm64/crypto/aes-ce.S @@ -9,8 +9,8 @@ #include <linux/linkage.h> #include <asm/assembler.h> -#define AES_ENTRY(func) SYM_FUNC_START(ce_ ## func) -#define AES_ENDPROC(func) SYM_FUNC_END(ce_ ## func) +#define AES_FUNC_START(func) SYM_FUNC_START(ce_ ## func) +#define AES_FUNC_END(func) SYM_FUNC_END(ce_ ## func) .arch armv8-a+crypto diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S index 8a2faa42b57e..cf618d8f6cec 100644 --- a/arch/arm64/crypto/aes-modes.S +++ b/arch/arm64/crypto/aes-modes.S @@ -51,7 +51,7 @@ SYM_FUNC_END(aes_decrypt_block5x) * int blocks) */ -AES_ENTRY(aes_ecb_encrypt) +AES_FUNC_START(aes_ecb_encrypt) stp x29, x30, [sp, #-16]! mov x29, sp @@ -79,10 +79,10 @@ ST5( st1 {v4.16b}, [x0], #16 ) .Lecbencout: ldp x29, x30, [sp], #16 ret -AES_ENDPROC(aes_ecb_encrypt) +AES_FUNC_END(aes_ecb_encrypt) -AES_ENTRY(aes_ecb_decrypt) +AES_FUNC_START(aes_ecb_decrypt) stp x29, x30, [sp, #-16]! mov x29, sp @@ -110,7 +110,7 @@ ST5( st1 {v4.16b}, [x0], #16 ) .Lecbdecout: ldp x29, x30, [sp], #16 ret -AES_ENDPROC(aes_ecb_decrypt) +AES_FUNC_END(aes_ecb_decrypt) /* @@ -126,7 +126,7 @@ AES_ENDPROC(aes_ecb_decrypt) * u32 const rk2[]); */ -AES_ENTRY(aes_essiv_cbc_encrypt) +AES_FUNC_START(aes_essiv_cbc_encrypt) ld1 {v4.16b}, [x5] /* get iv */ mov w8, #14 /* AES-256: 14 rounds */ @@ -135,7 +135,7 @@ AES_ENTRY(aes_essiv_cbc_encrypt) enc_switch_key w3, x2, x6 b .Lcbcencloop4x -AES_ENTRY(aes_cbc_encrypt) +AES_FUNC_START(aes_cbc_encrypt) ld1 {v4.16b}, [x5] /* get iv */ enc_prepare w3, x2, x6 @@ -167,10 +167,10 @@ AES_ENTRY(aes_cbc_encrypt) .Lcbcencout: st1 {v4.16b}, [x5] /* return iv */ ret -AES_ENDPROC(aes_cbc_encrypt) -AES_ENDPROC(aes_essiv_cbc_encrypt) +AES_FUNC_END(aes_cbc_encrypt) +AES_FUNC_END(aes_essiv_cbc_encrypt) -AES_ENTRY(aes_essiv_cbc_decrypt) +AES_FUNC_START(aes_essiv_cbc_decrypt) stp x29, x30, [sp, #-16]! mov x29, sp @@ -181,7 +181,7 @@ AES_ENTRY(aes_essiv_cbc_decrypt) encrypt_block cbciv, w8, x6, x7, w9 b .Lessivcbcdecstart -AES_ENTRY(aes_cbc_decrypt) +AES_FUNC_START(aes_cbc_decrypt) stp x29, x30, [sp, #-16]! 
mov x29, sp @@ -238,8 +238,8 @@ ST5( st1 {v4.16b}, [x0], #16 ) st1 {cbciv.16b}, [x5] /* return iv */ ldp x29, x30, [sp], #16 ret -AES_ENDPROC(aes_cbc_decrypt) -AES_ENDPROC(aes_essiv_cbc_decrypt) +AES_FUNC_END(aes_cbc_decrypt) +AES_FUNC_END(aes_essiv_cbc_decrypt) /* @@ -249,7 +249,7 @@ AES_ENDPROC(aes_essiv_cbc_decrypt) * int rounds, int bytes, u8 const iv[]) */ -AES_ENTRY(aes_cbc_cts_encrypt) +AES_FUNC_START(aes_cbc_cts_encrypt) adr_l x8, .Lcts_permute_table sub x4, x4, #16 add x9, x8, #32 @@ -276,9 +276,9 @@ AES_ENTRY(aes_cbc_cts_encrypt) st1 {v0.16b}, [x4] /* overlapping stores */ st1 {v1.16b}, [x0] ret -AES_ENDPROC(aes_cbc_cts_encrypt) +AES_FUNC_END(aes_cbc_cts_encrypt) -AES_ENTRY(aes_cbc_cts_decrypt) +AES_FUNC_START(aes_cbc_cts_decrypt) adr_l x8, .Lcts_permute_table sub x4, x4, #16 add x9, x8, #32 @@ -305,7 +305,7 @@ AES_ENTRY(aes_cbc_cts_decrypt) st1 {v2.16b}, [x4] /* overlapping stores */ st1 {v0.16b}, [x0] ret -AES_ENDPROC(aes_cbc_cts_decrypt) +AES_FUNC_END(aes_cbc_cts_decrypt) .section ".rodata", "a" .align 6 @@ -324,7 +324,7 @@ AES_ENDPROC(aes_cbc_cts_decrypt) * int blocks, u8 ctr[]) */ -AES_ENTRY(aes_ctr_encrypt) +AES_FUNC_START(aes_ctr_encrypt) stp x29, x30, [sp, #-16]! mov x29, sp @@ -409,7 +409,7 @@ ST5( st1 {v4.16b}, [x0], #16 ) rev x7, x7 ins vctr.d[0], x7 b .Lctrcarrydone -AES_ENDPROC(aes_ctr_encrypt) +AES_FUNC_END(aes_ctr_encrypt) /* @@ -433,7 +433,7 @@ AES_ENDPROC(aes_ctr_encrypt) uzp1 xtsmask.4s, xtsmask.4s, \tmp\().4s .endm -AES_ENTRY(aes_xts_encrypt) +AES_FUNC_START(aes_xts_encrypt) stp x29, x30, [sp, #-16]! mov x29, sp @@ -518,9 +518,9 @@ AES_ENTRY(aes_xts_encrypt) st1 {v2.16b}, [x4] /* overlapping stores */ mov w4, wzr b .Lxtsencctsout -AES_ENDPROC(aes_xts_encrypt) +AES_FUNC_END(aes_xts_encrypt) -AES_ENTRY(aes_xts_decrypt) +AES_FUNC_START(aes_xts_decrypt) stp x29, x30, [sp, #-16]! 
mov x29, sp @@ -612,13 +612,13 @@ AES_ENTRY(aes_xts_decrypt) st1 {v2.16b}, [x4] /* overlapping stores */ mov w4, wzr b .Lxtsdecctsout -AES_ENDPROC(aes_xts_decrypt) +AES_FUNC_END(aes_xts_decrypt) /* * aes_mac_update(u8 const in[], u32 const rk[], int rounds, * int blocks, u8 dg[], int enc_before, int enc_after) */ -AES_ENTRY(aes_mac_update) +AES_FUNC_START(aes_mac_update) frame_push 6 mov x19, x0 @@ -676,4 +676,4 @@ AES_ENTRY(aes_mac_update) ld1 {v0.16b}, [x23] /* get dg */ enc_prepare w21, x20, x0 b .Lmacloop4x -AES_ENDPROC(aes_mac_update) +AES_FUNC_END(aes_mac_update) diff --git a/arch/arm64/crypto/aes-neon.S b/arch/arm64/crypto/aes-neon.S index 247d34ddaab0..e47d3ec2cfb4 100644 --- a/arch/arm64/crypto/aes-neon.S +++ b/arch/arm64/crypto/aes-neon.S @@ -8,8 +8,8 @@ #include <linux/linkage.h> #include <asm/assembler.h> -#define AES_ENTRY(func) SYM_FUNC_START(neon_ ## func) -#define AES_ENDPROC(func) SYM_FUNC_END(neon_ ## func) +#define AES_FUNC_START(func) SYM_FUNC_START(neon_ ## func) +#define AES_FUNC_END(func) SYM_FUNC_END(neon_ ## func) xtsmask .req v7 cbciv .req v7 diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce-core.S index 084c6a30b03a..6b958dcdf136 100644 --- a/arch/arm64/crypto/ghash-ce-core.S +++ b/arch/arm64/crypto/ghash-ce-core.S @@ -587,20 +587,20 @@ CPU_LE( rev w8, w8 ) * struct ghash_key const *k, u64 dg[], u8 ctr[], * int rounds, u8 tag) */ -ENTRY(pmull_gcm_encrypt) +SYM_FUNC_START(pmull_gcm_encrypt) pmull_gcm_do_crypt 1 -ENDPROC(pmull_gcm_encrypt) +SYM_FUNC_END(pmull_gcm_encrypt) /* * void pmull_gcm_decrypt(int blocks, u8 dst[], const u8 src[], * struct ghash_key const *k, u64 dg[], u8 ctr[], * int rounds, u8 tag) */ -ENTRY(pmull_gcm_decrypt) +SYM_FUNC_START(pmull_gcm_decrypt) pmull_gcm_do_crypt 0 -ENDPROC(pmull_gcm_decrypt) -pmull_gcm_ghash_4x: +SYM_FUNC_START_LOCAL(pmull_gcm_ghash_4x) movi MASK.16b, #0xe1 shl MASK.2d, MASK.2d, #57 @@ -681,9 +681,9 @@ pmull_gcm_ghash_4x: eor XL.16b, XL.16b, T2.16b ret -ENDPROC(pmull_gcm_ghash_4x) +SYM_FUNC_END(pmull_gcm_ghash_4x) -pmull_gcm_enc_4x: +SYM_FUNC_START_LOCAL(pmull_gcm_enc_4x) ld1 {KS0.16b}, [x5] // load upper counter sub w10, w8, #4 sub w11, w8, #3 @@ -746,7 +746,7 @@ pmull_gcm_enc_4x: eor INP3.16b, INP3.16b, KS3.16b ret -ENDPROC(pmull_gcm_enc_4x) +SYM_FUNC_END(pmull_gcm_enc_4x) .section ".rodata", "a" .align 6 diff --git a/arch/arm64/include/asm/asm_pointer_auth.h b/arch/arm64/include/asm/asm_pointer_auth.h new file mode 100644 index 000000000000..ce2a8486992b --- /dev/null +++ b/arch/arm64/include/asm/asm_pointer_auth.h @@ -0,0 +1,65 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ASM_ASM_POINTER_AUTH_H +#define __ASM_ASM_POINTER_AUTH_H + +#include <asm/alternative.h> +#include <asm/asm-offsets.h> +#include <asm/cpufeature.h> +#include <asm/sysreg.h> + +#ifdef CONFIG_ARM64_PTR_AUTH +/* + * The offset of thread.keys_user.ap* exceeds the ldp #imm offset range, + * so use thread.keys_user as the ldp base value and thread.keys_user.ap* + * as the offset. 
+ */ + .macro ptrauth_keys_install_user tsk, tmp1, tmp2, tmp3 + mov \tmp1, #THREAD_KEYS_USER + add \tmp1, \tsk, \tmp1 +alternative_if_not ARM64_HAS_ADDRESS_AUTH + b .Laddr_auth_skip_\@ +alternative_else_nop_endif + ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APIA] + msr_s SYS_APIAKEYLO_EL1, \tmp2 + msr_s SYS_APIAKEYHI_EL1, \tmp3 + ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APIB] + msr_s SYS_APIBKEYLO_EL1, \tmp2 + msr_s SYS_APIBKEYHI_EL1, \tmp3 + ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APDA] + msr_s SYS_APDAKEYLO_EL1, \tmp2 + msr_s SYS_APDAKEYHI_EL1, \tmp3 + ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APDB] + msr_s SYS_APDBKEYLO_EL1, \tmp2 + msr_s SYS_APDBKEYHI_EL1, \tmp3 +.Laddr_auth_skip_\@: +alternative_if ARM64_HAS_GENERIC_AUTH + ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APGA] + msr_s SYS_APGAKEYLO_EL1, \tmp2 + msr_s SYS_APGAKEYHI_EL1, \tmp3 +alternative_else_nop_endif + .endm + + .macro ptrauth_keys_install_kernel tsk, sync, tmp1, tmp2, tmp3 +alternative_if ARM64_HAS_ADDRESS_AUTH + mov \tmp1, #THREAD_KEYS_KERNEL + add \tmp1, \tsk, \tmp1 + ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_KERNEL_KEY_APIA] + msr_s SYS_APIAKEYLO_EL1, \tmp2 + msr_s SYS_APIAKEYHI_EL1, \tmp3 + .if \sync == 1 + isb + .endif +alternative_else_nop_endif + .endm + +#else /* CONFIG_ARM64_PTR_AUTH */ + + .macro ptrauth_keys_install_user tsk, tmp1, tmp2, tmp3 + .endm + + .macro ptrauth_keys_install_kernel tsk, sync, tmp1, tmp2, tmp3 + .endm + +#endif /* CONFIG_ARM64_PTR_AUTH */ + +#endif /* __ASM_ASM_POINTER_AUTH_H */ diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index aca337d79d12..0bff325117b4 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -257,12 +257,6 @@ alternative_endif .endm /* - * mmid - get context id from mm pointer (mm->context.id) - */ - .macro mmid, rd, rn - ldr \rd, [\rn, #MM_CONTEXT_ID] - .endm -/* * read_ctr - read CTR_EL0. 
If the system has mismatched register fields, * provide the system wide safe value from arm64_ftr_reg_ctrel0.sys_val */ @@ -431,6 +425,16 @@ USER(\label, ic ivau, \tmp2) // invalidate I line PoU .endm /* + * reset_amuserenr_el0 - reset AMUSERENR_EL0 if AMUv1 present + */ + .macro reset_amuserenr_el0, tmpreg + mrs \tmpreg, id_aa64pfr0_el1 // Check ID_AA64PFR0_EL1 + ubfx \tmpreg, \tmpreg, #ID_AA64PFR0_AMU_SHIFT, #4 + cbz \tmpreg, .Lskip_\@ // Skip if no AMU present + msr_s SYS_AMUSERENR_EL0, xzr // Disable AMU access from EL0 +.Lskip_\@: + .endm +/* * copy_page - copy src to dest using temp registers t1-t8 */ .macro copy_page dest:req src:req t1:req t2:req t3:req t4:req t5:req t6:req t7:req t8:req diff --git a/arch/arm64/include/asm/checksum.h b/arch/arm64/include/asm/checksum.h index 8d2a7de39744..b6f7bc6da5fb 100644 --- a/arch/arm64/include/asm/checksum.h +++ b/arch/arm64/include/asm/checksum.h @@ -5,7 +5,12 @@ #ifndef __ASM_CHECKSUM_H #define __ASM_CHECKSUM_H -#include <linux/types.h> +#include <linux/in6.h> + +#define _HAVE_ARCH_IPV6_CSUM +__sum16 csum_ipv6_magic(const struct in6_addr *saddr, + const struct in6_addr *daddr, + __u32 len, __u8 proto, __wsum sum); static inline __sum16 csum_fold(__wsum csum) { diff --git a/arch/arm64/include/asm/compiler.h b/arch/arm64/include/asm/compiler.h new file mode 100644 index 000000000000..eece20d2c55f --- /dev/null +++ b/arch/arm64/include/asm/compiler.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ASM_COMPILER_H +#define __ASM_COMPILER_H + +#if defined(CONFIG_ARM64_PTR_AUTH) + +/* + * The EL0/EL1 pointer bits used by a pointer authentication code. + * This is dependent on TBI0/TBI1 being enabled, or bits 63:56 would also apply. + */ +#define ptrauth_user_pac_mask() GENMASK_ULL(54, vabits_actual) +#define ptrauth_kernel_pac_mask() GENMASK_ULL(63, vabits_actual) + +/* Valid for EL0 TTBR0 and EL1 TTBR1 instruction pointers */ +#define ptrauth_clear_pac(ptr) \ + ((ptr & BIT_ULL(55)) ? 
(ptr | ptrauth_kernel_pac_mask()) : \ + (ptr & ~ptrauth_user_pac_mask())) + +#define __builtin_return_address(val) \ + (void *)(ptrauth_clear_pac((unsigned long)__builtin_return_address(val))) + +#endif /* CONFIG_ARM64_PTR_AUTH */ + +#endif /* __ASM_COMPILER_H */ diff --git a/arch/arm64/include/asm/cpu_ops.h b/arch/arm64/include/asm/cpu_ops.h index 86aabf1e0199..d28e8f37d3b4 100644 --- a/arch/arm64/include/asm/cpu_ops.h +++ b/arch/arm64/include/asm/cpu_ops.h @@ -55,12 +55,12 @@ struct cpu_operations { #endif }; -extern const struct cpu_operations *cpu_ops[NR_CPUS]; -int __init cpu_read_ops(int cpu); +int __init init_cpu_ops(int cpu); +extern const struct cpu_operations *get_cpu_ops(int cpu); -static inline void __init cpu_read_bootcpu_ops(void) +static inline void __init init_bootcpu_ops(void) { - cpu_read_ops(0); + init_cpu_ops(0); } #endif /* ifndef __ASM_CPU_OPS_H */ diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 865e0253fc1e..8eb5a088ae65 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -58,7 +58,10 @@ #define ARM64_WORKAROUND_SPECULATIVE_AT_NVHE 48 #define ARM64_HAS_E0PD 49 #define ARM64_HAS_RNG 50 +#define ARM64_HAS_AMU_EXTN 51 +#define ARM64_HAS_ADDRESS_AUTH 52 +#define ARM64_HAS_GENERIC_AUTH 53 -#define ARM64_NCAPS 51 +#define ARM64_NCAPS 54 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 2a746b99e937..afe08251ff95 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -208,6 +208,10 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0; * In some non-typical cases either both (a) and (b), or neither, * should be permitted. This can be described by including neither * or both flags in the capability's type field. + * + * In case of a conflict, the CPU is prevented from booting. If the + * ARM64_CPUCAP_PANIC_ON_CONFLICT flag is specified for the capability, + * then a kernel panic is triggered. */ @@ -240,6 +244,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0; #define ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU ((u16)BIT(4)) /* Is it safe for a late CPU to miss this capability when system has it */ #define ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU ((u16)BIT(5)) +/* Panic when a conflict is detected */ +#define ARM64_CPUCAP_PANIC_ON_CONFLICT ((u16)BIT(6)) /* * CPU errata workarounds that need to be enabled at boot time if one or @@ -279,9 +285,20 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0; /* * CPU feature used early in the boot based on the boot CPU. All secondary - * CPUs must match the state of the capability as detected by the boot CPU. + * CPUs must match the state of the capability as detected by the boot CPU. In + * case of a conflict, a kernel panic is triggered. + */ +#define ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE \ + (ARM64_CPUCAP_SCOPE_BOOT_CPU | ARM64_CPUCAP_PANIC_ON_CONFLICT) + +/* + * CPU feature used early in the boot based on the boot CPU. It is safe for a + * late CPU to have this feature even though the boot CPU hasn't enabled it, + * although the feature will not be used by Linux in this case. If the boot CPU + * has enabled this feature already, then every late CPU must have it. 
*/ -#define ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE ARM64_CPUCAP_SCOPE_BOOT_CPU +#define ARM64_CPUCAP_BOOT_CPU_FEATURE \ + (ARM64_CPUCAP_SCOPE_BOOT_CPU | ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU) struct arm64_cpu_capabilities { const char *desc; @@ -340,18 +357,6 @@ static inline int cpucap_default_scope(const struct arm64_cpu_capabilities *cap) return cap->type & ARM64_CPUCAP_SCOPE_MASK; } -static inline bool -cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap) -{ - return !!(cap->type & ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU); -} - -static inline bool -cpucap_late_cpu_permitted(const struct arm64_cpu_capabilities *cap) -{ - return !!(cap->type & ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU); -} - /* * Generic helper for handling capabilities with multiple (match,enable) pairs * of callbacks, sharing the same capability bit. @@ -390,14 +395,16 @@ unsigned long cpu_get_elf_hwcap2(void); #define cpu_set_named_feature(name) cpu_set_feature(cpu_feature(name)) #define cpu_have_named_feature(name) cpu_have_feature(cpu_feature(name)) -/* System capability check for constant caps */ -static __always_inline bool __cpus_have_const_cap(int num) +static __always_inline bool system_capabilities_finalized(void) { - if (num >= ARM64_NCAPS) - return false; - return static_branch_unlikely(&cpu_hwcap_keys[num]); + return static_branch_likely(&arm64_const_caps_ready); } +/* + * Test for a capability with a runtime check. + * + * Before the capability is detected, this returns false. + */ static inline bool cpus_have_cap(unsigned int num) { if (num >= ARM64_NCAPS) @@ -405,14 +412,53 @@ static inline bool cpus_have_cap(unsigned int num) return test_bit(num, cpu_hwcaps); } +/* + * Test for a capability without a runtime check. + * + * Before capabilities are finalized, this returns false. + * After capabilities are finalized, this is patched to avoid a runtime check. + * + * @num must be a compile-time constant. + */ +static __always_inline bool __cpus_have_const_cap(int num) +{ + if (num >= ARM64_NCAPS) + return false; + return static_branch_unlikely(&cpu_hwcap_keys[num]); +} + +/* + * Test for a capability, possibly with a runtime check. + * + * Before capabilities are finalized, this behaves as cpus_have_cap(). + * After capabilities are finalized, this is patched to avoid a runtime check. + * + * @num must be a compile-time constant. + */ static __always_inline bool cpus_have_const_cap(int num) { - if (static_branch_likely(&arm64_const_caps_ready)) + if (system_capabilities_finalized()) return __cpus_have_const_cap(num); else return cpus_have_cap(num); } +/* + * Test for a capability without a runtime check. + * + * Before capabilities are finalized, this will BUG(). + * After capabilities are finalized, this is patched to avoid a runtime check. + * + * @num must be a compile-time constant. + */ +static __always_inline bool cpus_have_final_cap(int num) +{ + if (system_capabilities_finalized()) + return __cpus_have_const_cap(num); + else + BUG(); +} + static inline void cpus_set_cap(unsigned int num) { if (num >= ARM64_NCAPS) { @@ -447,6 +493,29 @@ cpuid_feature_extract_unsigned_field(u64 features, int field) return cpuid_feature_extract_unsigned_field_width(features, field, 4); } +/* + * Fields that identify the version of the Performance Monitors Extension do + * not follow the standard ID scheme. See ARM DDI 0487E.a page D13-2825, + * "Alternative ID scheme used for the Performance Monitors Extension version". 
+ */ +static inline u64 __attribute_const__ +cpuid_feature_cap_perfmon_field(u64 features, int field, u64 cap) +{ + u64 val = cpuid_feature_extract_unsigned_field(features, field); + u64 mask = GENMASK_ULL(field + 3, field); + + /* Treat IMPLEMENTATION DEFINED functionality as unimplemented */ + if (val == 0xf) + val = 0; + + if (val > cap) { + features &= ~mask; + features |= (cap << field) & mask; + } + + return features; +} + static inline u64 arm64_ftr_mask(const struct arm64_ftr_bits *ftrp) { return (u64)GENMASK(ftrp->shift + ftrp->width - 1, ftrp->shift); @@ -590,15 +659,13 @@ static __always_inline bool system_supports_cnp(void) static inline bool system_supports_address_auth(void) { return IS_ENABLED(CONFIG_ARM64_PTR_AUTH) && - (cpus_have_const_cap(ARM64_HAS_ADDRESS_AUTH_ARCH) || - cpus_have_const_cap(ARM64_HAS_ADDRESS_AUTH_IMP_DEF)); + cpus_have_const_cap(ARM64_HAS_ADDRESS_AUTH); } static inline bool system_supports_generic_auth(void) { return IS_ENABLED(CONFIG_ARM64_PTR_AUTH) && - (cpus_have_const_cap(ARM64_HAS_GENERIC_AUTH_ARCH) || - cpus_have_const_cap(ARM64_HAS_GENERIC_AUTH_IMP_DEF)); + cpus_have_const_cap(ARM64_HAS_GENERIC_AUTH); } static inline bool system_uses_irq_prio_masking(void) @@ -613,11 +680,6 @@ static inline bool system_has_prio_mask_debugging(void) system_uses_irq_prio_masking(); } -static inline bool system_capabilities_finalized(void) -{ - return static_branch_likely(&arm64_const_caps_ready); -} - #define ARM64_BP_HARDEN_UNKNOWN -1 #define ARM64_BP_HARDEN_WA_NEEDED 0 #define ARM64_BP_HARDEN_NOT_REQUIRED 1 @@ -678,6 +740,11 @@ static inline bool cpu_has_hw_af(void) ID_AA64MMFR1_HADBS_SHIFT); } +#ifdef CONFIG_ARM64_AMU_EXTN +/* Check whether the cpu supports the Activity Monitors Unit (AMU) */ +extern bool cpu_has_amu_feat(int cpu); +#endif + #endif /* __ASSEMBLY__ */ #endif diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h index cb29253ae86b..6a395a7e6707 100644 --- a/arch/arm64/include/asm/esr.h +++ b/arch/arm64/include/asm/esr.h @@ -60,7 +60,7 @@ #define ESR_ELx_EC_BKPT32 (0x38) /* Unallocated EC: 0x39 */ #define ESR_ELx_EC_VECTOR32 (0x3A) /* EL2 only */ -/* Unallocted EC: 0x3B */ +/* Unallocated EC: 0x3B */ #define ESR_ELx_EC_BRK64 (0x3C) /* Unallocated EC: 0x3D - 0x3F */ #define ESR_ELx_EC_MAX (0x3F) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 6e5d839f42b5..51c1d9918999 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -267,6 +267,7 @@ /* Hyp Coprocessor Trap Register */ #define CPTR_EL2_TCPAC (1 << 31) +#define CPTR_EL2_TAM (1 << 30) #define CPTR_EL2_TTA (1 << 20) #define CPTR_EL2_TFP (1 << CPTR_EL2_TFP_SHIFT) #define CPTR_EL2_TZ (1 << 8) diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 44a243754c1b..7c7eeeaab9fa 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -36,6 +36,8 @@ */ #define KVM_VECTOR_PREAMBLE (2 * AARCH64_INSN_SIZE) +#define __SMCCC_WORKAROUND_1_SMC_SZ 36 + #ifndef __ASSEMBLY__ #include <linux/mm.h> @@ -75,6 +77,8 @@ extern void __vgic_v3_init_lrs(void); extern u32 __kvm_get_mdcr_el2(void); +extern char __smccc_workaround_1_smc[__SMCCC_WORKAROUND_1_SMC_SZ]; + /* Home-grown __this_cpu_{ptr,read} variants that always work at HYP */ #define __hyp_this_cpu_ptr(sym) \ ({ \ diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 785762860c63..30b0e8d6b895 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ 
-481,7 +481,7 @@ static inline void *kvm_get_hyp_vector(void) int slot = -1; if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR) && data->fn) { - vect = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs_start)); + vect = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs)); slot = data->hyp_vectors_slot; } @@ -510,14 +510,13 @@ static inline int kvm_map_vectors(void) * HBP + HEL2 -> use hardened vectors and use exec mapping */ if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR)) { - __kvm_bp_vect_base = kvm_ksym_ref(__bp_harden_hyp_vecs_start); + __kvm_bp_vect_base = kvm_ksym_ref(__bp_harden_hyp_vecs); __kvm_bp_vect_base = kern_hyp_va(__kvm_bp_vect_base); } if (cpus_have_const_cap(ARM64_HARDEN_EL2_VECTORS)) { - phys_addr_t vect_pa = __pa_symbol(__bp_harden_hyp_vecs_start); - unsigned long size = (__bp_harden_hyp_vecs_end - - __bp_harden_hyp_vecs_start); + phys_addr_t vect_pa = __pa_symbol(__bp_harden_hyp_vecs); + unsigned long size = __BP_HARDEN_HYP_VECS_SZ; /* * Always allocate a spare vector slot, as we don't diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index 4d94676e5a8b..2be67b232499 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -54,6 +54,7 @@ #define MODULES_VADDR (BPF_JIT_REGION_END) #define MODULES_VSIZE (SZ_128M) #define VMEMMAP_START (-VMEMMAP_SIZE - SZ_2M) +#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE) #define PCI_IO_END (VMEMMAP_START - SZ_2M) #define PCI_IO_START (PCI_IO_END - PCI_IO_SIZE) #define FIXADDR_TOP (PCI_IO_START - SZ_2M) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index d79ce6df9e12..68140fdd89d6 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -13,6 +13,7 @@ #define TTBR_ASID_MASK (UL(0xffff) << 48) #define BP_HARDEN_EL2_SLOTS 4 +#define __BP_HARDEN_HYP_VECS_SZ (BP_HARDEN_EL2_SLOTS * SZ_2K) #ifndef __ASSEMBLY__ @@ -23,9 +24,9 @@ typedef struct { } mm_context_t; /* - * This macro is only used by the TLBI code, which cannot race with an - * ASID change and therefore doesn't need to reload the counter using - * atomic64_read. + * This macro is only used by the TLBI and low-level switch_mm() code, + * neither of which can race with an ASID change. We therefore don't + * need to reload the counter using atomic64_read(). 
*/ #define ASID(mm) ((mm)->context.id.counter & 0xffff) @@ -43,7 +44,8 @@ struct bp_hardening_data { #if (defined(CONFIG_HARDEN_BRANCH_PREDICTOR) || \ defined(CONFIG_HARDEN_EL2_VECTORS)) -extern char __bp_harden_hyp_vecs_start[], __bp_harden_hyp_vecs_end[]; + +extern char __bp_harden_hyp_vecs[]; extern atomic_t arm64_el2_vector_last_slot; #endif /* CONFIG_HARDEN_BRANCH_PREDICTOR || CONFIG_HARDEN_EL2_VECTORS */ diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index 3827ff4040a3..ab46187c6300 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -46,6 +46,8 @@ static inline void cpu_set_reserved_ttbr0(void) isb(); } +void cpu_do_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm); + static inline void cpu_switch_mm(pgd_t *pgd, struct mm_struct *mm) { BUG_ON(pgd == swapper_pg_dir); diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h index d39ddb258a04..75d6cd23a679 100644 --- a/arch/arm64/include/asm/page.h +++ b/arch/arm64/include/asm/page.h @@ -21,6 +21,10 @@ extern void __cpu_copy_user_page(void *to, const void *from, extern void copy_page(void *to, const void *from); extern void clear_page(void *to); +#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \ + alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr) +#define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE + #define clear_user_page(addr,vaddr,pg) __cpu_clear_user_page(addr, vaddr) #define copy_user_page(to,from,vaddr,pg) __cpu_copy_user_page(to, from, vaddr) diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h index 2bdbc79bbd01..e7765b62c712 100644 --- a/arch/arm64/include/asm/perf_event.h +++ b/arch/arm64/include/asm/perf_event.h @@ -176,9 +176,10 @@ #define ARMV8_PMU_PMCR_X (1 << 4) /* Export to ETM */ #define ARMV8_PMU_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/ #define ARMV8_PMU_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */ +#define ARMV8_PMU_PMCR_LP (1 << 7) /* Long event counter enable */ #define ARMV8_PMU_PMCR_N_SHIFT 11 /* Number of counters supported */ #define ARMV8_PMU_PMCR_N_MASK 0x1f -#define ARMV8_PMU_PMCR_MASK 0x7f /* Mask for writable bits */ +#define ARMV8_PMU_PMCR_MASK 0xff /* Mask for writable bits */ /* * PMOVSR: counters overflow flag status reg diff --git a/arch/arm64/include/asm/pointer_auth.h b/arch/arm64/include/asm/pointer_auth.h index 7a24bad1a58b..70c47156e54b 100644 --- a/arch/arm64/include/asm/pointer_auth.h +++ b/arch/arm64/include/asm/pointer_auth.h @@ -22,7 +22,7 @@ struct ptrauth_key { * We give each process its own keys, which are shared by all threads. The keys * are inherited upon fork(), and reinitialised upon exec*(). 
*/ -struct ptrauth_keys { +struct ptrauth_keys_user { struct ptrauth_key apia; struct ptrauth_key apib; struct ptrauth_key apda; @@ -30,7 +30,11 @@ struct ptrauth_keys { struct ptrauth_key apga; }; -static inline void ptrauth_keys_init(struct ptrauth_keys *keys) +struct ptrauth_keys_kernel { + struct ptrauth_key apia; +}; + +static inline void ptrauth_keys_init_user(struct ptrauth_keys_user *keys) { if (system_supports_address_auth()) { get_random_bytes(&keys->apia, sizeof(keys->apia)); @@ -50,48 +54,38 @@ do { \ write_sysreg_s(__pki_v.hi, SYS_ ## k ## KEYHI_EL1); \ } while (0) -static inline void ptrauth_keys_switch(struct ptrauth_keys *keys) +static __always_inline void ptrauth_keys_init_kernel(struct ptrauth_keys_kernel *keys) { - if (system_supports_address_auth()) { - __ptrauth_key_install(APIA, keys->apia); - __ptrauth_key_install(APIB, keys->apib); - __ptrauth_key_install(APDA, keys->apda); - __ptrauth_key_install(APDB, keys->apdb); - } + if (system_supports_address_auth()) + get_random_bytes(&keys->apia, sizeof(keys->apia)); +} - if (system_supports_generic_auth()) - __ptrauth_key_install(APGA, keys->apga); +static __always_inline void ptrauth_keys_switch_kernel(struct ptrauth_keys_kernel *keys) +{ + if (system_supports_address_auth()) + __ptrauth_key_install(APIA, keys->apia); } extern int ptrauth_prctl_reset_keys(struct task_struct *tsk, unsigned long arg); -/* - * The EL0 pointer bits used by a pointer authentication code. - * This is dependent on TBI0 being enabled, or bits 63:56 would also apply. - */ -#define ptrauth_user_pac_mask() GENMASK(54, vabits_actual) - -/* Only valid for EL0 TTBR0 instruction pointers */ static inline unsigned long ptrauth_strip_insn_pac(unsigned long ptr) { - return ptr & ~ptrauth_user_pac_mask(); + return ptrauth_clear_pac(ptr); } #define ptrauth_thread_init_user(tsk) \ -do { \ - struct task_struct *__ptiu_tsk = (tsk); \ - ptrauth_keys_init(&__ptiu_tsk->thread.keys_user); \ - ptrauth_keys_switch(&__ptiu_tsk->thread.keys_user); \ -} while (0) - -#define ptrauth_thread_switch(tsk) \ - ptrauth_keys_switch(&(tsk)->thread.keys_user) + ptrauth_keys_init_user(&(tsk)->thread.keys_user) +#define ptrauth_thread_init_kernel(tsk) \ + ptrauth_keys_init_kernel(&(tsk)->thread.keys_kernel) +#define ptrauth_thread_switch_kernel(tsk) \ + ptrauth_keys_switch_kernel(&(tsk)->thread.keys_kernel) #else /* CONFIG_ARM64_PTR_AUTH */ #define ptrauth_prctl_reset_keys(tsk, arg) (-EINVAL) #define ptrauth_strip_insn_pac(lr) (lr) #define ptrauth_thread_init_user(tsk) -#define ptrauth_thread_switch(tsk) +#define ptrauth_thread_init_kernel(tsk) +#define ptrauth_thread_switch_kernel(tsk) #endif /* CONFIG_ARM64_PTR_AUTH */ #endif /* __ASM_POINTER_AUTH_H */ diff --git a/arch/arm64/include/asm/proc-fns.h b/arch/arm64/include/asm/proc-fns.h index a2ce65a0c1fa..0d5d1f0525eb 100644 --- a/arch/arm64/include/asm/proc-fns.h +++ b/arch/arm64/include/asm/proc-fns.h @@ -13,11 +13,9 @@ #include <asm/page.h> -struct mm_struct; struct cpu_suspend_ctx; extern void cpu_do_idle(void); -extern void cpu_do_switch_mm(unsigned long pgd_phys, struct mm_struct *mm); extern void cpu_do_suspend(struct cpu_suspend_ctx *ptr); extern u64 cpu_do_resume(phys_addr_t ptr, u64 idmap_ttbr); diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index e51ef2dc5749..240fe5e5b720 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -148,7 +148,8 @@ struct thread_struct { unsigned long fault_code; /* ESR_EL1 value */ struct debug_info debug; /* 
debugging */ #ifdef CONFIG_ARM64_PTR_AUTH - struct ptrauth_keys keys_user; + struct ptrauth_keys_user keys_user; + struct ptrauth_keys_kernel keys_kernel; #endif }; diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h index a0c8a0b65259..40d5ba029615 100644 --- a/arch/arm64/include/asm/smp.h +++ b/arch/arm64/include/asm/smp.h @@ -23,6 +23,14 @@ #define CPU_STUCK_REASON_52_BIT_VA (UL(1) << CPU_STUCK_REASON_SHIFT) #define CPU_STUCK_REASON_NO_GRAN (UL(2) << CPU_STUCK_REASON_SHIFT) +/* Possible options for __cpu_setup */ +/* Option to setup primary cpu */ +#define ARM64_CPU_BOOT_PRIMARY (1) +/* Option to setup secondary cpus */ +#define ARM64_CPU_BOOT_SECONDARY (2) +/* Option to setup cpus for different cpu run time services */ +#define ARM64_CPU_RUNTIME (3) + #ifndef __ASSEMBLY__ #include <asm/percpu.h> @@ -30,6 +38,7 @@ #include <linux/threads.h> #include <linux/cpumask.h> #include <linux/thread_info.h> +#include <asm/pointer_auth.h> DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number); @@ -87,6 +96,9 @@ asmlinkage void secondary_start_kernel(void); struct secondary_data { void *stack; struct task_struct *task; +#ifdef CONFIG_ARM64_PTR_AUTH + struct ptrauth_keys_kernel ptrauth_key; +#endif long status; }; diff --git a/arch/arm64/include/asm/stackprotector.h b/arch/arm64/include/asm/stackprotector.h index 5884a2b02827..7263e0bac680 100644 --- a/arch/arm64/include/asm/stackprotector.h +++ b/arch/arm64/include/asm/stackprotector.h @@ -15,6 +15,7 @@ #include <linux/random.h> #include <linux/version.h> +#include <asm/pointer_auth.h> extern unsigned long __stack_chk_guard; @@ -26,6 +27,7 @@ extern unsigned long __stack_chk_guard; */ static __always_inline void boot_init_stack_canary(void) { +#if defined(CONFIG_STACKPROTECTOR) unsigned long canary; /* Try to get a semi random initial value. 
*/ @@ -36,6 +38,9 @@ static __always_inline void boot_init_stack_canary(void) current->stack_canary = canary; if (!IS_ENABLED(CONFIG_STACKPROTECTOR_PER_TASK)) __stack_chk_guard = current->stack_canary; +#endif + ptrauth_thread_init_kernel(current); + ptrauth_thread_switch_kernel(current); } #endif /* _ASM_STACKPROTECTOR_H */ diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index b91570ff9db1..ebc622432831 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -386,6 +386,42 @@ #define SYS_TPIDR_EL0 sys_reg(3, 3, 13, 0, 2) #define SYS_TPIDRRO_EL0 sys_reg(3, 3, 13, 0, 3) +/* Definitions for system register interface to AMU for ARMv8.4 onwards */ +#define SYS_AM_EL0(crm, op2) sys_reg(3, 3, 13, (crm), (op2)) +#define SYS_AMCR_EL0 SYS_AM_EL0(2, 0) +#define SYS_AMCFGR_EL0 SYS_AM_EL0(2, 1) +#define SYS_AMCGCR_EL0 SYS_AM_EL0(2, 2) +#define SYS_AMUSERENR_EL0 SYS_AM_EL0(2, 3) +#define SYS_AMCNTENCLR0_EL0 SYS_AM_EL0(2, 4) +#define SYS_AMCNTENSET0_EL0 SYS_AM_EL0(2, 5) +#define SYS_AMCNTENCLR1_EL0 SYS_AM_EL0(3, 0) +#define SYS_AMCNTENSET1_EL0 SYS_AM_EL0(3, 1) + +/* + * Group 0 of activity monitors (architected): + * op0 op1 CRn CRm op2 + * Counter: 11 011 1101 010:n<3> n<2:0> + * Type: 11 011 1101 011:n<3> n<2:0> + * n: 0-15 + * + * Group 1 of activity monitors (auxiliary): + * op0 op1 CRn CRm op2 + * Counter: 11 011 1101 110:n<3> n<2:0> + * Type: 11 011 1101 111:n<3> n<2:0> + * n: 0-15 + */ + +#define SYS_AMEVCNTR0_EL0(n) SYS_AM_EL0(4 + ((n) >> 3), (n) & 7) +#define SYS_AMEVTYPE0_EL0(n) SYS_AM_EL0(6 + ((n) >> 3), (n) & 7) +#define SYS_AMEVCNTR1_EL0(n) SYS_AM_EL0(12 + ((n) >> 3), (n) & 7) +#define SYS_AMEVTYPE1_EL0(n) SYS_AM_EL0(14 + ((n) >> 3), (n) & 7) + +/* AMU v1: Fixed (architecturally defined) activity monitors */ +#define SYS_AMEVCNTR0_CORE_EL0 SYS_AMEVCNTR0_EL0(0) +#define SYS_AMEVCNTR0_CONST_EL0 SYS_AMEVCNTR0_EL0(1) +#define SYS_AMEVCNTR0_INST_RET_EL0 SYS_AMEVCNTR0_EL0(2) +#define SYS_AMEVCNTR0_MEM_STALL SYS_AMEVCNTR0_EL0(3) + #define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0) #define SYS_CNTP_TVAL_EL0 sys_reg(3, 3, 14, 2, 0) @@ -598,6 +634,7 @@ #define ID_AA64PFR0_CSV3_SHIFT 60 #define ID_AA64PFR0_CSV2_SHIFT 56 #define ID_AA64PFR0_DIT_SHIFT 48 +#define ID_AA64PFR0_AMU_SHIFT 44 #define ID_AA64PFR0_SVE_SHIFT 32 #define ID_AA64PFR0_RAS_SHIFT 28 #define ID_AA64PFR0_GIC_SHIFT 24 @@ -608,6 +645,7 @@ #define ID_AA64PFR0_EL1_SHIFT 4 #define ID_AA64PFR0_EL0_SHIFT 0 +#define ID_AA64PFR0_AMU 0x1 #define ID_AA64PFR0_SVE 0x1 #define ID_AA64PFR0_RAS_V1 0x1 #define ID_AA64PFR0_FP_NI 0xf @@ -702,6 +740,16 @@ #define ID_AA64DFR0_TRACEVER_SHIFT 4 #define ID_AA64DFR0_DEBUGVER_SHIFT 0 +#define ID_AA64DFR0_PMUVER_8_0 0x1 +#define ID_AA64DFR0_PMUVER_8_1 0x4 +#define ID_AA64DFR0_PMUVER_8_4 0x5 +#define ID_AA64DFR0_PMUVER_8_5 0x6 +#define ID_AA64DFR0_PMUVER_IMP_DEF 0xf + +#define ID_DFR0_PERFMON_SHIFT 24 + +#define ID_DFR0_PERFMON_8_1 0x4 + #define ID_ISAR5_RDM_SHIFT 24 #define ID_ISAR5_CRC32_SHIFT 16 #define ID_ISAR5_SHA2_SHIFT 12 diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h index cbd70d78ef15..0cc835ddfcd1 100644 --- a/arch/arm64/include/asm/topology.h +++ b/arch/arm64/include/asm/topology.h @@ -16,6 +16,15 @@ int pcibus_to_node(struct pci_bus *bus); #include <linux/arch_topology.h> +#ifdef CONFIG_ARM64_AMU_EXTN +/* + * Replace task scheduler's default counter-based + * frequency-invariance scale factor setting. 
+ */ +void topology_scale_freq_tick(void); +#define arch_scale_freq_tick topology_scale_freq_tick +#endif /* CONFIG_ARM64_AMU_EXTN */ + /* Replace task scheduler's default frequency-invariant accounting */ #define arch_scale_freq_capacity topology_get_freq_scale diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index fc6488660f64..4e5b8ee31442 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -21,7 +21,7 @@ obj-y := debug-monitors.o entry.o irq.o fpsimd.o \ smp.o smp_spin_table.o topology.o smccc-call.o \ syscall.o -extra-$(CONFIG_EFI) := efi-entry.o +targets += efi-entry.o OBJCOPYFLAGS := --prefix-symbols=__efistub_ $(obj)/%.stub.o: $(obj)/%.o FORCE diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c index 7832b3216370..4cc581af2d96 100644 --- a/arch/arm64/kernel/armv8_deprecated.c +++ b/arch/arm64/kernel/armv8_deprecated.c @@ -630,7 +630,7 @@ static int __init armv8_deprecated_init(void) register_insn_emulation(&cp15_barrier_ops); if (IS_ENABLED(CONFIG_SETEND_EMULATION)) { - if(system_supports_mixed_endian_el0()) + if (system_supports_mixed_endian_el0()) register_insn_emulation(&setend_ops); else pr_info("setend instruction emulation is not supported on this system\n"); diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index a5bdce8af65b..9981a0a5a87f 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -40,6 +40,10 @@ int main(void) #endif BLANK(); DEFINE(THREAD_CPU_CONTEXT, offsetof(struct task_struct, thread.cpu_context)); +#ifdef CONFIG_ARM64_PTR_AUTH + DEFINE(THREAD_KEYS_USER, offsetof(struct task_struct, thread.keys_user)); + DEFINE(THREAD_KEYS_KERNEL, offsetof(struct task_struct, thread.keys_kernel)); +#endif BLANK(); DEFINE(S_X0, offsetof(struct pt_regs, regs[0])); DEFINE(S_X2, offsetof(struct pt_regs, regs[2])); @@ -88,6 +92,9 @@ int main(void) BLANK(); DEFINE(CPU_BOOT_STACK, offsetof(struct secondary_data, stack)); DEFINE(CPU_BOOT_TASK, offsetof(struct secondary_data, task)); +#ifdef CONFIG_ARM64_PTR_AUTH + DEFINE(CPU_BOOT_PTRAUTH_KEY, offsetof(struct secondary_data, ptrauth_key)); +#endif BLANK(); #ifdef CONFIG_KVM_ARM_HOST DEFINE(VCPU_CONTEXT, offsetof(struct kvm_vcpu, arch.ctxt)); @@ -128,5 +135,14 @@ int main(void) DEFINE(SDEI_EVENT_INTREGS, offsetof(struct sdei_registered_event, interrupted_regs)); DEFINE(SDEI_EVENT_PRIORITY, offsetof(struct sdei_registered_event, priority)); #endif +#ifdef CONFIG_ARM64_PTR_AUTH + DEFINE(PTRAUTH_USER_KEY_APIA, offsetof(struct ptrauth_keys_user, apia)); + DEFINE(PTRAUTH_USER_KEY_APIB, offsetof(struct ptrauth_keys_user, apib)); + DEFINE(PTRAUTH_USER_KEY_APDA, offsetof(struct ptrauth_keys_user, apda)); + DEFINE(PTRAUTH_USER_KEY_APDB, offsetof(struct ptrauth_keys_user, apdb)); + DEFINE(PTRAUTH_USER_KEY_APGA, offsetof(struct ptrauth_keys_user, apga)); + DEFINE(PTRAUTH_KERNEL_KEY_APIA, offsetof(struct ptrauth_keys_kernel, apia)); + BLANK(); +#endif return 0; } diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S index 32c7bf858dd9..38087b4c0432 100644 --- a/arch/arm64/kernel/cpu-reset.S +++ b/arch/arm64/kernel/cpu-reset.S @@ -32,7 +32,7 @@ ENTRY(__cpu_soft_restart) /* Clear sctlr_el1 flags. 
*/ mrs x12, sctlr_el1 - ldr x13, =SCTLR_ELx_FLAGS + mov_q x13, SCTLR_ELx_FLAGS bic x12, x12, x13 pre_disable_mmu_workaround msr sctlr_el1, x12 diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 703ad0a84f99..df56d2295d16 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -11,6 +11,7 @@ #include <asm/cpu.h> #include <asm/cputype.h> #include <asm/cpufeature.h> +#include <asm/kvm_asm.h> #include <asm/smp_plat.h> static bool __maybe_unused @@ -113,13 +114,10 @@ atomic_t arm64_el2_vector_last_slot = ATOMIC_INIT(-1); DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data); #ifdef CONFIG_KVM_INDIRECT_VECTORS -extern char __smccc_workaround_1_smc_start[]; -extern char __smccc_workaround_1_smc_end[]; - static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start, const char *hyp_vecs_end) { - void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K); + void *dst = lm_alias(__bp_harden_hyp_vecs + slot * SZ_2K); int i; for (i = 0; i < SZ_2K; i += 0x80) @@ -163,9 +161,6 @@ static void install_bp_hardening_cb(bp_hardening_cb_t fn, raw_spin_unlock(&bp_lock); } #else -#define __smccc_workaround_1_smc_start NULL -#define __smccc_workaround_1_smc_end NULL - static void install_bp_hardening_cb(bp_hardening_cb_t fn, const char *hyp_vecs_start, const char *hyp_vecs_end) @@ -176,7 +171,7 @@ static void install_bp_hardening_cb(bp_hardening_cb_t fn, #include <linux/arm-smccc.h> -static void call_smc_arch_workaround_1(void) +static void __maybe_unused call_smc_arch_workaround_1(void) { arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL); } @@ -239,11 +234,14 @@ static int detect_harden_bp_fw(void) smccc_end = NULL; break; +#if IS_ENABLED(CONFIG_KVM_ARM_HOST) case SMCCC_CONDUIT_SMC: cb = call_smc_arch_workaround_1; - smccc_start = __smccc_workaround_1_smc_start; - smccc_end = __smccc_workaround_1_smc_end; + smccc_start = __smccc_workaround_1_smc; + smccc_end = __smccc_workaround_1_smc + + __SMCCC_WORKAROUND_1_SMC_SZ; break; +#endif default: return -1; diff --git a/arch/arm64/kernel/cpu_ops.c b/arch/arm64/kernel/cpu_ops.c index 7e07072757af..e133011f64b5 100644 --- a/arch/arm64/kernel/cpu_ops.c +++ b/arch/arm64/kernel/cpu_ops.c @@ -15,10 +15,12 @@ #include <asm/smp_plat.h> extern const struct cpu_operations smp_spin_table_ops; +#ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL extern const struct cpu_operations acpi_parking_protocol_ops; +#endif extern const struct cpu_operations cpu_psci_ops; -const struct cpu_operations *cpu_ops[NR_CPUS] __ro_after_init; +static const struct cpu_operations *cpu_ops[NR_CPUS] __ro_after_init; static const struct cpu_operations *const dt_supported_cpu_ops[] __initconst = { &smp_spin_table_ops, @@ -94,7 +96,7 @@ static const char *__init cpu_read_enable_method(int cpu) /* * Read a cpu's enable method and record it in cpu_ops. 
*/ -int __init cpu_read_ops(int cpu) +int __init init_cpu_ops(int cpu) { const char *enable_method = cpu_read_enable_method(cpu); @@ -109,3 +111,8 @@ int __init cpu_read_ops(int cpu) return 0; } + +const struct cpu_operations *get_cpu_ops(int cpu) +{ + return cpu_ops[cpu]; +} diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 0b6715625cf6..9fac745aa7bb 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -116,6 +116,8 @@ cpufeature_pan_not_uao(const struct arm64_cpu_capabilities *entry, int __unused) static void cpu_enable_cnp(struct arm64_cpu_capabilities const *cap); +static bool __system_matches_cap(unsigned int n); + /* * NOTE: Any changes to the visibility of features should be kept in * sync with the documentation of the CPU feature register ABI. @@ -163,6 +165,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = { ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_DIT_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_AMU_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_RAS_SHIFT, 4, 0), @@ -551,7 +554,7 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new) BUG_ON(!reg); - for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) { + for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) { u64 ftr_mask = arm64_ftr_mask(ftrp); s64 ftr_new = arm64_ftr_value(ftrp, new); @@ -1222,6 +1225,57 @@ static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap, #endif +#ifdef CONFIG_ARM64_AMU_EXTN + +/* + * The "amu_cpus" cpumask only signals that the CPU implementation for the + * flagged CPUs supports the Activity Monitors Unit (AMU) but does not provide + * information regarding all the events that it supports. When a CPU bit is + * set in the cpumask, the user of this feature can only rely on the presence + * of the 4 fixed counters for that CPU. But this does not guarantee that the + * counters are enabled or access to these counters is enabled by code + * executed at higher exception levels (firmware). + */ +static struct cpumask amu_cpus __read_mostly; + +bool cpu_has_amu_feat(int cpu) +{ + return cpumask_test_cpu(cpu, &amu_cpus); +} + +/* Initialize the use of AMU counters for frequency invariance */ +extern void init_cpu_freq_invariance_counters(void); + +static void cpu_amu_enable(struct arm64_cpu_capabilities const *cap) +{ + if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU)) { + pr_info("detected CPU%d: Activity Monitors Unit (AMU)\n", + smp_processor_id()); + cpumask_set_cpu(smp_processor_id(), &amu_cpus); + init_cpu_freq_invariance_counters(); + } +} + +static bool has_amu(const struct arm64_cpu_capabilities *cap, + int __unused) +{ + /* + * The AMU extension is a non-conflicting feature: the kernel can + * safely run a mix of CPUs with and without support for the + * activity monitors extension. Therefore, unconditionally enable + * the capability to allow any late CPU to use the feature. + * + * With this feature unconditionally enabled, the cpu_enable + * function will be called for all CPUs that match the criteria, + * including secondary and hotplugged, marking this feature as + * present on that respective CPU. 
The enable function will also + * print a detection message. + */ + + return true; +} +#endif + #ifdef CONFIG_ARM64_VHE static bool runs_at_el2(const struct arm64_cpu_capabilities *entry, int __unused) { @@ -1316,10 +1370,18 @@ static void cpu_clear_disr(const struct arm64_cpu_capabilities *__unused) #endif /* CONFIG_ARM64_RAS_EXTN */ #ifdef CONFIG_ARM64_PTR_AUTH -static void cpu_enable_address_auth(struct arm64_cpu_capabilities const *cap) +static bool has_address_auth(const struct arm64_cpu_capabilities *entry, + int __unused) { - sysreg_clear_set(sctlr_el1, 0, SCTLR_ELx_ENIA | SCTLR_ELx_ENIB | - SCTLR_ELx_ENDA | SCTLR_ELx_ENDB); + return __system_matches_cap(ARM64_HAS_ADDRESS_AUTH_ARCH) || + __system_matches_cap(ARM64_HAS_ADDRESS_AUTH_IMP_DEF); +} + +static bool has_generic_auth(const struct arm64_cpu_capabilities *entry, + int __unused) +{ + return __system_matches_cap(ARM64_HAS_GENERIC_AUTH_ARCH) || + __system_matches_cap(ARM64_HAS_GENERIC_AUTH_IMP_DEF); } #endif /* CONFIG_ARM64_PTR_AUTH */ @@ -1347,6 +1409,25 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry, } #endif +/* Internal helper functions to match cpu capability type */ +static bool +cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap) +{ + return !!(cap->type & ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU); +} + +static bool +cpucap_late_cpu_permitted(const struct arm64_cpu_capabilities *cap) +{ + return !!(cap->type & ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU); +} + +static bool +cpucap_panic_on_conflict(const struct arm64_cpu_capabilities *cap) +{ + return !!(cap->type & ARM64_CPUCAP_PANIC_ON_CONFLICT); +} + static const struct arm64_cpu_capabilities arm64_features[] = { { .desc = "GIC system register CPU interface", @@ -1499,6 +1580,24 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .cpu_enable = cpu_clear_disr, }, #endif /* CONFIG_ARM64_RAS_EXTN */ +#ifdef CONFIG_ARM64_AMU_EXTN + { + /* + * The feature is enabled by default if CONFIG_ARM64_AMU_EXTN=y. + * Therefore, don't provide .desc as we don't want the detection + * message to be shown until at least one CPU is detected to + * support the feature. 
+ */ + .capability = ARM64_HAS_AMU_EXTN, + .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE, + .matches = has_amu, + .sys_reg = SYS_ID_AA64PFR0_EL1, + .sign = FTR_UNSIGNED, + .field_pos = ID_AA64PFR0_AMU_SHIFT, + .min_field_value = ID_AA64PFR0_AMU, + .cpu_enable = cpu_amu_enable, + }, +#endif /* CONFIG_ARM64_AMU_EXTN */ { .desc = "Data cache clean to the PoU not required for I/D coherence", .capability = ARM64_HAS_CACHE_IDC, @@ -1592,24 +1691,27 @@ static const struct arm64_cpu_capabilities arm64_features[] = { { .desc = "Address authentication (architected algorithm)", .capability = ARM64_HAS_ADDRESS_AUTH_ARCH, - .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .type = ARM64_CPUCAP_BOOT_CPU_FEATURE, .sys_reg = SYS_ID_AA64ISAR1_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64ISAR1_APA_SHIFT, .min_field_value = ID_AA64ISAR1_APA_ARCHITECTED, .matches = has_cpuid_feature, - .cpu_enable = cpu_enable_address_auth, }, { .desc = "Address authentication (IMP DEF algorithm)", .capability = ARM64_HAS_ADDRESS_AUTH_IMP_DEF, - .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .type = ARM64_CPUCAP_BOOT_CPU_FEATURE, .sys_reg = SYS_ID_AA64ISAR1_EL1, .sign = FTR_UNSIGNED, .field_pos = ID_AA64ISAR1_API_SHIFT, .min_field_value = ID_AA64ISAR1_API_IMP_DEF, .matches = has_cpuid_feature, - .cpu_enable = cpu_enable_address_auth, + }, + { + .capability = ARM64_HAS_ADDRESS_AUTH, + .type = ARM64_CPUCAP_BOOT_CPU_FEATURE, + .matches = has_address_auth, }, { .desc = "Generic authentication (architected algorithm)", @@ -1631,6 +1733,11 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .min_field_value = ID_AA64ISAR1_GPI_IMP_DEF, .matches = has_cpuid_feature, }, + { + .capability = ARM64_HAS_GENERIC_AUTH, + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .matches = has_generic_auth, + }, #endif /* CONFIG_ARM64_PTR_AUTH */ #ifdef CONFIG_ARM64_PSEUDO_NMI { @@ -1980,10 +2087,8 @@ static void __init enable_cpu_capabilities(u16 scope_mask) * Run through the list of capabilities to check for conflicts. * If the system has already detected a capability, take necessary * action on this CPU. - * - * Returns "false" on conflicts. */ -static bool verify_local_cpu_caps(u16 scope_mask) +static void verify_local_cpu_caps(u16 scope_mask) { int i; bool cpu_has_cap, system_has_cap; @@ -2028,10 +2133,12 @@ static bool verify_local_cpu_caps(u16 scope_mask) pr_crit("CPU%d: Detected conflict for capability %d (%s), System: %d, CPU: %d\n", smp_processor_id(), caps->capability, caps->desc, system_has_cap, cpu_has_cap); - return false; - } - return true; + if (cpucap_panic_on_conflict(caps)) + cpu_panic_kernel(); + else + cpu_die_early(); + } } /* @@ -2041,12 +2148,8 @@ static bool verify_local_cpu_caps(u16 scope_mask) static void check_early_cpu_features(void) { verify_cpu_asid_bits(); - /* - * Early features are used by the kernel already. If there - * is a conflict, we cannot proceed further. - */ - if (!verify_local_cpu_caps(SCOPE_BOOT_CPU)) - cpu_panic_kernel(); + + verify_local_cpu_caps(SCOPE_BOOT_CPU); } static void @@ -2094,8 +2197,7 @@ static void verify_local_cpu_capabilities(void) * check_early_cpu_features(), as they need to be verified * on all secondary CPUs. 
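The ARM64_HAS_ADDRESS_AUTH and ARM64_HAS_GENERIC_AUTH entries above are meta-capabilities: their matches() callbacks OR together two already-evaluated capabilities through __system_matches_cap() (added further down) rather than cpus_have_const_cap(), because at this point the cpu_hwcaps state backing the const-cap check is not finalised yet. A compressed sketch of the pattern, with hypothetical capability indices:

/*
 * Sketch only: combine two capabilities during the boot-time window
 * in which the safe system register values are known but the
 * SYSTEM_FEATURE cpu_hwcaps may not yet be set. CAP_A and CAP_B stand
 * in for real ARM64_* capability indices.
 */
static bool example_has_either(void)
{
	return __system_matches_cap(CAP_A) || __system_matches_cap(CAP_B);
}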
*/ - if (!verify_local_cpu_caps(SCOPE_ALL & ~SCOPE_BOOT_CPU)) - cpu_die_early(); + verify_local_cpu_caps(SCOPE_ALL & ~SCOPE_BOOT_CPU); verify_local_elf_hwcaps(arm64_elf_hwcaps); @@ -2146,6 +2248,23 @@ bool this_cpu_has_cap(unsigned int n) return false; } +/* + * This helper function is used in a narrow window when, + * - The system wide safe registers are set with all the SMP CPUs and, + * - The SYSTEM_FEATURE cpu_hwcaps may not have been set. + * In all other cases cpus_have_{const_}cap() should be used. + */ +static bool __system_matches_cap(unsigned int n) +{ + if (n < ARM64_NCAPS) { + const struct arm64_cpu_capabilities *cap = cpu_hwcaps_ptrs[n]; + + if (cap) + return cap->matches(cap, SCOPE_SYSTEM); + } + return false; +} + void cpu_set_feature(unsigned int num) { WARN_ON(num >= MAX_CPU_FEATURES); @@ -2218,7 +2337,7 @@ void __init setup_cpu_features(void) static bool __maybe_unused cpufeature_pan_not_uao(const struct arm64_cpu_capabilities *entry, int __unused) { - return (cpus_have_const_cap(ARM64_HAS_PAN) && !cpus_have_const_cap(ARM64_HAS_UAO)); + return (__system_matches_cap(ARM64_HAS_PAN) && !__system_matches_cap(ARM64_HAS_UAO)); } static void __maybe_unused cpu_enable_cnp(struct arm64_cpu_capabilities const *cap) diff --git a/arch/arm64/kernel/cpuidle.c b/arch/arm64/kernel/cpuidle.c index e4d6af2fdec7..b512b5503f6e 100644 --- a/arch/arm64/kernel/cpuidle.c +++ b/arch/arm64/kernel/cpuidle.c @@ -18,11 +18,11 @@ int arm_cpuidle_init(unsigned int cpu) { + const struct cpu_operations *ops = get_cpu_ops(cpu); int ret = -EOPNOTSUPP; - if (cpu_ops[cpu] && cpu_ops[cpu]->cpu_suspend && - cpu_ops[cpu]->cpu_init_idle) - ret = cpu_ops[cpu]->cpu_init_idle(cpu); + if (ops && ops->cpu_suspend && ops->cpu_init_idle) + ret = ops->cpu_init_idle(cpu); return ret; } @@ -37,8 +37,9 @@ int arm_cpuidle_init(unsigned int cpu) int arm_cpuidle_suspend(int index) { int cpu = smp_processor_id(); + const struct cpu_operations *ops = get_cpu_ops(cpu); - return cpu_ops[cpu]->cpu_suspend(index); + return ops->cpu_suspend(index); } #ifdef CONFIG_ACPI diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index fde59981445c..c839b5bf1904 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -175,7 +175,7 @@ NOKPROBE_SYMBOL(el0_pc); static void notrace el0_sp(struct pt_regs *regs, unsigned long esr) { user_exit_irqoff(); - local_daif_restore(DAIF_PROCCTX_NOIRQ); + local_daif_restore(DAIF_PROCCTX); do_sp_pc_abort(regs->sp, esr, regs); } NOKPROBE_SYMBOL(el0_sp); diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S index 7d02f9966d34..833d48c9acb5 100644 --- a/arch/arm64/kernel/entry-ftrace.S +++ b/arch/arm64/kernel/entry-ftrace.S @@ -75,27 +75,27 @@ add x29, sp, #S_STACKFRAME .endm -ENTRY(ftrace_regs_caller) +SYM_CODE_START(ftrace_regs_caller) ftrace_regs_entry 1 b ftrace_common -ENDPROC(ftrace_regs_caller) +SYM_CODE_END(ftrace_regs_caller) -ENTRY(ftrace_caller) +SYM_CODE_START(ftrace_caller) ftrace_regs_entry 0 b ftrace_common -ENDPROC(ftrace_caller) +SYM_CODE_END(ftrace_caller) -ENTRY(ftrace_common) +SYM_CODE_START(ftrace_common) sub x0, x30, #AARCH64_INSN_SIZE // ip (callsite's BL insn) mov x1, x9 // parent_ip (callsite's LR) ldr_l x2, function_trace_op // op mov x3, sp // regs -GLOBAL(ftrace_call) +SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL) bl ftrace_stub #ifdef CONFIG_FUNCTION_GRAPH_TRACER -GLOBAL(ftrace_graph_call) // ftrace_graph_caller(); +SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) // ftrace_graph_caller(); nop // If 
enabled, this will be replaced // "b ftrace_graph_caller" #endif @@ -122,17 +122,17 @@ ftrace_common_return: add sp, sp, #S_FRAME_SIZE + 16 ret x9 -ENDPROC(ftrace_common) +SYM_CODE_END(ftrace_common) #ifdef CONFIG_FUNCTION_GRAPH_TRACER -ENTRY(ftrace_graph_caller) +SYM_CODE_START(ftrace_graph_caller) ldr x0, [sp, #S_PC] sub x0, x0, #AARCH64_INSN_SIZE // ip (callsite's BL insn) add x1, sp, #S_LR // parent_ip (callsite's LR) ldr x2, [sp, #S_FRAME_SIZE] // parent fp (callsite's FP) bl prepare_ftrace_return b ftrace_common_return -ENDPROC(ftrace_graph_caller) +SYM_CODE_END(ftrace_graph_caller) #endif #else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */ @@ -218,7 +218,7 @@ ENDPROC(ftrace_graph_caller) * - tracer function to probe instrumented function's entry, * - ftrace_graph_caller to set up an exit hook */ -ENTRY(_mcount) +SYM_FUNC_START(_mcount) mcount_enter ldr_l x2, ftrace_trace_function @@ -242,7 +242,7 @@ skip_ftrace_call: // } b.ne ftrace_graph_caller // ftrace_graph_caller(); #endif /* CONFIG_FUNCTION_GRAPH_TRACER */ mcount_exit -ENDPROC(_mcount) +SYM_FUNC_END(_mcount) EXPORT_SYMBOL(_mcount) NOKPROBE(_mcount) @@ -253,9 +253,9 @@ NOKPROBE(_mcount) * and later on, NOP to branch to ftrace_caller() when enabled or branch to * NOP when disabled per-function base. */ -ENTRY(_mcount) +SYM_FUNC_START(_mcount) ret -ENDPROC(_mcount) +SYM_FUNC_END(_mcount) EXPORT_SYMBOL(_mcount) NOKPROBE(_mcount) @@ -268,24 +268,24 @@ NOKPROBE(_mcount) * - tracer function to probe instrumented function's entry, * - ftrace_graph_caller to set up an exit hook */ -ENTRY(ftrace_caller) +SYM_FUNC_START(ftrace_caller) mcount_enter mcount_get_pc0 x0 // function's pc mcount_get_lr x1 // function's lr -GLOBAL(ftrace_call) // tracer(pc, lr); +SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL) // tracer(pc, lr); nop // This will be replaced with "bl xxx" // where xxx can be any kind of tracer. #ifdef CONFIG_FUNCTION_GRAPH_TRACER -GLOBAL(ftrace_graph_call) // ftrace_graph_caller(); +SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) // ftrace_graph_caller(); nop // If enabled, this will be replaced // "b ftrace_graph_caller" #endif mcount_exit -ENDPROC(ftrace_caller) +SYM_FUNC_END(ftrace_caller) #endif /* CONFIG_DYNAMIC_FTRACE */ #ifdef CONFIG_FUNCTION_GRAPH_TRACER @@ -298,20 +298,20 @@ ENDPROC(ftrace_caller) * the call stack in order to intercept instrumented function's return path * and run return_to_handler() later on its exit. */ -ENTRY(ftrace_graph_caller) +SYM_FUNC_START(ftrace_graph_caller) mcount_get_pc x0 // function's pc mcount_get_lr_addr x1 // pointer to function's saved lr mcount_get_parent_fp x2 // parent's fp bl prepare_ftrace_return // prepare_ftrace_return(pc, &lr, fp) mcount_exit -ENDPROC(ftrace_graph_caller) +SYM_FUNC_END(ftrace_graph_caller) #endif /* CONFIG_FUNCTION_GRAPH_TRACER */ #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */ -ENTRY(ftrace_stub) +SYM_FUNC_START(ftrace_stub) ret -ENDPROC(ftrace_stub) +SYM_FUNC_END(ftrace_stub) #ifdef CONFIG_FUNCTION_GRAPH_TRACER /* @@ -320,7 +320,7 @@ ENDPROC(ftrace_stub) * Run ftrace_return_to_handler() before going back to parent. * @fp is checked against the value passed by ftrace_graph_caller(). 
*/ -ENTRY(return_to_handler) +SYM_CODE_START(return_to_handler) /* save return value regs */ sub sp, sp, #64 stp x0, x1, [sp] @@ -340,5 +340,5 @@ ENTRY(return_to_handler) add sp, sp, #64 ret -END(return_to_handler) +SYM_CODE_END(return_to_handler) #endif /* CONFIG_FUNCTION_GRAPH_TRACER */ diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 9461d812ae27..ddcde093c433 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -14,6 +14,7 @@ #include <asm/alternative.h> #include <asm/assembler.h> #include <asm/asm-offsets.h> +#include <asm/asm_pointer_auth.h> #include <asm/cpufeature.h> #include <asm/errno.h> #include <asm/esr.h> @@ -177,6 +178,7 @@ alternative_cb_end apply_ssbd 1, x22, x23 + ptrauth_keys_install_kernel tsk, 1, x20, x22, x23 .else add x21, sp, #S_FRAME_SIZE get_current_task tsk @@ -341,6 +343,9 @@ alternative_else_nop_endif msr cntkctl_el1, x1 4: #endif + /* No kernel C function calls after this as user keys are set. */ + ptrauth_keys_install_user tsk, x0, x1, x2 + apply_ssbd 0, x0, x1 .endif @@ -465,7 +470,7 @@ alternative_endif .pushsection ".entry.text", "ax" .align 11 -ENTRY(vectors) +SYM_CODE_START(vectors) kernel_ventry 1, sync_invalid // Synchronous EL1t kernel_ventry 1, irq_invalid // IRQ EL1t kernel_ventry 1, fiq_invalid // FIQ EL1t @@ -492,7 +497,7 @@ ENTRY(vectors) kernel_ventry 0, fiq_invalid, 32 // FIQ 32-bit EL0 kernel_ventry 0, error_invalid, 32 // Error 32-bit EL0 #endif -END(vectors) +SYM_CODE_END(vectors) #ifdef CONFIG_VMAP_STACK /* @@ -534,57 +539,57 @@ __bad_stack: ASM_BUG() .endm -el0_sync_invalid: +SYM_CODE_START_LOCAL(el0_sync_invalid) inv_entry 0, BAD_SYNC -ENDPROC(el0_sync_invalid) +SYM_CODE_END(el0_sync_invalid) -el0_irq_invalid: +SYM_CODE_START_LOCAL(el0_irq_invalid) inv_entry 0, BAD_IRQ -ENDPROC(el0_irq_invalid) +SYM_CODE_END(el0_irq_invalid) -el0_fiq_invalid: +SYM_CODE_START_LOCAL(el0_fiq_invalid) inv_entry 0, BAD_FIQ -ENDPROC(el0_fiq_invalid) +SYM_CODE_END(el0_fiq_invalid) -el0_error_invalid: +SYM_CODE_START_LOCAL(el0_error_invalid) inv_entry 0, BAD_ERROR -ENDPROC(el0_error_invalid) +SYM_CODE_END(el0_error_invalid) #ifdef CONFIG_COMPAT -el0_fiq_invalid_compat: +SYM_CODE_START_LOCAL(el0_fiq_invalid_compat) inv_entry 0, BAD_FIQ, 32 -ENDPROC(el0_fiq_invalid_compat) +SYM_CODE_END(el0_fiq_invalid_compat) #endif -el1_sync_invalid: +SYM_CODE_START_LOCAL(el1_sync_invalid) inv_entry 1, BAD_SYNC -ENDPROC(el1_sync_invalid) +SYM_CODE_END(el1_sync_invalid) -el1_irq_invalid: +SYM_CODE_START_LOCAL(el1_irq_invalid) inv_entry 1, BAD_IRQ -ENDPROC(el1_irq_invalid) +SYM_CODE_END(el1_irq_invalid) -el1_fiq_invalid: +SYM_CODE_START_LOCAL(el1_fiq_invalid) inv_entry 1, BAD_FIQ -ENDPROC(el1_fiq_invalid) +SYM_CODE_END(el1_fiq_invalid) -el1_error_invalid: +SYM_CODE_START_LOCAL(el1_error_invalid) inv_entry 1, BAD_ERROR -ENDPROC(el1_error_invalid) +SYM_CODE_END(el1_error_invalid) /* * EL1 mode handlers. */ .align 6 -el1_sync: +SYM_CODE_START_LOCAL_NOALIGN(el1_sync) kernel_entry 1 mov x0, sp bl el1_sync_handler kernel_exit 1 -ENDPROC(el1_sync) +SYM_CODE_END(el1_sync) .align 6 -el1_irq: +SYM_CODE_START_LOCAL_NOALIGN(el1_irq) kernel_entry 1 gic_prio_irq_setup pmr=x20, tmp=x1 enable_da_f @@ -639,42 +644,42 @@ alternative_else_nop_endif #endif kernel_exit 1 -ENDPROC(el1_irq) +SYM_CODE_END(el1_irq) /* * EL0 mode handlers. 
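The "No kernel C function calls after this" comment above the ptrauth_keys_install_user step deserves a note: with CONFIG_ARM64_PTR_AUTH, compiled C functions sign their return address on entry and authenticate it on return, and both operations must see the same key. A conceptual illustration (not kernel code):

/*
 * Compiled C functions are bracketed by paciasp/autiasp, so
 *
 *  - a function entered before the key switch but returning after it
 *    would authenticate its LR with the wrong key, and
 *  - a function run entirely after the switch would sign kernel
 *    return addresses with the user key, defeating the kernel/user
 *    key separation.
 *
 * Both are avoided by placing the install in the assembly-only tail
 * of the kernel_exit path.
 */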
*/ .align 6 -el0_sync: +SYM_CODE_START_LOCAL_NOALIGN(el0_sync) kernel_entry 0 mov x0, sp bl el0_sync_handler b ret_to_user -ENDPROC(el0_sync) +SYM_CODE_END(el0_sync) #ifdef CONFIG_COMPAT .align 6 -el0_sync_compat: +SYM_CODE_START_LOCAL_NOALIGN(el0_sync_compat) kernel_entry 0, 32 mov x0, sp bl el0_sync_compat_handler b ret_to_user -ENDPROC(el0_sync_compat) +SYM_CODE_END(el0_sync_compat) .align 6 -el0_irq_compat: +SYM_CODE_START_LOCAL_NOALIGN(el0_irq_compat) kernel_entry 0, 32 b el0_irq_naked -ENDPROC(el0_irq_compat) +SYM_CODE_END(el0_irq_compat) -el0_error_compat: +SYM_CODE_START_LOCAL_NOALIGN(el0_error_compat) kernel_entry 0, 32 b el0_error_naked -ENDPROC(el0_error_compat) +SYM_CODE_END(el0_error_compat) #endif .align 6 -el0_irq: +SYM_CODE_START_LOCAL_NOALIGN(el0_irq) kernel_entry 0 el0_irq_naked: gic_prio_irq_setup pmr=x20, tmp=x0 @@ -696,9 +701,9 @@ el0_irq_naked: bl trace_hardirqs_on #endif b ret_to_user -ENDPROC(el0_irq) +SYM_CODE_END(el0_irq) -el1_error: +SYM_CODE_START_LOCAL(el1_error) kernel_entry 1 mrs x1, esr_el1 gic_prio_kentry_setup tmp=x2 @@ -706,9 +711,9 @@ el1_error: mov x0, sp bl do_serror kernel_exit 1 -ENDPROC(el1_error) +SYM_CODE_END(el1_error) -el0_error: +SYM_CODE_START_LOCAL(el0_error) kernel_entry 0 el0_error_naked: mrs x25, esr_el1 @@ -720,7 +725,7 @@ el0_error_naked: bl do_serror enable_da_f b ret_to_user -ENDPROC(el0_error) +SYM_CODE_END(el0_error) /* * Ok, we need to do extra processing, enter the slow path. @@ -832,7 +837,7 @@ alternative_else_nop_endif .endm .align 11 -ENTRY(tramp_vectors) +SYM_CODE_START_NOALIGN(tramp_vectors) .space 0x400 tramp_ventry @@ -844,24 +849,24 @@ ENTRY(tramp_vectors) tramp_ventry 32 tramp_ventry 32 tramp_ventry 32 -END(tramp_vectors) +SYM_CODE_END(tramp_vectors) -ENTRY(tramp_exit_native) +SYM_CODE_START(tramp_exit_native) tramp_exit -END(tramp_exit_native) +SYM_CODE_END(tramp_exit_native) -ENTRY(tramp_exit_compat) +SYM_CODE_START(tramp_exit_compat) tramp_exit 32 -END(tramp_exit_compat) +SYM_CODE_END(tramp_exit_compat) .ltorg .popsection // .entry.tramp.text #ifdef CONFIG_RANDOMIZE_BASE .pushsection ".rodata", "a" .align PAGE_SHIFT - .globl __entry_tramp_data_start -__entry_tramp_data_start: +SYM_DATA_START(__entry_tramp_data_start) .quad vectors +SYM_DATA_END(__entry_tramp_data_start) .popsection // .rodata #endif /* CONFIG_RANDOMIZE_BASE */ #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */ @@ -874,7 +879,7 @@ __entry_tramp_data_start: * Previous and next are guaranteed not to be the same. * */ -ENTRY(cpu_switch_to) +SYM_FUNC_START(cpu_switch_to) mov x10, #THREAD_CPU_CONTEXT add x8, x0, x10 mov x9, sp @@ -895,21 +900,22 @@ ENTRY(cpu_switch_to) ldr lr, [x8] mov sp, x9 msr sp_el0, x1 + ptrauth_keys_install_kernel x1, 1, x8, x9, x10 ret -ENDPROC(cpu_switch_to) +SYM_FUNC_END(cpu_switch_to) NOKPROBE(cpu_switch_to) /* * This is how we return from a fork. 
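cpu_switch_to() above now reprograms the kernel APIA key from the incoming task as part of the register switch. A C-level rendering of what the ptrauth_keys_install_kernel macro amounts to (the field names follow this series; the helper itself is illustrative, and the real work stays in assembly as part of the atomic stack/register switch):

/* Sketch only: program next's kernel pointer-auth key. */
static void example_install_kernel_key(struct task_struct *next)
{
	struct ptrauth_key *k = &next->thread.keys_kernel.apia;

	write_sysreg_s(k->lo, SYS_APIAKEYLO_EL1);
	write_sysreg_s(k->hi, SYS_APIAKEYHI_EL1);
	isb();
}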
*/ -ENTRY(ret_from_fork) +SYM_CODE_START(ret_from_fork) bl schedule_tail cbz x19, 1f // not a kernel thread mov x0, x20 blr x19 1: get_current_task tsk b ret_to_user -ENDPROC(ret_from_fork) +SYM_CODE_END(ret_from_fork) NOKPROBE(ret_from_fork) #ifdef CONFIG_ARM_SDE_INTERFACE @@ -938,7 +944,7 @@ NOKPROBE(ret_from_fork) */ .ltorg .pushsection ".entry.tramp.text", "ax" -ENTRY(__sdei_asm_entry_trampoline) +SYM_CODE_START(__sdei_asm_entry_trampoline) mrs x4, ttbr1_el1 tbz x4, #USER_ASID_BIT, 1f @@ -960,7 +966,7 @@ ENTRY(__sdei_asm_entry_trampoline) ldr x4, =__sdei_asm_handler #endif br x4 -ENDPROC(__sdei_asm_entry_trampoline) +SYM_CODE_END(__sdei_asm_entry_trampoline) NOKPROBE(__sdei_asm_entry_trampoline) /* @@ -970,21 +976,22 @@ NOKPROBE(__sdei_asm_entry_trampoline) * x2: exit_mode * x4: struct sdei_registered_event argument from registration time. */ -ENTRY(__sdei_asm_exit_trampoline) +SYM_CODE_START(__sdei_asm_exit_trampoline) ldr x4, [x4, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)] cbnz x4, 1f tramp_unmap_kernel tmp=x4 1: sdei_handler_exit exit_mode=x2 -ENDPROC(__sdei_asm_exit_trampoline) +SYM_CODE_END(__sdei_asm_exit_trampoline) NOKPROBE(__sdei_asm_exit_trampoline) .ltorg .popsection // .entry.tramp.text #ifdef CONFIG_RANDOMIZE_BASE .pushsection ".rodata", "a" -__sdei_asm_trampoline_next_handler: +SYM_DATA_START(__sdei_asm_trampoline_next_handler) .quad __sdei_asm_handler +SYM_DATA_END(__sdei_asm_trampoline_next_handler) .popsection // .rodata #endif /* CONFIG_RANDOMIZE_BASE */ #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */ @@ -1002,7 +1009,7 @@ __sdei_asm_trampoline_next_handler: * follow SMC-CC. We save (or retrieve) all the registers as the handler may * want them. */ -ENTRY(__sdei_asm_handler) +SYM_CODE_START(__sdei_asm_handler) stp x2, x3, [x1, #SDEI_EVENT_INTREGS + S_PC] stp x4, x5, [x1, #SDEI_EVENT_INTREGS + 16 * 2] stp x6, x7, [x1, #SDEI_EVENT_INTREGS + 16 * 3] @@ -1085,6 +1092,6 @@ alternative_else_nop_endif tramp_alias dst=x5, sym=__sdei_asm_exit_trampoline br x5 #endif -ENDPROC(__sdei_asm_handler) +SYM_CODE_END(__sdei_asm_handler) NOKPROBE(__sdei_asm_handler) #endif /* CONFIG_ARM_SDE_INTERFACE */ diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index 989b1944cb71..57a91032b4c2 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -105,7 +105,7 @@ pe_header: * x24 __primary_switch() .. relocate_kernel() * current RELR displacement */ -ENTRY(stext) +SYM_CODE_START(stext) bl preserve_boot_args bl el2_setup // Drop to EL1, w0=cpu_boot_mode adrp x23, __PHYS_OFFSET @@ -118,14 +118,15 @@ ENTRY(stext) * On return, the CPU will be ready for the MMU to be turned on and * the TCR will have been set. */ + mov x0, #ARM64_CPU_BOOT_PRIMARY bl __cpu_setup // initialise processor b __primary_switch -ENDPROC(stext) +SYM_CODE_END(stext) /* * Preserve the arguments passed by the bootloader in x0 .. x3 */ -preserve_boot_args: +SYM_CODE_START_LOCAL(preserve_boot_args) mov x21, x0 // x21=FDT adr_l x0, boot_args // record the contents of @@ -137,7 +138,7 @@ preserve_boot_args: mov x1, #0x20 // 4 x 8 bytes b __inval_dcache_area // tail call -ENDPROC(preserve_boot_args) +SYM_CODE_END(preserve_boot_args) /* * Macro to create a table entry to the next page. 
@@ -275,7 +276,7 @@ ENDPROC(preserve_boot_args) * - first few MB of the kernel linear mapping to jump to once the MMU has * been enabled */ -__create_page_tables: +SYM_FUNC_START_LOCAL(__create_page_tables) mov x28, lr /* @@ -403,15 +404,14 @@ __create_page_tables: bl __inval_dcache_area ret x28 -ENDPROC(__create_page_tables) - .ltorg +SYM_FUNC_END(__create_page_tables) /* * The following fragment of code is executed with the MMU enabled. * * x0 = __PHYS_OFFSET */ -__primary_switched: +SYM_FUNC_START_LOCAL(__primary_switched) adrp x4, init_thread_union add sp, x4, #THREAD_SIZE adr_l x5, init_task @@ -456,7 +456,14 @@ __primary_switched: mov x29, #0 mov x30, #0 b start_kernel -ENDPROC(__primary_switched) +SYM_FUNC_END(__primary_switched) + + .pushsection ".rodata", "a" +SYM_DATA_START(kimage_vaddr) + .quad _text - TEXT_OFFSET +SYM_DATA_END(kimage_vaddr) +EXPORT_SYMBOL(kimage_vaddr) + .popsection /* * end early head section, begin head code that is also used for @@ -464,10 +471,6 @@ ENDPROC(__primary_switched) */ .section ".idmap.text","awx" -ENTRY(kimage_vaddr) - .quad _text - TEXT_OFFSET -EXPORT_SYMBOL(kimage_vaddr) - /* * If we're fortunate enough to boot at EL2, ensure that the world is * sane before dropping to EL1. @@ -475,7 +478,7 @@ EXPORT_SYMBOL(kimage_vaddr) * Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in w0 if * booted in EL1 or EL2 respectively. */ -ENTRY(el2_setup) +SYM_FUNC_START(el2_setup) msr SPsel, #1 // We want to use SP_EL{1,2} mrs x0, CurrentEL cmp x0, #CurrentEL_EL2 @@ -599,7 +602,7 @@ set_hcr: isb ret -install_el2_stub: +SYM_INNER_LABEL(install_el2_stub, SYM_L_LOCAL) /* * When VHE is not in use, early init of EL2 and EL1 needs to be * done here. @@ -636,13 +639,13 @@ install_el2_stub: msr elr_el2, lr mov w0, #BOOT_CPU_MODE_EL2 // This CPU booted in EL2 eret -ENDPROC(el2_setup) +SYM_FUNC_END(el2_setup) /* * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed * in w0. See arch/arm64/include/asm/virt.h for more info. */ -set_cpu_boot_mode_flag: +SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag) adr_l x1, __boot_cpu_mode cmp w0, #BOOT_CPU_MODE_EL2 b.ne 1f @@ -651,7 +654,7 @@ set_cpu_boot_mode_flag: dmb sy dc ivac, x1 // Invalidate potentially stale cache line ret -ENDPROC(set_cpu_boot_mode_flag) +SYM_FUNC_END(set_cpu_boot_mode_flag) /* * These values are written with the MMU off, but read with the MMU on. @@ -667,15 +670,17 @@ ENDPROC(set_cpu_boot_mode_flag) * This is not in .bss, because we set it sufficiently early that the boot-time * zeroing of .bss would clobber it. */ -ENTRY(__boot_cpu_mode) +SYM_DATA_START(__boot_cpu_mode) .long BOOT_CPU_MODE_EL2 .long BOOT_CPU_MODE_EL1 +SYM_DATA_END(__boot_cpu_mode) /* * The booting CPU updates the failed status @__early_cpu_boot_status, * with MMU turned off. */ -ENTRY(__early_cpu_boot_status) +SYM_DATA_START(__early_cpu_boot_status) .quad 0 +SYM_DATA_END(__early_cpu_boot_status) .popsection @@ -683,7 +688,7 @@ ENTRY(__early_cpu_boot_status) * This provides a "holding pen" for platforms to hold all secondary * cores are held until we're ready for them to initialise. */ -ENTRY(secondary_holding_pen) +SYM_FUNC_START(secondary_holding_pen) bl el2_setup // Drop to EL1, w0=cpu_boot_mode bl set_cpu_boot_mode_flag mrs x0, mpidr_el1 @@ -695,31 +700,32 @@ pen: ldr x4, [x3] b.eq secondary_startup wfe b pen -ENDPROC(secondary_holding_pen) +SYM_FUNC_END(secondary_holding_pen) /* * Secondary entry point that jumps straight into the kernel. Only to * be used where CPUs are brought online dynamically by the kernel. 
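stext above now passes a boot context to __cpu_setup in x0, and the other entry points in this series do the same (ARM64_CPU_BOOT_SECONDARY in secondary_startup below, ARM64_CPU_RUNTIME in cpu_resume), so context-specific initialisation can differ between cold boot and resume. A conceptual C rendering; the real dispatch is assembly, and the case bodies here are placeholders, not the actual behaviour:

/* Illustrative only: the three contexts __cpu_setup can now tell apart. */
static void example_cpu_setup(int boot_context)
{
	switch (boot_context) {
	case ARM64_CPU_BOOT_PRIMARY:	/* cold boot, boot CPU */
	case ARM64_CPU_BOOT_SECONDARY:	/* cold boot, secondary CPU */
		/* build all state from scratch */
		break;
	case ARM64_CPU_RUNTIME:		/* cpu_resume path */
		/* some state is restored rather than re-created */
		break;
	}
}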
*/ -ENTRY(secondary_entry) +SYM_FUNC_START(secondary_entry) bl el2_setup // Drop to EL1 bl set_cpu_boot_mode_flag b secondary_startup -ENDPROC(secondary_entry) +SYM_FUNC_END(secondary_entry) -secondary_startup: +SYM_FUNC_START_LOCAL(secondary_startup) /* * Common entry point for secondary CPUs. */ bl __cpu_secondary_check52bitva + mov x0, #ARM64_CPU_BOOT_SECONDARY bl __cpu_setup // initialise processor adrp x1, swapper_pg_dir bl __enable_mmu ldr x8, =__secondary_switched br x8 -ENDPROC(secondary_startup) +SYM_FUNC_END(secondary_startup) -__secondary_switched: +SYM_FUNC_START_LOCAL(__secondary_switched) adr_l x5, vectors msr vbar_el1, x5 isb @@ -734,13 +740,13 @@ __secondary_switched: mov x29, #0 mov x30, #0 b secondary_start_kernel -ENDPROC(__secondary_switched) +SYM_FUNC_END(__secondary_switched) -__secondary_too_slow: +SYM_FUNC_START_LOCAL(__secondary_too_slow) wfe wfi b __secondary_too_slow -ENDPROC(__secondary_too_slow) +SYM_FUNC_END(__secondary_too_slow) /* * The booting CPU updates the failed status @__early_cpu_boot_status, @@ -772,7 +778,7 @@ ENDPROC(__secondary_too_slow) * Checks if the selected granule size is supported by the CPU. * If it isn't, park the CPU */ -ENTRY(__enable_mmu) +SYM_FUNC_START(__enable_mmu) mrs x2, ID_AA64MMFR0_EL1 ubfx x2, x2, #ID_AA64MMFR0_TGRAN_SHIFT, 4 cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED @@ -796,9 +802,9 @@ ENTRY(__enable_mmu) dsb nsh isb ret -ENDPROC(__enable_mmu) +SYM_FUNC_END(__enable_mmu) -ENTRY(__cpu_secondary_check52bitva) +SYM_FUNC_START(__cpu_secondary_check52bitva) #ifdef CONFIG_ARM64_VA_BITS_52 ldr_l x0, vabits_actual cmp x0, #52 @@ -816,9 +822,9 @@ ENTRY(__cpu_secondary_check52bitva) #endif 2: ret -ENDPROC(__cpu_secondary_check52bitva) +SYM_FUNC_END(__cpu_secondary_check52bitva) -__no_granule_support: +SYM_FUNC_START_LOCAL(__no_granule_support) /* Indicate that this CPU can't boot and is stuck in the kernel */ update_early_cpu_boot_status \ CPU_STUCK_IN_KERNEL | CPU_STUCK_REASON_NO_GRAN, x1, x2 @@ -826,10 +832,10 @@ __no_granule_support: wfe wfi b 1b -ENDPROC(__no_granule_support) +SYM_FUNC_END(__no_granule_support) #ifdef CONFIG_RELOCATABLE -__relocate_kernel: +SYM_FUNC_START_LOCAL(__relocate_kernel) /* * Iterate over each entry in the relocation table, and apply the * relocations in place. @@ -931,10 +937,10 @@ __relocate_kernel: #endif ret -ENDPROC(__relocate_kernel) +SYM_FUNC_END(__relocate_kernel) #endif -__primary_switch: +SYM_FUNC_START_LOCAL(__primary_switch) #ifdef CONFIG_RANDOMIZE_BASE mov x19, x0 // preserve new SCTLR_EL1 value mrs x20, sctlr_el1 // preserve old SCTLR_EL1 value @@ -977,4 +983,4 @@ __primary_switch: ldr x8, =__primary_switched adrp x0, __PHYS_OFFSET br x8 -ENDPROC(__primary_switch) +SYM_FUNC_END(__primary_switch) diff --git a/arch/arm64/kernel/hibernate-asm.S b/arch/arm64/kernel/hibernate-asm.S index 38bcd4d4e43b..6532105b3e32 100644 --- a/arch/arm64/kernel/hibernate-asm.S +++ b/arch/arm64/kernel/hibernate-asm.S @@ -110,8 +110,6 @@ ENTRY(swsusp_arch_suspend_exit) cbz x24, 3f /* Do we need to re-initialise EL2? */ hvc #0 3: ret - - .ltorg ENDPROC(swsusp_arch_suspend_exit) /* diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S index 73d46070b315..e473ead806ed 100644 --- a/arch/arm64/kernel/hyp-stub.S +++ b/arch/arm64/kernel/hyp-stub.S @@ -63,7 +63,7 @@ el1_sync: beq 9f // Nothing to reset! /* Someone called kvm_call_hyp() against the hyp-stub... 
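The .ltorg removals in this series and the `ldr x0, =HVC_STUB_ERR` to `mov_q` conversion just below are two sides of the same cleanup: once no code materialises constants via literal-pool loads, the pool directives can go. Summarised as a reading aid (C-comment form):

/*
 * ldr x0, =CONST    PC-relative load from a literal pool: needs an
 *                   .ltorg somewhere in range to emit the pool, and
 *                   costs a memory access at runtime.
 *
 * mov_q x0, CONST   assembler macro expanding to a mov/movk immediate
 *                   sequence: no pool, no load, and safe in sections
 *                   where pool placement is awkward (idmap and
 *                   trampoline text).
 */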
*/ - ldr x0, =HVC_STUB_ERR + mov_q x0, HVC_STUB_ERR eret 9: mov x0, xzr diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c index dd3ae8081b38..b40c3b0def92 100644 --- a/arch/arm64/kernel/machine_kexec_file.c +++ b/arch/arm64/kernel/machine_kexec_file.c @@ -121,7 +121,7 @@ static int setup_dtb(struct kimage *image, /* add kaslr-seed */ ret = fdt_delprop(dtb, off, FDT_PROP_KASLR_SEED); - if (ret == -FDT_ERR_NOTFOUND) + if (ret == -FDT_ERR_NOTFOUND) ret = 0; else if (ret) goto out; diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index e40b65645c86..4d7879484cec 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -285,6 +285,17 @@ static struct attribute_group armv8_pmuv3_format_attr_group = { #define ARMV8_IDX_COUNTER_LAST(cpu_pmu) \ (ARMV8_IDX_CYCLE_COUNTER + cpu_pmu->num_events - 1) + +/* + * We unconditionally enable ARMv8.5-PMU long event counter support + * (64-bit events) where supported. Indicate if this arm_pmu has long + * event counter support. + */ +static bool armv8pmu_has_long_event(struct arm_pmu *cpu_pmu) +{ + return (cpu_pmu->pmuver >= ID_AA64DFR0_PMUVER_8_5); +} + /* * We must chain two programmable counters for 64 bit events, * except when we have allocated the 64bit cycle counter (for CPU @@ -294,9 +305,11 @@ static struct attribute_group armv8_pmuv3_format_attr_group = { static inline bool armv8pmu_event_is_chained(struct perf_event *event) { int idx = event->hw.idx; + struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); return !WARN_ON(idx < 0) && armv8pmu_event_is_64bit(event) && + !armv8pmu_has_long_event(cpu_pmu) && (idx != ARMV8_IDX_CYCLE_COUNTER); } @@ -345,7 +358,7 @@ static inline void armv8pmu_select_counter(int idx) isb(); } -static inline u32 armv8pmu_read_evcntr(int idx) +static inline u64 armv8pmu_read_evcntr(int idx) { armv8pmu_select_counter(idx); return read_sysreg(pmxevcntr_el0); @@ -362,6 +375,44 @@ static inline u64 armv8pmu_read_hw_counter(struct perf_event *event) return val; } +/* + * The cycle counter is always a 64-bit counter. When ARMV8_PMU_PMCR_LP + * is set the event counters also become 64-bit counters. Unless the + * user has requested a long counter (attr.config1) then we want to + * interrupt upon 32-bit overflow - we achieve this by applying a bias. 
+ */ +static bool armv8pmu_event_needs_bias(struct perf_event *event) +{ + struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); + struct hw_perf_event *hwc = &event->hw; + int idx = hwc->idx; + + if (armv8pmu_event_is_64bit(event)) + return false; + + if (armv8pmu_has_long_event(cpu_pmu) || + idx == ARMV8_IDX_CYCLE_COUNTER) + return true; + + return false; +} + +static u64 armv8pmu_bias_long_counter(struct perf_event *event, u64 value) +{ + if (armv8pmu_event_needs_bias(event)) + value |= GENMASK(63, 32); + + return value; +} + +static u64 armv8pmu_unbias_long_counter(struct perf_event *event, u64 value) +{ + if (armv8pmu_event_needs_bias(event)) + value &= ~GENMASK(63, 32); + + return value; +} + static u64 armv8pmu_read_counter(struct perf_event *event) { struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); @@ -377,10 +428,10 @@ static u64 armv8pmu_read_counter(struct perf_event *event) else value = armv8pmu_read_hw_counter(event); - return value; + return armv8pmu_unbias_long_counter(event, value); } -static inline void armv8pmu_write_evcntr(int idx, u32 value) +static inline void armv8pmu_write_evcntr(int idx, u64 value) { armv8pmu_select_counter(idx); write_sysreg(value, pmxevcntr_el0); @@ -405,20 +456,14 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value) struct hw_perf_event *hwc = &event->hw; int idx = hwc->idx; + value = armv8pmu_bias_long_counter(event, value); + if (!armv8pmu_counter_valid(cpu_pmu, idx)) pr_err("CPU%u writing wrong counter %d\n", smp_processor_id(), idx); - else if (idx == ARMV8_IDX_CYCLE_COUNTER) { - /* - * The cycles counter is really a 64-bit counter. - * When treating it as a 32-bit counter, we only count - * the lower 32 bits, and set the upper 32-bits so that - * we get an interrupt upon 32-bit overflow. 
- */ - if (!armv8pmu_event_is_64bit(event)) - value |= 0xffffffff00000000ULL; + else if (idx == ARMV8_IDX_CYCLE_COUNTER) write_sysreg(value, pmccntr_el0); - } else + else armv8pmu_write_hw_counter(event, value); } @@ -450,86 +495,74 @@ static inline void armv8pmu_write_event_type(struct perf_event *event) } } -static inline int armv8pmu_enable_counter(int idx) +static u32 armv8pmu_event_cnten_mask(struct perf_event *event) { - u32 counter = ARMV8_IDX_TO_COUNTER(idx); - write_sysreg(BIT(counter), pmcntenset_el0); - return idx; + int counter = ARMV8_IDX_TO_COUNTER(event->hw.idx); + u32 mask = BIT(counter); + + if (armv8pmu_event_is_chained(event)) + mask |= BIT(counter - 1); + return mask; +} + +static inline void armv8pmu_enable_counter(u32 mask) +{ + write_sysreg(mask, pmcntenset_el0); } static inline void armv8pmu_enable_event_counter(struct perf_event *event) { struct perf_event_attr *attr = &event->attr; - int idx = event->hw.idx; - u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx)); + u32 mask = armv8pmu_event_cnten_mask(event); - if (armv8pmu_event_is_chained(event)) - counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1)); - - kvm_set_pmu_events(counter_bits, attr); + kvm_set_pmu_events(mask, attr); /* We rely on the hypervisor switch code to enable guest counters */ - if (!kvm_pmu_counter_deferred(attr)) { - armv8pmu_enable_counter(idx); - if (armv8pmu_event_is_chained(event)) - armv8pmu_enable_counter(idx - 1); - } + if (!kvm_pmu_counter_deferred(attr)) + armv8pmu_enable_counter(mask); } -static inline int armv8pmu_disable_counter(int idx) +static inline void armv8pmu_disable_counter(u32 mask) { - u32 counter = ARMV8_IDX_TO_COUNTER(idx); - write_sysreg(BIT(counter), pmcntenclr_el0); - return idx; + write_sysreg(mask, pmcntenclr_el0); } static inline void armv8pmu_disable_event_counter(struct perf_event *event) { - struct hw_perf_event *hwc = &event->hw; struct perf_event_attr *attr = &event->attr; - int idx = hwc->idx; - u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx)); - - if (armv8pmu_event_is_chained(event)) - counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1)); + u32 mask = armv8pmu_event_cnten_mask(event); - kvm_clr_pmu_events(counter_bits); + kvm_clr_pmu_events(mask); /* We rely on the hypervisor switch code to disable guest counters */ - if (!kvm_pmu_counter_deferred(attr)) { - if (armv8pmu_event_is_chained(event)) - armv8pmu_disable_counter(idx - 1); - armv8pmu_disable_counter(idx); - } + if (!kvm_pmu_counter_deferred(attr)) + armv8pmu_disable_counter(mask); } -static inline int armv8pmu_enable_intens(int idx) +static inline void armv8pmu_enable_intens(u32 mask) { - u32 counter = ARMV8_IDX_TO_COUNTER(idx); - write_sysreg(BIT(counter), pmintenset_el1); - return idx; + write_sysreg(mask, pmintenset_el1); } -static inline int armv8pmu_enable_event_irq(struct perf_event *event) +static inline void armv8pmu_enable_event_irq(struct perf_event *event) { - return armv8pmu_enable_intens(event->hw.idx); + u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx); + armv8pmu_enable_intens(BIT(counter)); } -static inline int armv8pmu_disable_intens(int idx) +static inline void armv8pmu_disable_intens(u32 mask) { - u32 counter = ARMV8_IDX_TO_COUNTER(idx); - write_sysreg(BIT(counter), pmintenclr_el1); + write_sysreg(mask, pmintenclr_el1); isb(); /* Clear the overflow flag in case an interrupt is pending. 
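The bias helpers above generalise the old cycle-counter trick: a 32-bit event that lands in a 64-bit hardware counter gets its top 32 bits forced to all-ones on write, so the 64-bit overflow (and its interrupt) fires exactly at the 32-bit boundary, and reads mask the bias back off. A standalone worked example with an assumed period (not driver code):

/* Assumed period of 0x1000 events before overflow. */
u64 bias  = GENMASK(63, 32);
u64 start = (u64)(u32)(0 - 0x1000);	/* 0x00000000fffff000 */
u64 hw    = start | bias;		/* 0xfffffffffffff000 */

hw += 0x1000;	/* the counter ticks 0x1000 times... */
/* hw is now 0: the full 64-bit counter wrapped, raising the IRQ
 * exactly where a true 32-bit counter would have overflowed. */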
*/ - write_sysreg(BIT(counter), pmovsclr_el0); + write_sysreg(mask, pmovsclr_el0); isb(); - - return idx; } -static inline int armv8pmu_disable_event_irq(struct perf_event *event) +static inline void armv8pmu_disable_event_irq(struct perf_event *event) { - return armv8pmu_disable_intens(event->hw.idx); + u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx); + armv8pmu_disable_intens(BIT(counter)); } static inline u32 armv8pmu_getreset_flags(void) @@ -743,7 +776,8 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc, /* * Otherwise use events counters */ - if (armv8pmu_event_is_64bit(event)) + if (armv8pmu_event_is_64bit(event) && + !armv8pmu_has_long_event(cpu_pmu)) return armv8pmu_get_chain_idx(cpuc, cpu_pmu); else return armv8pmu_get_single_idx(cpuc, cpu_pmu); @@ -815,13 +849,11 @@ static int armv8pmu_filter_match(struct perf_event *event) static void armv8pmu_reset(void *info) { struct arm_pmu *cpu_pmu = (struct arm_pmu *)info; - u32 idx, nb_cnt = cpu_pmu->num_events; + u32 pmcr; /* The counter and interrupt enable registers are unknown at reset. */ - for (idx = ARMV8_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx) { - armv8pmu_disable_counter(idx); - armv8pmu_disable_intens(idx); - } + armv8pmu_disable_counter(U32_MAX); + armv8pmu_disable_intens(U32_MAX); /* Clear the counters we flip at guest entry/exit */ kvm_clr_pmu_events(U32_MAX); @@ -830,8 +862,13 @@ static void armv8pmu_reset(void *info) * Initialize & Reset PMNC. Request overflow interrupt for * 64 bit cycle counter but cheat in armv8pmu_write_counter(). */ - armv8pmu_pmcr_write(ARMV8_PMU_PMCR_P | ARMV8_PMU_PMCR_C | - ARMV8_PMU_PMCR_LC); + pmcr = ARMV8_PMU_PMCR_P | ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_LC; + + /* Enable long event counter support where available */ + if (armv8pmu_has_long_event(cpu_pmu)) + pmcr |= ARMV8_PMU_PMCR_LP; + + armv8pmu_pmcr_write(pmcr); } static int __armv8_pmuv3_map_event(struct perf_event *event, @@ -914,6 +951,7 @@ static void __armv8pmu_probe_pmu(void *info) if (pmuver == 0xf || pmuver == 0) return; + cpu_pmu->pmuver = pmuver; probe->present = true; /* Read the nb of CNTx counters supported from PMNC */ @@ -953,7 +991,10 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu) return probe.present ? 0 : -ENODEV; } -static int armv8_pmu_init(struct arm_pmu *cpu_pmu) +static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name, + int (*map_event)(struct perf_event *event), + const struct attribute_group *events, + const struct attribute_group *format) { int ret = armv8pmu_probe_pmu(cpu_pmu); if (ret) @@ -972,144 +1013,127 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->set_event_filter = armv8pmu_set_event_filter; cpu_pmu->filter_match = armv8pmu_filter_match; + cpu_pmu->name = name; + cpu_pmu->map_event = map_event; + cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = events ? + events : &armv8_pmuv3_events_attr_group; + cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = format ? 
+ format : &armv8_pmuv3_format_attr_group; + return 0; } static int armv8_pmuv3_init(struct arm_pmu *cpu_pmu) { - int ret = armv8_pmu_init(cpu_pmu); - if (ret) - return ret; - - cpu_pmu->name = "armv8_pmuv3"; - cpu_pmu->map_event = armv8_pmuv3_map_event; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = - &armv8_pmuv3_events_attr_group; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = - &armv8_pmuv3_format_attr_group; + return armv8_pmu_init(cpu_pmu, "armv8_pmuv3", + armv8_pmuv3_map_event, NULL, NULL); +} - return 0; +static int armv8_a34_pmu_init(struct arm_pmu *cpu_pmu) +{ + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a34", + armv8_pmuv3_map_event, NULL, NULL); } static int armv8_a35_pmu_init(struct arm_pmu *cpu_pmu) { - int ret = armv8_pmu_init(cpu_pmu); - if (ret) - return ret; - - cpu_pmu->name = "armv8_cortex_a35"; - cpu_pmu->map_event = armv8_a53_map_event; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = - &armv8_pmuv3_events_attr_group; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = - &armv8_pmuv3_format_attr_group; - - return 0; + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a35", + armv8_a53_map_event, NULL, NULL); } static int armv8_a53_pmu_init(struct arm_pmu *cpu_pmu) { - int ret = armv8_pmu_init(cpu_pmu); - if (ret) - return ret; - - cpu_pmu->name = "armv8_cortex_a53"; - cpu_pmu->map_event = armv8_a53_map_event; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = - &armv8_pmuv3_events_attr_group; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = - &armv8_pmuv3_format_attr_group; + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a53", + armv8_a53_map_event, NULL, NULL); +} - return 0; +static int armv8_a55_pmu_init(struct arm_pmu *cpu_pmu) +{ + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a55", + armv8_pmuv3_map_event, NULL, NULL); } static int armv8_a57_pmu_init(struct arm_pmu *cpu_pmu) { - int ret = armv8_pmu_init(cpu_pmu); - if (ret) - return ret; - - cpu_pmu->name = "armv8_cortex_a57"; - cpu_pmu->map_event = armv8_a57_map_event; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = - &armv8_pmuv3_events_attr_group; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = - &armv8_pmuv3_format_attr_group; + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a57", + armv8_a57_map_event, NULL, NULL); +} - return 0; +static int armv8_a65_pmu_init(struct arm_pmu *cpu_pmu) +{ + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a65", + armv8_pmuv3_map_event, NULL, NULL); } static int armv8_a72_pmu_init(struct arm_pmu *cpu_pmu) { - int ret = armv8_pmu_init(cpu_pmu); - if (ret) - return ret; - - cpu_pmu->name = "armv8_cortex_a72"; - cpu_pmu->map_event = armv8_a57_map_event; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = - &armv8_pmuv3_events_attr_group; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = - &armv8_pmuv3_format_attr_group; - - return 0; + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a72", + armv8_a57_map_event, NULL, NULL); } static int armv8_a73_pmu_init(struct arm_pmu *cpu_pmu) { - int ret = armv8_pmu_init(cpu_pmu); - if (ret) - return ret; - - cpu_pmu->name = "armv8_cortex_a73"; - cpu_pmu->map_event = armv8_a73_map_event; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = - &armv8_pmuv3_events_attr_group; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = - &armv8_pmuv3_format_attr_group; + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a73", + armv8_a73_map_event, NULL, NULL); +} - return 0; +static int armv8_a75_pmu_init(struct arm_pmu *cpu_pmu) +{ + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a75", + armv8_pmuv3_map_event, NULL, NULL); } -static int 
armv8_thunder_pmu_init(struct arm_pmu *cpu_pmu) +static int armv8_a76_pmu_init(struct arm_pmu *cpu_pmu) { - int ret = armv8_pmu_init(cpu_pmu); - if (ret) - return ret; + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a76", + armv8_pmuv3_map_event, NULL, NULL); +} - cpu_pmu->name = "armv8_cavium_thunder"; - cpu_pmu->map_event = armv8_thunder_map_event; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = - &armv8_pmuv3_events_attr_group; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = - &armv8_pmuv3_format_attr_group; +static int armv8_a77_pmu_init(struct arm_pmu *cpu_pmu) +{ + return armv8_pmu_init(cpu_pmu, "armv8_cortex_a77", + armv8_pmuv3_map_event, NULL, NULL); +} - return 0; +static int armv8_e1_pmu_init(struct arm_pmu *cpu_pmu) +{ + return armv8_pmu_init(cpu_pmu, "armv8_neoverse_e1", + armv8_pmuv3_map_event, NULL, NULL); } -static int armv8_vulcan_pmu_init(struct arm_pmu *cpu_pmu) +static int armv8_n1_pmu_init(struct arm_pmu *cpu_pmu) { - int ret = armv8_pmu_init(cpu_pmu); - if (ret) - return ret; + return armv8_pmu_init(cpu_pmu, "armv8_neoverse_n1", + armv8_pmuv3_map_event, NULL, NULL); +} - cpu_pmu->name = "armv8_brcm_vulcan"; - cpu_pmu->map_event = armv8_vulcan_map_event; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = - &armv8_pmuv3_events_attr_group; - cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = - &armv8_pmuv3_format_attr_group; +static int armv8_thunder_pmu_init(struct arm_pmu *cpu_pmu) +{ + return armv8_pmu_init(cpu_pmu, "armv8_cavium_thunder", + armv8_thunder_map_event, NULL, NULL); +} - return 0; +static int armv8_vulcan_pmu_init(struct arm_pmu *cpu_pmu) +{ + return armv8_pmu_init(cpu_pmu, "armv8_brcm_vulcan", + armv8_vulcan_map_event, NULL, NULL); } static const struct of_device_id armv8_pmu_of_device_ids[] = { {.compatible = "arm,armv8-pmuv3", .data = armv8_pmuv3_init}, + {.compatible = "arm,cortex-a34-pmu", .data = armv8_a34_pmu_init}, {.compatible = "arm,cortex-a35-pmu", .data = armv8_a35_pmu_init}, {.compatible = "arm,cortex-a53-pmu", .data = armv8_a53_pmu_init}, + {.compatible = "arm,cortex-a55-pmu", .data = armv8_a55_pmu_init}, {.compatible = "arm,cortex-a57-pmu", .data = armv8_a57_pmu_init}, + {.compatible = "arm,cortex-a65-pmu", .data = armv8_a65_pmu_init}, {.compatible = "arm,cortex-a72-pmu", .data = armv8_a72_pmu_init}, {.compatible = "arm,cortex-a73-pmu", .data = armv8_a73_pmu_init}, + {.compatible = "arm,cortex-a75-pmu", .data = armv8_a75_pmu_init}, + {.compatible = "arm,cortex-a76-pmu", .data = armv8_a76_pmu_init}, + {.compatible = "arm,cortex-a77-pmu", .data = armv8_a77_pmu_init}, + {.compatible = "arm,neoverse-e1-pmu", .data = armv8_e1_pmu_init}, + {.compatible = "arm,neoverse-n1-pmu", .data = armv8_n1_pmu_init}, {.compatible = "cavium,thunder-pmu", .data = armv8_thunder_pmu_init}, {.compatible = "brcm,vulcan-pmu", .data = armv8_vulcan_pmu_init}, {}, diff --git a/arch/arm64/kernel/pointer_auth.c b/arch/arm64/kernel/pointer_auth.c index c507b584259d..1e77736a4f66 100644 --- a/arch/arm64/kernel/pointer_auth.c +++ b/arch/arm64/kernel/pointer_auth.c @@ -9,7 +9,7 @@ int ptrauth_prctl_reset_keys(struct task_struct *tsk, unsigned long arg) { - struct ptrauth_keys *keys = &tsk->thread.keys_user; + struct ptrauth_keys_user *keys = &tsk->thread.keys_user; unsigned long addr_key_mask = PR_PAC_APIAKEY | PR_PAC_APIBKEY | PR_PAC_APDAKEY | PR_PAC_APDBKEY; unsigned long key_mask = addr_key_mask | PR_PAC_APGAKEY; @@ -18,8 +18,7 @@ int ptrauth_prctl_reset_keys(struct task_struct *tsk, unsigned long arg) return -EINVAL; if (!arg) { - ptrauth_keys_init(keys); - 
ptrauth_keys_switch(keys); + ptrauth_keys_init_user(keys); return 0; } @@ -41,7 +40,5 @@ int ptrauth_prctl_reset_keys(struct task_struct *tsk, unsigned long arg) if (arg & PR_PAC_APGAKEY) get_random_bytes(&keys->apga, sizeof(keys->apga)); - ptrauth_keys_switch(keys); - return 0; } diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index 3e5a6ad66cbe..56be4cbf771f 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -262,7 +262,7 @@ void __show_regs(struct pt_regs *regs) if (!user_mode(regs)) { printk("pc : %pS\n", (void *)regs->pc); - printk("lr : %pS\n", (void *)lr); + printk("lr : %pS\n", (void *)ptrauth_strip_insn_pac(lr)); } else { printk("pc : %016llx\n", regs->pc); printk("lr : %016llx\n", lr); @@ -376,6 +376,8 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long stack_start, */ fpsimd_flush_task_state(p); + ptrauth_thread_init_kernel(p); + if (likely(!(p->flags & PF_KTHREAD))) { *childregs = *current_pt_regs(); childregs->regs[0] = 0; @@ -512,7 +514,6 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev, contextidr_thread_switch(next); entry_task_switch(next); uao_thread_switch(next); - ptrauth_thread_switch(next); ssbs_thread_switch(next); /* diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index cd6e5fa48b9c..b3d3005d9515 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -999,7 +999,7 @@ static struct ptrauth_key pac_key_from_user(__uint128_t ukey) } static void pac_address_keys_to_user(struct user_pac_address_keys *ukeys, - const struct ptrauth_keys *keys) + const struct ptrauth_keys_user *keys) { ukeys->apiakey = pac_key_to_user(&keys->apia); ukeys->apibkey = pac_key_to_user(&keys->apib); @@ -1007,7 +1007,7 @@ static void pac_address_keys_to_user(struct user_pac_address_keys *ukeys, ukeys->apdbkey = pac_key_to_user(&keys->apdb); } -static void pac_address_keys_from_user(struct ptrauth_keys *keys, +static void pac_address_keys_from_user(struct ptrauth_keys_user *keys, const struct user_pac_address_keys *ukeys) { keys->apia = pac_key_from_user(ukeys->apiakey); @@ -1021,7 +1021,7 @@ static int pac_address_keys_get(struct task_struct *target, unsigned int pos, unsigned int count, void *kbuf, void __user *ubuf) { - struct ptrauth_keys *keys = &target->thread.keys_user; + struct ptrauth_keys_user *keys = &target->thread.keys_user; struct user_pac_address_keys user_keys; if (!system_supports_address_auth()) @@ -1038,7 +1038,7 @@ static int pac_address_keys_set(struct task_struct *target, unsigned int pos, unsigned int count, const void *kbuf, const void __user *ubuf) { - struct ptrauth_keys *keys = &target->thread.keys_user; + struct ptrauth_keys_user *keys = &target->thread.keys_user; struct user_pac_address_keys user_keys; int ret; @@ -1056,12 +1056,12 @@ static int pac_address_keys_set(struct task_struct *target, } static void pac_generic_keys_to_user(struct user_pac_generic_keys *ukeys, - const struct ptrauth_keys *keys) + const struct ptrauth_keys_user *keys) { ukeys->apgakey = pac_key_to_user(&keys->apga); } -static void pac_generic_keys_from_user(struct ptrauth_keys *keys, +static void pac_generic_keys_from_user(struct ptrauth_keys_user *keys, const struct user_pac_generic_keys *ukeys) { keys->apga = pac_key_from_user(ukeys->apgakey); @@ -1072,7 +1072,7 @@ static int pac_generic_keys_get(struct task_struct *target, unsigned int pos, unsigned int count, void *kbuf, void __user *ubuf) { - struct ptrauth_keys *keys = &target->thread.keys_user; + struct 
ptrauth_keys_user *keys = &target->thread.keys_user; struct user_pac_generic_keys user_keys; if (!system_supports_generic_auth()) @@ -1089,7 +1089,7 @@ static int pac_generic_keys_set(struct task_struct *target, unsigned int pos, unsigned int count, const void *kbuf, const void __user *ubuf) { - struct ptrauth_keys *keys = &target->thread.keys_user; + struct ptrauth_keys_user *keys = &target->thread.keys_user; struct user_pac_generic_keys user_keys; int ret; diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S index c1d7db71a726..c40ce496c78b 100644 --- a/arch/arm64/kernel/relocate_kernel.S +++ b/arch/arm64/kernel/relocate_kernel.S @@ -41,7 +41,7 @@ ENTRY(arm64_relocate_new_kernel) cmp x0, #CurrentEL_EL2 b.ne 1f mrs x0, sctlr_el2 - ldr x1, =SCTLR_ELx_FLAGS + mov_q x1, SCTLR_ELx_FLAGS bic x0, x0, x1 pre_disable_mmu_workaround msr sctlr_el2, x0 @@ -113,8 +113,6 @@ ENTRY(arm64_relocate_new_kernel) ENDPROC(arm64_relocate_new_kernel) -.ltorg - .align 3 /* To keep the 64-bit values below naturally aligned. */ .Lcopy_end: diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index a34890bf309f..3fd2c11c09fc 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -344,7 +344,7 @@ void __init setup_arch(char **cmdline_p) else psci_acpi_init(); - cpu_read_bootcpu_ops(); + init_bootcpu_ops(); smp_init_cpus(); smp_build_mpidr_hash(); @@ -371,8 +371,10 @@ void __init setup_arch(char **cmdline_p) static inline bool cpu_can_disable(unsigned int cpu) { #ifdef CONFIG_HOTPLUG_CPU - if (cpu_ops[cpu] && cpu_ops[cpu]->cpu_can_disable) - return cpu_ops[cpu]->cpu_can_disable(cpu); + const struct cpu_operations *ops = get_cpu_ops(cpu); + + if (ops && ops->cpu_can_disable) + return ops->cpu_can_disable(cpu); #endif return false; } diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S index f5b04dd8a710..7b2f2e650c44 100644 --- a/arch/arm64/kernel/sleep.S +++ b/arch/arm64/kernel/sleep.S @@ -3,6 +3,7 @@ #include <linux/linkage.h> #include <asm/asm-offsets.h> #include <asm/assembler.h> +#include <asm/smp.h> .text /* @@ -99,6 +100,7 @@ ENDPROC(__cpu_suspend_enter) .pushsection ".idmap.text", "awx" ENTRY(cpu_resume) bl el2_setup // if in EL2 drop to EL1 cleanly + mov x0, #ARM64_CPU_RUNTIME bl __cpu_setup /* enable the MMU early - so we can access sleep_save_stash by va */ adrp x1, swapper_pg_dir diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 5407bf5d98ac..061f60fe452f 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -93,8 +93,10 @@ static inline int op_cpu_kill(unsigned int cpu) */ static int boot_secondary(unsigned int cpu, struct task_struct *idle) { - if (cpu_ops[cpu]->cpu_boot) - return cpu_ops[cpu]->cpu_boot(cpu); + const struct cpu_operations *ops = get_cpu_ops(cpu); + + if (ops->cpu_boot) + return ops->cpu_boot(cpu); return -EOPNOTSUPP; } @@ -112,63 +114,66 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle) */ secondary_data.task = idle; secondary_data.stack = task_stack_page(idle) + THREAD_SIZE; +#if defined(CONFIG_ARM64_PTR_AUTH) + secondary_data.ptrauth_key.apia.lo = idle->thread.keys_kernel.apia.lo; + secondary_data.ptrauth_key.apia.hi = idle->thread.keys_kernel.apia.hi; +#endif update_cpu_boot_status(CPU_MMU_OFF); __flush_dcache_area(&secondary_data, sizeof(secondary_data)); - /* - * Now bring the CPU into our world. 
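Because an incoming secondary executes head.S with the MMU off, __cpu_up() below stages the idle task's kernel APIA key in secondary_data and cleans it to the point of coherency before the kick, exactly as it already does for the stack and task pointers. The shape of that hand-off, reduced to a sketch (the wrapper name is illustrative):

/*
 * Illustrative producer side: anything a MMU-off consumer must see
 * is staged in one structure and pushed out of the cache.
 */
static void example_stage_boot_data(struct task_struct *idle)
{
	secondary_data.task = idle;
#if defined(CONFIG_ARM64_PTR_AUTH)
	secondary_data.ptrauth_key.apia.lo = idle->thread.keys_kernel.apia.lo;
	secondary_data.ptrauth_key.apia.hi = idle->thread.keys_kernel.apia.hi;
#endif
	__flush_dcache_area(&secondary_data, sizeof(secondary_data));
}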
- */ + /* Now bring the CPU into our world */ ret = boot_secondary(cpu, idle); - if (ret == 0) { - /* - * CPU was successfully started, wait for it to come online or - * time out. - */ - wait_for_completion_timeout(&cpu_running, - msecs_to_jiffies(5000)); - - if (!cpu_online(cpu)) { - pr_crit("CPU%u: failed to come online\n", cpu); - ret = -EIO; - } - } else { + if (ret) { pr_err("CPU%u: failed to boot: %d\n", cpu, ret); return ret; } + /* + * CPU was successfully started, wait for it to come online or + * time out. + */ + wait_for_completion_timeout(&cpu_running, + msecs_to_jiffies(5000)); + if (cpu_online(cpu)) + return 0; + + pr_crit("CPU%u: failed to come online\n", cpu); secondary_data.task = NULL; secondary_data.stack = NULL; +#if defined(CONFIG_ARM64_PTR_AUTH) + secondary_data.ptrauth_key.apia.lo = 0; + secondary_data.ptrauth_key.apia.hi = 0; +#endif __flush_dcache_area(&secondary_data, sizeof(secondary_data)); status = READ_ONCE(secondary_data.status); - if (ret && status) { - - if (status == CPU_MMU_OFF) - status = READ_ONCE(__early_cpu_boot_status); + if (status == CPU_MMU_OFF) + status = READ_ONCE(__early_cpu_boot_status); - switch (status & CPU_BOOT_STATUS_MASK) { - default: - pr_err("CPU%u: failed in unknown state : 0x%lx\n", - cpu, status); - cpus_stuck_in_kernel++; - break; - case CPU_KILL_ME: - if (!op_cpu_kill(cpu)) { - pr_crit("CPU%u: died during early boot\n", cpu); - break; - } - pr_crit("CPU%u: may not have shut down cleanly\n", cpu); - /* Fall through */ - case CPU_STUCK_IN_KERNEL: - pr_crit("CPU%u: is stuck in kernel\n", cpu); - if (status & CPU_STUCK_REASON_52_BIT_VA) - pr_crit("CPU%u: does not support 52-bit VAs\n", cpu); - if (status & CPU_STUCK_REASON_NO_GRAN) - pr_crit("CPU%u: does not support %luK granule \n", cpu, PAGE_SIZE / SZ_1K); - cpus_stuck_in_kernel++; + switch (status & CPU_BOOT_STATUS_MASK) { + default: + pr_err("CPU%u: failed in unknown state : 0x%lx\n", + cpu, status); + cpus_stuck_in_kernel++; + break; + case CPU_KILL_ME: + if (!op_cpu_kill(cpu)) { + pr_crit("CPU%u: died during early boot\n", cpu); break; - case CPU_PANIC_KERNEL: - panic("CPU%u detected unsupported configuration\n", cpu); } + pr_crit("CPU%u: may not have shut down cleanly\n", cpu); + /* Fall through */ + case CPU_STUCK_IN_KERNEL: + pr_crit("CPU%u: is stuck in kernel\n", cpu); + if (status & CPU_STUCK_REASON_52_BIT_VA) + pr_crit("CPU%u: does not support 52-bit VAs\n", cpu); + if (status & CPU_STUCK_REASON_NO_GRAN) { + pr_crit("CPU%u: does not support %luK granule\n", + cpu, PAGE_SIZE / SZ_1K); + } + cpus_stuck_in_kernel++; + break; + case CPU_PANIC_KERNEL: + panic("CPU%u detected unsupported configuration\n", cpu); } return ret; @@ -196,6 +201,7 @@ asmlinkage notrace void secondary_start_kernel(void) { u64 mpidr = read_cpuid_mpidr() & MPIDR_HWID_BITMASK; struct mm_struct *mm = &init_mm; + const struct cpu_operations *ops; unsigned int cpu; cpu = task_cpu(current); @@ -227,8 +233,9 @@ asmlinkage notrace void secondary_start_kernel(void) */ check_local_cpu_capabilities(); - if (cpu_ops[cpu]->cpu_postboot) - cpu_ops[cpu]->cpu_postboot(); + ops = get_cpu_ops(cpu); + if (ops->cpu_postboot) + ops->cpu_postboot(); /* * Log the CPU info before it is marked online and might get read. @@ -266,19 +273,21 @@ asmlinkage notrace void secondary_start_kernel(void) #ifdef CONFIG_HOTPLUG_CPU static int op_cpu_disable(unsigned int cpu) { + const struct cpu_operations *ops = get_cpu_ops(cpu); + /* * If we don't have a cpu_die method, abort before we reach the point * of no return. 
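This hunk and the ones around it all follow the same conversion: fetch the ops table once through the new get_cpu_ops() accessor, then test both the table and the optional method before calling, instead of open-coding cpu_ops[cpu] dereferences at every site. The recurring call shape, as a compact sketch (the wrapper is illustrative; cpu_disable stands in for any optional method):

static int example_call_cpu_method(unsigned int cpu)
{
	const struct cpu_operations *ops = get_cpu_ops(cpu);

	/* A CPU may legitimately have no ops table at all. */
	if (!ops || !ops->cpu_disable)
		return -EOPNOTSUPP;

	return ops->cpu_disable(cpu);
}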
CPU0 may not have an cpu_ops, so test for it. */ - if (!cpu_ops[cpu] || !cpu_ops[cpu]->cpu_die) + if (!ops || !ops->cpu_die) return -EOPNOTSUPP; /* * We may need to abort a hot unplug for some other mechanism-specific * reason. */ - if (cpu_ops[cpu]->cpu_disable) - return cpu_ops[cpu]->cpu_disable(cpu); + if (ops->cpu_disable) + return ops->cpu_disable(cpu); return 0; } @@ -314,15 +323,17 @@ int __cpu_disable(void) static int op_cpu_kill(unsigned int cpu) { + const struct cpu_operations *ops = get_cpu_ops(cpu); + /* * If we have no means of synchronising with the dying CPU, then assume * that it is really dead. We can only wait for an arbitrary length of * time and hope that it's dead, so let's skip the wait and just hope. */ - if (!cpu_ops[cpu]->cpu_kill) + if (!ops->cpu_kill) return 0; - return cpu_ops[cpu]->cpu_kill(cpu); + return ops->cpu_kill(cpu); } /* @@ -357,6 +368,7 @@ void __cpu_die(unsigned int cpu) void cpu_die(void) { unsigned int cpu = smp_processor_id(); + const struct cpu_operations *ops = get_cpu_ops(cpu); idle_task_exit(); @@ -370,12 +382,22 @@ void cpu_die(void) * mechanism must perform all required cache maintenance to ensure that * no dirty lines are lost in the process of shutting down the CPU. */ - cpu_ops[cpu]->cpu_die(cpu); + ops->cpu_die(cpu); BUG(); } #endif +static void __cpu_try_die(int cpu) +{ +#ifdef CONFIG_HOTPLUG_CPU + const struct cpu_operations *ops = get_cpu_ops(cpu); + + if (ops && ops->cpu_die) + ops->cpu_die(cpu); +#endif +} + /* * Kill the calling secondary CPU, early in bringup before it is turned * online. @@ -389,12 +411,11 @@ void cpu_die_early(void) /* Mark this CPU absent */ set_cpu_present(cpu, 0); -#ifdef CONFIG_HOTPLUG_CPU - update_cpu_boot_status(CPU_KILL_ME); - /* Check if we can park ourselves */ - if (cpu_ops[cpu] && cpu_ops[cpu]->cpu_die) - cpu_ops[cpu]->cpu_die(cpu); -#endif + if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) { + update_cpu_boot_status(CPU_KILL_ME); + __cpu_try_die(cpu); + } + update_cpu_boot_status(CPU_STUCK_IN_KERNEL); cpu_park_loop(); @@ -488,10 +509,13 @@ static bool __init is_mpidr_duplicate(unsigned int cpu, u64 hwid) */ static int __init smp_cpu_setup(int cpu) { - if (cpu_read_ops(cpu)) + const struct cpu_operations *ops; + + if (init_cpu_ops(cpu)) return -ENODEV; - if (cpu_ops[cpu]->cpu_init(cpu)) + ops = get_cpu_ops(cpu); + if (ops->cpu_init(cpu)) return -ENODEV; set_cpu_possible(cpu, true); @@ -714,6 +738,7 @@ void __init smp_init_cpus(void) void __init smp_prepare_cpus(unsigned int max_cpus) { + const struct cpu_operations *ops; int err; unsigned int cpu; unsigned int this_cpu; @@ -744,10 +769,11 @@ void __init smp_prepare_cpus(unsigned int max_cpus) if (cpu == smp_processor_id()) continue; - if (!cpu_ops[cpu]) + ops = get_cpu_ops(cpu); + if (!ops) continue; - err = cpu_ops[cpu]->cpu_prepare(cpu); + err = ops->cpu_prepare(cpu); if (err) continue; @@ -863,10 +889,8 @@ static void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs) local_irq_disable(); sdei_mask_local_cpu(); -#ifdef CONFIG_HOTPLUG_CPU - if (cpu_ops[cpu]->cpu_die) - cpu_ops[cpu]->cpu_die(cpu); -#endif + if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) + __cpu_try_die(cpu); /* just in case */ cpu_park_loop(); @@ -1059,8 +1083,9 @@ static bool have_cpu_die(void) { #ifdef CONFIG_HOTPLUG_CPU int any_cpu = raw_smp_processor_id(); + const struct cpu_operations *ops = get_cpu_ops(any_cpu); - if (cpu_ops[any_cpu] && cpu_ops[any_cpu]->cpu_die) + if (ops && ops->cpu_die) return true; #endif return false; diff --git a/arch/arm64/kernel/stacktrace.c 
b/arch/arm64/kernel/stacktrace.c index a336cb124320..139679c745bf 100644 --- a/arch/arm64/kernel/stacktrace.c +++ b/arch/arm64/kernel/stacktrace.c @@ -14,6 +14,7 @@ #include <linux/stacktrace.h> #include <asm/irq.h> +#include <asm/pointer_auth.h> #include <asm/stack_pointer.h> #include <asm/stacktrace.h> @@ -86,7 +87,7 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) #ifdef CONFIG_FUNCTION_GRAPH_TRACER if (tsk->ret_stack && - (frame->pc == (unsigned long)return_to_handler)) { + (ptrauth_strip_insn_pac(frame->pc) == (unsigned long)return_to_handler)) { struct ftrace_ret_stack *ret_stack; /* * This is a case where function graph tracer has @@ -101,6 +102,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) } #endif /* CONFIG_FUNCTION_GRAPH_TRACER */ + frame->pc = ptrauth_strip_insn_pac(frame->pc); + /* * Frames created upon entry from EL0 have NULL FP and PC values, so * don't bother reporting these. Frames created by __noreturn functions diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index fa9528dfd0ce..0801a0f3c156 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -14,6 +14,7 @@ #include <linux/acpi.h> #include <linux/arch_topology.h> #include <linux/cacheinfo.h> +#include <linux/cpufreq.h> #include <linux/init.h> #include <linux/percpu.h> @@ -120,4 +121,183 @@ int __init parse_acpi_topology(void) } #endif +#ifdef CONFIG_ARM64_AMU_EXTN +#undef pr_fmt +#define pr_fmt(fmt) "AMU: " fmt + +static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale); +static DEFINE_PER_CPU(u64, arch_const_cycles_prev); +static DEFINE_PER_CPU(u64, arch_core_cycles_prev); +static cpumask_var_t amu_fie_cpus; + +/* Initialize counter reference per-cpu variables for the current CPU */ +void init_cpu_freq_invariance_counters(void) +{ + this_cpu_write(arch_core_cycles_prev, + read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0)); + this_cpu_write(arch_const_cycles_prev, + read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0)); +} + +static int validate_cpu_freq_invariance_counters(int cpu) +{ + u64 max_freq_hz, ratio; + + if (!cpu_has_amu_feat(cpu)) { + pr_debug("CPU%d: counters are not supported.\n", cpu); + return -EINVAL; + } + + if (unlikely(!per_cpu(arch_const_cycles_prev, cpu) || + !per_cpu(arch_core_cycles_prev, cpu))) { + pr_debug("CPU%d: cycle counters are not enabled.\n", cpu); + return -EINVAL; + } + + /* Convert maximum frequency from KHz to Hz and validate */ + max_freq_hz = cpufreq_get_hw_max_freq(cpu) * 1000; + if (unlikely(!max_freq_hz)) { + pr_debug("CPU%d: invalid maximum frequency.\n", cpu); + return -EINVAL; + } + + /* + * Pre-compute the fixed ratio between the frequency of the constant + * counter and the maximum frequency of the CPU. + * + * const_freq + * arch_max_freq_scale = ---------------- * SCHED_CAPACITY_SCALE² + * cpuinfo_max_freq + * + * We use a factor of 2 * SCHED_CAPACITY_SHIFT -> SCHED_CAPACITY_SCALE² + * in order to ensure a good resolution for arch_max_freq_scale for + * very low arch timer frequencies (down to the KHz range which should + * be unlikely). 
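+ * + * As an illustrative example (the numbers are assumptions, not architectural requirements): with a 100MHz constant counter and a 2GHz maximum CPU frequency, + * + * arch_max_freq_scale = (100000000 << (2 * SCHED_CAPACITY_SHIFT)) / 2000000000 = 52428 (with SCHED_CAPACITY_SHIFT = 10) + * + * so the precomputed ratio stays comfortably non-zero; the zero check below only triggers for genuinely degenerate timer rates.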
+ */ + ratio = (u64)arch_timer_get_rate() << (2 * SCHED_CAPACITY_SHIFT); + ratio = div64_u64(ratio, max_freq_hz); + if (!ratio) { + WARN_ONCE(1, "System timer frequency too low.\n"); + return -EINVAL; + } + + per_cpu(arch_max_freq_scale, cpu) = (unsigned long)ratio; + + return 0; +} + +static inline bool +enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus) +{ + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); + + if (!policy) { + pr_debug("CPU%d: No cpufreq policy found.\n", cpu); + return false; + } + + if (cpumask_subset(policy->related_cpus, valid_cpus)) + cpumask_or(amu_fie_cpus, policy->related_cpus, + amu_fie_cpus); + + cpufreq_cpu_put(policy); + + return true; +} + +static DEFINE_STATIC_KEY_FALSE(amu_fie_key); +#define amu_freq_invariant() static_branch_unlikely(&amu_fie_key) + +static int __init init_amu_fie(void) +{ + cpumask_var_t valid_cpus; + bool have_policy = false; + int ret = 0; + int cpu; + + if (!zalloc_cpumask_var(&valid_cpus, GFP_KERNEL)) + return -ENOMEM; + + if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) { + ret = -ENOMEM; + goto free_valid_mask; + } + + for_each_present_cpu(cpu) { + if (validate_cpu_freq_invariance_counters(cpu)) + continue; + cpumask_set_cpu(cpu, valid_cpus); + have_policy |= enable_policy_freq_counters(cpu, valid_cpus); + } + + /* + * If we are not restricted by cpufreq policies, we only enable + * the use of the AMU feature for FIE if all CPUs support AMU. + * Otherwise, enable_policy_freq_counters has already enabled + * policy cpus. + */ + if (!have_policy && cpumask_equal(valid_cpus, cpu_present_mask)) + cpumask_or(amu_fie_cpus, amu_fie_cpus, valid_cpus); + + if (!cpumask_empty(amu_fie_cpus)) { + pr_info("CPUs[%*pbl]: counters will be used for FIE.", + cpumask_pr_args(amu_fie_cpus)); + static_branch_enable(&amu_fie_key); + } + +free_valid_mask: + free_cpumask_var(valid_cpus); + + return ret; +} +late_initcall_sync(init_amu_fie); + +bool arch_freq_counters_available(struct cpumask *cpus) +{ + return amu_freq_invariant() && + cpumask_subset(cpus, amu_fie_cpus); +} + +void topology_scale_freq_tick(void) +{ + u64 prev_core_cnt, prev_const_cnt; + u64 core_cnt, const_cnt, scale; + int cpu = smp_processor_id(); + + if (!amu_freq_invariant()) + return; + + if (!cpumask_test_cpu(cpu, amu_fie_cpus)) + return; + + const_cnt = read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0); + core_cnt = read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0); + prev_const_cnt = this_cpu_read(arch_const_cycles_prev); + prev_core_cnt = this_cpu_read(arch_core_cycles_prev); + + if (unlikely(core_cnt <= prev_core_cnt || + const_cnt <= prev_const_cnt)) + goto store_and_exit; + + /* + * /\core arch_max_freq_scale + * scale = ------- * -------------------- + * /\const SCHED_CAPACITY_SCALE + * + * See validate_cpu_freq_invariance_counters() for details on + * arch_max_freq_scale and the use of SCHED_CAPACITY_SHIFT. 
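+ * + * Continuing the illustrative numbers from the comment above (arch_max_freq_scale = 52428 for a 100MHz constant counter and a 2GHz maximum frequency): a tick over which the core and constant counters advance by 20000000 and 1000000 cycles respectively (the CPU ran flat out) gives + * + * scale = ((20000000 * 52428) >> SCHED_CAPACITY_SHIFT) / 1000000 = 1023 + * + * i.e. roughly SCHED_CAPACITY_SCALE, as expected for a CPU running at its maximum frequency.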
+ */ + scale = core_cnt - prev_core_cnt; + scale *= this_cpu_read(arch_max_freq_scale); + scale = div64_u64(scale >> SCHED_CAPACITY_SHIFT, + const_cnt - prev_const_cnt); + + scale = min_t(unsigned long, scale, SCHED_CAPACITY_SCALE); + this_cpu_write(freq_scale, (unsigned long)scale); + +store_and_exit: + this_cpu_write(arch_core_cycles_prev, core_cnt); + this_cpu_write(arch_const_cycles_prev, const_cnt); +} +#endif /* CONFIG_ARM64_AMU_EXTN */ diff --git a/arch/arm64/kernel/vdso/sigreturn.S b/arch/arm64/kernel/vdso/sigreturn.S index 0723aa398d6e..12324863d5c2 100644 --- a/arch/arm64/kernel/vdso/sigreturn.S +++ b/arch/arm64/kernel/vdso/sigreturn.S @@ -14,7 +14,7 @@ .text nop -ENTRY(__kernel_rt_sigreturn) +SYM_FUNC_START(__kernel_rt_sigreturn) .cfi_startproc .cfi_signal_frame .cfi_def_cfa x29, 0 @@ -23,4 +23,4 @@ ENTRY(__kernel_rt_sigreturn) mov x8, #__NR_rt_sigreturn svc #0 .cfi_endproc -ENDPROC(__kernel_rt_sigreturn) +SYM_FUNC_END(__kernel_rt_sigreturn) diff --git a/arch/arm64/kernel/vdso32/sigreturn.S b/arch/arm64/kernel/vdso32/sigreturn.S index 1a81277c2d09..620524969696 100644 --- a/arch/arm64/kernel/vdso32/sigreturn.S +++ b/arch/arm64/kernel/vdso32/sigreturn.S @@ -10,13 +10,6 @@ #include <asm/asm-offsets.h> #include <asm/unistd.h> -#define ARM_ENTRY(name) \ - ENTRY(name) - -#define ARM_ENDPROC(name) \ - .type name, %function; \ - END(name) - .text .arm @@ -24,39 +17,39 @@ .save {r0-r15} .pad #COMPAT_SIGFRAME_REGS_OFFSET nop -ARM_ENTRY(__kernel_sigreturn_arm) +SYM_FUNC_START(__kernel_sigreturn_arm) mov r7, #__NR_compat_sigreturn svc #0 .fnend -ARM_ENDPROC(__kernel_sigreturn_arm) +SYM_FUNC_END(__kernel_sigreturn_arm) .fnstart .save {r0-r15} .pad #COMPAT_RT_SIGFRAME_REGS_OFFSET nop -ARM_ENTRY(__kernel_rt_sigreturn_arm) +SYM_FUNC_START(__kernel_rt_sigreturn_arm) mov r7, #__NR_compat_rt_sigreturn svc #0 .fnend -ARM_ENDPROC(__kernel_rt_sigreturn_arm) +SYM_FUNC_END(__kernel_rt_sigreturn_arm) .thumb .fnstart .save {r0-r15} .pad #COMPAT_SIGFRAME_REGS_OFFSET nop -ARM_ENTRY(__kernel_sigreturn_thumb) +SYM_FUNC_START(__kernel_sigreturn_thumb) mov r7, #__NR_compat_sigreturn svc #0 .fnend -ARM_ENDPROC(__kernel_sigreturn_thumb) +SYM_FUNC_END(__kernel_sigreturn_thumb) .fnstart .save {r0-r15} .pad #COMPAT_RT_SIGFRAME_REGS_OFFSET nop -ARM_ENTRY(__kernel_rt_sigreturn_thumb) +SYM_FUNC_START(__kernel_rt_sigreturn_thumb) mov r7, #__NR_compat_rt_sigreturn svc #0 .fnend -ARM_ENDPROC(__kernel_rt_sigreturn_thumb) +SYM_FUNC_END(__kernel_rt_sigreturn_thumb) diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S index 160be2b4696d..6e6ed5581eed 100644 --- a/arch/arm64/kvm/hyp-init.S +++ b/arch/arm64/kvm/hyp-init.S @@ -18,7 +18,7 @@ .align 11 -ENTRY(__kvm_hyp_init) +SYM_CODE_START(__kvm_hyp_init) ventry __invalid // Synchronous EL2t ventry __invalid // IRQ EL2t ventry __invalid // FIQ EL2t @@ -60,7 +60,7 @@ alternative_else_nop_endif msr ttbr0_el2, x4 mrs x4, tcr_el1 - ldr x5, =TCR_EL2_MASK + mov_q x5, TCR_EL2_MASK and x4, x4, x5 mov x5, #TCR_EL2_RES1 orr x4, x4, x5 @@ -102,7 +102,7 @@ alternative_else_nop_endif * as well as the EE bit on BE. Drop the A flag since the compiler * is allowed to generate unaligned accesses. */ - ldr x4, =(SCTLR_EL2_RES1 | (SCTLR_ELx_FLAGS & ~SCTLR_ELx_A)) + mov_q x4, (SCTLR_EL2_RES1 | (SCTLR_ELx_FLAGS & ~SCTLR_ELx_A)) CPU_BE( orr x4, x4, #SCTLR_ELx_EE) msr sctlr_el2, x4 isb @@ -117,9 +117,9 @@ CPU_BE( orr x4, x4, #SCTLR_ELx_EE) /* Hello, World! 
*/ eret -ENDPROC(__kvm_hyp_init) +SYM_CODE_END(__kvm_hyp_init) -ENTRY(__kvm_handle_stub_hvc) +SYM_CODE_START(__kvm_handle_stub_hvc) cmp x0, #HVC_SOFT_RESTART b.ne 1f @@ -142,7 +142,7 @@ reset: * case we are coming via HVC_SOFT_RESTART. */ mrs x5, sctlr_el2 - ldr x6, =SCTLR_ELx_FLAGS + mov_q x6, SCTLR_ELx_FLAGS bic x5, x5, x6 // Clear SCTLR_ELx.M, etc. pre_disable_mmu_workaround msr sctlr_el2, x5 @@ -155,11 +155,9 @@ reset: eret 1: /* Bad stub call */ - ldr x0, =HVC_STUB_ERR + mov_q x0, HVC_STUB_ERR eret -ENDPROC(__kvm_handle_stub_hvc) - - .ltorg +SYM_CODE_END(__kvm_handle_stub_hvc) .popsection diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S index c0094d520dff..3c79a1124af2 100644 --- a/arch/arm64/kvm/hyp.S +++ b/arch/arm64/kvm/hyp.S @@ -28,7 +28,7 @@ * and is used to implement hyp stubs in the same way as in * arch/arm64/kernel/hyp_stub.S. */ -ENTRY(__kvm_call_hyp) +SYM_FUNC_START(__kvm_call_hyp) hvc #0 ret -ENDPROC(__kvm_call_hyp) +SYM_FUNC_END(__kvm_call_hyp) diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S index 78ff53225691..5b8ff517ff10 100644 --- a/arch/arm64/kvm/hyp/fpsimd.S +++ b/arch/arm64/kvm/hyp/fpsimd.S @@ -11,12 +11,12 @@ .text .pushsection .hyp.text, "ax" -ENTRY(__fpsimd_save_state) +SYM_FUNC_START(__fpsimd_save_state) fpsimd_save x0, 1 ret -ENDPROC(__fpsimd_save_state) +SYM_FUNC_END(__fpsimd_save_state) -ENTRY(__fpsimd_restore_state) +SYM_FUNC_START(__fpsimd_restore_state) fpsimd_restore x0, 1 ret -ENDPROC(__fpsimd_restore_state) +SYM_FUNC_END(__fpsimd_restore_state) diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S index ffa68d5713f1..c2a13ab3c471 100644 --- a/arch/arm64/kvm/hyp/hyp-entry.S +++ b/arch/arm64/kvm/hyp/hyp-entry.S @@ -180,7 +180,7 @@ el2_error: eret sb -ENTRY(__hyp_do_panic) +SYM_FUNC_START(__hyp_do_panic) mov lr, #(PSR_F_BIT | PSR_I_BIT | PSR_A_BIT | PSR_D_BIT |\ PSR_MODE_EL1h) msr spsr_el2, lr @@ -188,18 +188,19 @@ ENTRY(__hyp_do_panic) msr elr_el2, lr eret sb -ENDPROC(__hyp_do_panic) +SYM_FUNC_END(__hyp_do_panic) -ENTRY(__hyp_panic) +SYM_CODE_START(__hyp_panic) get_host_ctxt x0, x1 b hyp_panic -ENDPROC(__hyp_panic) +SYM_CODE_END(__hyp_panic) .macro invalid_vector label, target = __hyp_panic .align 2 +SYM_CODE_START(\label) \label: b \target -ENDPROC(\label) +SYM_CODE_END(\label) .endm /* None of these should ever happen */ @@ -246,7 +247,7 @@ check_preamble_length 661b, 662b check_preamble_length 661b, 662b .endm -ENTRY(__kvm_hyp_vector) +SYM_CODE_START(__kvm_hyp_vector) invalid_vect el2t_sync_invalid // Synchronous EL2t invalid_vect el2t_irq_invalid // IRQ EL2t invalid_vect el2t_fiq_invalid // FIQ EL2t @@ -266,7 +267,7 @@ ENTRY(__kvm_hyp_vector) valid_vect el1_irq // IRQ 32-bit EL1 invalid_vect el1_fiq_invalid // FIQ 32-bit EL1 valid_vect el1_error // Error 32-bit EL1 -ENDPROC(__kvm_hyp_vector) +SYM_CODE_END(__kvm_hyp_vector) #ifdef CONFIG_KVM_INDIRECT_VECTORS .macro hyp_ventry @@ -311,15 +312,17 @@ alternative_cb_end .endm .align 11 -ENTRY(__bp_harden_hyp_vecs_start) +SYM_CODE_START(__bp_harden_hyp_vecs) .rept BP_HARDEN_EL2_SLOTS generate_vectors .endr -ENTRY(__bp_harden_hyp_vecs_end) +1: .org __bp_harden_hyp_vecs + __BP_HARDEN_HYP_VECS_SZ + .org 1b +SYM_CODE_END(__bp_harden_hyp_vecs) .popsection -ENTRY(__smccc_workaround_1_smc_start) +SYM_CODE_START(__smccc_workaround_1_smc) esb sub sp, sp, #(8 * 4) stp x2, x3, [sp, #(8 * 0)] @@ -329,5 +332,7 @@ ENTRY(__smccc_workaround_1_smc_start) ldp x2, x3, [sp, #(8 * 0)] ldp x0, x1, [sp, #(8 * 2)] add sp, sp, #(8 * 4) -ENTRY(__smccc_workaround_1_smc_end) +1: .org
__smccc_workaround_1_smc + __SMCCC_WORKAROUND_1_SMC_SZ + .org 1b +SYM_CODE_END(__smccc_workaround_1_smc) #endif diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 925086b46136..eaa05c3c7235 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -98,6 +98,18 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu) val = read_sysreg(cpacr_el1); val |= CPACR_EL1_TTA; val &= ~CPACR_EL1_ZEN; + + /* + * With VHE (HCR.E2H == 1), accesses to CPACR_EL1 are routed to + * CPTR_EL2. In general, CPACR_EL1 has the same layout as CPTR_EL2, + * except for some missing controls, such as TAM. + * In this case, CPTR_EL2.TAM has the same position with or without + * VHE (HCR.E2H == 1) which allows us to use here the CPTR_EL2.TAM + * shift value for trapping the AMU accesses. + */ + + val |= CPTR_EL2_TAM; + if (update_fp_enabled(vcpu)) { if (vcpu_has_sve(vcpu)) val |= CPACR_EL1_ZEN; @@ -119,7 +131,7 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu) __activate_traps_common(vcpu); val = CPTR_EL2_DEFAULT; - val |= CPTR_EL2_TTA | CPTR_EL2_TZ; + val |= CPTR_EL2_TTA | CPTR_EL2_TZ | CPTR_EL2_TAM; if (!update_fp_enabled(vcpu)) { val |= CPTR_EL2_TFP; __activate_traps_fpsimd32(vcpu); @@ -127,7 +139,7 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu) write_sysreg(val, cptr_el2); - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { + if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt; isb(); @@ -146,12 +158,12 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) { u64 hcr = vcpu->arch.hcr_el2; - if (cpus_have_const_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM)) + if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM)) hcr |= HCR_TVM; write_sysreg(hcr, hcr_el2); - if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && (hcr & HCR_VSE)) + if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN) && (hcr & HCR_VSE)) write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2); if (has_vhe()) @@ -181,7 +193,7 @@ static void __hyp_text __deactivate_traps_nvhe(void) { u64 mdcr_el2 = read_sysreg(mdcr_el2); - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { + if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { u64 val; /* @@ -328,7 +340,7 @@ static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu) * resolve the IPA using the AT instruction. 
*/ if (!(esr & ESR_ELx_S1PTW) && - (cpus_have_const_cap(ARM64_WORKAROUND_834220) || + (cpus_have_final_cap(ARM64_WORKAROUND_834220) || (esr & ESR_ELx_FSC_TYPE) == FSC_PERM)) { if (!__translate_far_to_hpfar(far, &hpfar)) return false; @@ -498,7 +510,7 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code) if (*exit_code != ARM_EXCEPTION_TRAP) goto exit; - if (cpus_have_const_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) && + if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) && kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 && handle_tx2_tvm(vcpu)) return true; @@ -555,7 +567,7 @@ exit: static inline bool __hyp_text __needs_ssbd_off(struct kvm_vcpu *vcpu) { - if (!cpus_have_const_cap(ARM64_SSBD)) + if (!cpus_have_final_cap(ARM64_SSBD)) return false; return !(vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG); diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c index 7672a978926c..75b1925763f1 100644 --- a/arch/arm64/kvm/hyp/sysreg-sr.c +++ b/arch/arm64/kvm/hyp/sysreg-sr.c @@ -71,7 +71,7 @@ static void __hyp_text __sysreg_save_el2_return_state(struct kvm_cpu_context *ct ctxt->gp_regs.regs.pc = read_sysreg_el2(SYS_ELR); ctxt->gp_regs.regs.pstate = read_sysreg_el2(SYS_SPSR); - if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) + if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN)) ctxt->sys_regs[DISR_EL1] = read_sysreg_s(SYS_VDISR_EL2); } @@ -118,7 +118,7 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt) write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2); write_sysreg(ctxt->sys_regs[CSSELR_EL1], csselr_el1); - if (!cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { + if (!cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], SYS_SCTLR); write_sysreg_el1(ctxt->sys_regs[TCR_EL1], SYS_TCR); } else if (!ctxt->__hyp_running_vcpu) { @@ -149,7 +149,7 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt) write_sysreg(ctxt->sys_regs[PAR_EL1], par_el1); write_sysreg(ctxt->sys_regs[TPIDR_EL1], tpidr_el1); - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE) && + if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE) && ctxt->__hyp_running_vcpu) { /* * Must only be done for host registers, hence the context @@ -194,7 +194,7 @@ __sysreg_restore_el2_return_state(struct kvm_cpu_context *ctxt) write_sysreg_el2(ctxt->gp_regs.regs.pc, SYS_ELR); write_sysreg_el2(pstate, SYS_SPSR); - if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) + if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN)) write_sysreg_s(ctxt->sys_regs[DISR_EL1], SYS_VDISR_EL2); } diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c index 92f560e3e1aa..ceaddbe4279f 100644 --- a/arch/arm64/kvm/hyp/tlb.c +++ b/arch/arm64/kvm/hyp/tlb.c @@ -23,7 +23,7 @@ static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm, local_irq_save(cxt->flags); - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE)) { + if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE)) { /* * For CPUs that are affected by ARM errata 1165522 or 1530923, * we cannot trust stage-1 to be in a correct state at that @@ -63,7 +63,7 @@ static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm, static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm, struct tlb_inv_context *cxt) { - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { + if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { u64 val; /* @@ -103,7 +103,7 @@ static void __hyp_text 
__tlb_switch_to_host_vhe(struct kvm *kvm, write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2); isb(); - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE)) { + if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE)) { /* Restore the registers to what they were */ write_sysreg_el1(cxt->tcr, SYS_TCR); write_sysreg_el1(cxt->sctlr, SYS_SCTLR); @@ -117,7 +117,7 @@ static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm, { write_sysreg(0, vttbr_el2); - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { + if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { /* Ensure write of the host VMID */ isb(); /* Restore the host's TCR_EL1 */ diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 3e909b117f0c..090f46d3add1 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1003,6 +1003,20 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, { SYS_DESC(SYS_PMEVTYPERn_EL0(n)), \ access_pmu_evtyper, reset_unknown, (PMEVTYPER0_EL0 + n), } +static bool access_amu(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + kvm_inject_undefined(vcpu); + + return false; +} + +/* Macro to expand the AMU counter and type registers */ +#define AMU_AMEVCNTR0_EL0(n) { SYS_DESC(SYS_AMEVCNTR0_EL0(n)), access_amu } +#define AMU_AMEVTYPE0_EL0(n) { SYS_DESC(SYS_AMEVTYPE0_EL0(n)), access_amu } +#define AMU_AMEVCNTR1_EL0(n) { SYS_DESC(SYS_AMEVCNTR1_EL0(n)), access_amu } +#define AMU_AMEVTYPE1_EL0(n) { SYS_DESC(SYS_AMEVTYPE1_EL0(n)), access_amu } + static bool trap_ptrauth(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *rd) @@ -1078,13 +1092,25 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu, (u32)r->CRn, (u32)r->CRm, (u32)r->Op2); u64 val = raz ?
0 : read_sanitised_ftr_reg(id); - if (id == SYS_ID_AA64PFR0_EL1 && !vcpu_has_sve(vcpu)) { - val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT); + if (id == SYS_ID_AA64PFR0_EL1) { + if (!vcpu_has_sve(vcpu)) + val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT); + val &= ~(0xfUL << ID_AA64PFR0_AMU_SHIFT); } else if (id == SYS_ID_AA64ISAR1_EL1 && !vcpu_has_ptrauth(vcpu)) { val &= ~((0xfUL << ID_AA64ISAR1_APA_SHIFT) | (0xfUL << ID_AA64ISAR1_API_SHIFT) | (0xfUL << ID_AA64ISAR1_GPA_SHIFT) | (0xfUL << ID_AA64ISAR1_GPI_SHIFT)); + } else if (id == SYS_ID_AA64DFR0_EL1) { + /* Limit guests to PMUv3 for ARMv8.1 */ + val = cpuid_feature_cap_perfmon_field(val, + ID_AA64DFR0_PMUVER_SHIFT, + ID_AA64DFR0_PMUVER_8_1); + } else if (id == SYS_ID_DFR0_EL1) { + /* Limit guests to PMUv3 for ARMv8.1 */ + val = cpuid_feature_cap_perfmon_field(val, + ID_DFR0_PERFMON_SHIFT, + ID_DFR0_PERFMON_8_1); } return val; @@ -1565,6 +1591,79 @@ static const struct sys_reg_desc sys_reg_descs[] = { { SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 }, { SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 }, + { SYS_DESC(SYS_AMCR_EL0), access_amu }, + { SYS_DESC(SYS_AMCFGR_EL0), access_amu }, + { SYS_DESC(SYS_AMCGCR_EL0), access_amu }, + { SYS_DESC(SYS_AMUSERENR_EL0), access_amu }, + { SYS_DESC(SYS_AMCNTENCLR0_EL0), access_amu }, + { SYS_DESC(SYS_AMCNTENSET0_EL0), access_amu }, + { SYS_DESC(SYS_AMCNTENCLR1_EL0), access_amu }, + { SYS_DESC(SYS_AMCNTENSET1_EL0), access_amu }, + AMU_AMEVCNTR0_EL0(0), + AMU_AMEVCNTR0_EL0(1), + AMU_AMEVCNTR0_EL0(2), + AMU_AMEVCNTR0_EL0(3), + AMU_AMEVCNTR0_EL0(4), + AMU_AMEVCNTR0_EL0(5), + AMU_AMEVCNTR0_EL0(6), + AMU_AMEVCNTR0_EL0(7), + AMU_AMEVCNTR0_EL0(8), + AMU_AMEVCNTR0_EL0(9), + AMU_AMEVCNTR0_EL0(10), + AMU_AMEVCNTR0_EL0(11), + AMU_AMEVCNTR0_EL0(12), + AMU_AMEVCNTR0_EL0(13), + AMU_AMEVCNTR0_EL0(14), + AMU_AMEVCNTR0_EL0(15), + AMU_AMEVTYPE0_EL0(0), + AMU_AMEVTYPE0_EL0(1), + AMU_AMEVTYPE0_EL0(2), + AMU_AMEVTYPE0_EL0(3), + AMU_AMEVTYPE0_EL0(4), + AMU_AMEVTYPE0_EL0(5), + AMU_AMEVTYPE0_EL0(6), + AMU_AMEVTYPE0_EL0(7), + AMU_AMEVTYPE0_EL0(8), + AMU_AMEVTYPE0_EL0(9), + AMU_AMEVTYPE0_EL0(10), + AMU_AMEVTYPE0_EL0(11), + AMU_AMEVTYPE0_EL0(12), + AMU_AMEVTYPE0_EL0(13), + AMU_AMEVTYPE0_EL0(14), + AMU_AMEVTYPE0_EL0(15), + AMU_AMEVCNTR1_EL0(0), + AMU_AMEVCNTR1_EL0(1), + AMU_AMEVCNTR1_EL0(2), + AMU_AMEVCNTR1_EL0(3), + AMU_AMEVCNTR1_EL0(4), + AMU_AMEVCNTR1_EL0(5), + AMU_AMEVCNTR1_EL0(6), + AMU_AMEVCNTR1_EL0(7), + AMU_AMEVCNTR1_EL0(8), + AMU_AMEVCNTR1_EL0(9), + AMU_AMEVCNTR1_EL0(10), + AMU_AMEVCNTR1_EL0(11), + AMU_AMEVCNTR1_EL0(12), + AMU_AMEVCNTR1_EL0(13), + AMU_AMEVCNTR1_EL0(14), + AMU_AMEVCNTR1_EL0(15), + AMU_AMEVTYPE1_EL0(0), + AMU_AMEVTYPE1_EL0(1), + AMU_AMEVTYPE1_EL0(2), + AMU_AMEVTYPE1_EL0(3), + AMU_AMEVTYPE1_EL0(4), + AMU_AMEVTYPE1_EL0(5), + AMU_AMEVTYPE1_EL0(6), + AMU_AMEVTYPE1_EL0(7), + AMU_AMEVTYPE1_EL0(8), + AMU_AMEVTYPE1_EL0(9), + AMU_AMEVTYPE1_EL0(10), + AMU_AMEVTYPE1_EL0(11), + AMU_AMEVTYPE1_EL0(12), + AMU_AMEVTYPE1_EL0(13), + AMU_AMEVTYPE1_EL0(14), + AMU_AMEVTYPE1_EL0(15), + { SYS_DESC(SYS_CNTP_TVAL_EL0), access_arch_timer }, { SYS_DESC(SYS_CNTP_CTL_EL0), access_arch_timer }, { SYS_DESC(SYS_CNTP_CVAL_EL0), access_arch_timer }, diff --git a/arch/arm64/lib/csum.c b/arch/arm64/lib/csum.c index 1f82c66b32ea..60eccae2abad 100644 --- a/arch/arm64/lib/csum.c +++ b/arch/arm64/lib/csum.c @@ -124,3 +124,30 @@ unsigned int do_csum(const unsigned char *buff, int len) return sum >> 16; } + +__sum16 csum_ipv6_magic(const struct in6_addr *saddr, + const struct in6_addr *daddr, + __u32 len, __u8 proto, __wsum csum) +{ + 
__uint128_t src, dst; + u64 sum = (__force u64)csum; + + src = *(const __uint128_t *)saddr->s6_addr; + dst = *(const __uint128_t *)daddr->s6_addr; + + sum += (__force u32)htonl(len); +#ifdef __LITTLE_ENDIAN + sum += (u32)proto << 24; +#else + sum += proto; +#endif + src += (src >> 64) | (src << 64); + dst += (dst >> 64) | (dst << 64); + + sum = accumulate(sum, src >> 64); + sum = accumulate(sum, dst >> 64); + + sum += ((sum >> 32) | (sum << 32)); + return csum_fold((__force __wsum)(sum >> 32)); +} +EXPORT_SYMBOL(csum_ipv6_magic); diff --git a/arch/arm64/lib/strcmp.S b/arch/arm64/lib/strcmp.S index 4767540d1b94..4e79566726c8 100644 --- a/arch/arm64/lib/strcmp.S +++ b/arch/arm64/lib/strcmp.S @@ -186,7 +186,7 @@ CPU_LE( rev data2, data2 ) * as carry-propagation can corrupt the upper bits if the trailing * bytes in the string contain 0x01. * However, if there is no NUL byte in the dword, we can generate - * the result directly. We ca not just subtract the bytes as the + * the result directly. We cannot just subtract the bytes as the * MSB might be significant. */ CPU_BE( cbnz has_nul, 1f ) diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c index d89bb22589f6..9b26f9a88724 100644 --- a/arch/arm64/mm/context.c +++ b/arch/arm64/mm/context.c @@ -6,6 +6,7 @@ * Copyright (C) 2012 ARM Ltd. */ +#include <linux/bitfield.h> #include <linux/bitops.h> #include <linux/sched.h> #include <linux/slab.h> @@ -254,10 +255,37 @@ switch_mm_fastpath: /* Errata workaround post TTBRx_EL1 update. */ asmlinkage void post_ttbr_update_workaround(void) { + if (!IS_ENABLED(CONFIG_CAVIUM_ERRATUM_27456)) + return; + asm(ALTERNATIVE("nop; nop; nop", "ic iallu; dsb nsh; isb", - ARM64_WORKAROUND_CAVIUM_27456, - CONFIG_CAVIUM_ERRATUM_27456)); + ARM64_WORKAROUND_CAVIUM_27456)); +} + +void cpu_do_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm) +{ + unsigned long ttbr1 = read_sysreg(ttbr1_el1); + unsigned long asid = ASID(mm); + unsigned long ttbr0 = phys_to_ttbr(pgd_phys); + + /* Skip CNP for the reserved ASID */ + if (system_supports_cnp() && asid) + ttbr0 |= TTBR_CNP_BIT; + + /* SW PAN needs a copy of the ASID in TTBR0 for entry */ + if (IS_ENABLED(CONFIG_ARM64_SW_TTBR0_PAN)) + ttbr0 |= FIELD_PREP(TTBR_ASID_MASK, asid); + + /* Set ASID in TTBR1 since TCR.A1 is set */ + ttbr1 &= ~TTBR_ASID_MASK; + ttbr1 |= FIELD_PREP(TTBR_ASID_MASK, asid); + + write_sysreg(ttbr1, ttbr1_el1); + isb(); + write_sysreg(ttbr0, ttbr0_el1); + isb(); + post_ttbr_update_workaround(); } static int asids_update_limit(void) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 128f70852bf3..9b08f7c7e6f0 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -17,6 +17,7 @@ #include <linux/mman.h> #include <linux/nodemask.h> #include <linux/memblock.h> +#include <linux/memory.h> #include <linux/fs.h> #include <linux/io.h> #include <linux/mm.h> @@ -724,6 +725,312 @@ int kern_addr_valid(unsigned long addr) return pfn_valid(pte_pfn(pte)); } + +#ifdef CONFIG_MEMORY_HOTPLUG +static void free_hotplug_page_range(struct page *page, size_t size) +{ + WARN_ON(PageReserved(page)); + free_pages((unsigned long)page_address(page), get_order(size)); +} + +static void free_hotplug_pgtable_page(struct page *page) +{ + free_hotplug_page_range(page, PAGE_SIZE); +} + +static bool pgtable_range_aligned(unsigned long start, unsigned long end, + unsigned long floor, unsigned long ceiling, + unsigned long mask) +{ + start &= mask; + if (start < floor) + return false; + + if (ceiling) { + ceiling &= mask; + if (!ceiling) + return false; + } + + if (end 
- 1 > ceiling - 1) + return false; + return true; +} + +static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr, + unsigned long end, bool free_mapped) +{ + pte_t *ptep, pte; + + do { + ptep = pte_offset_kernel(pmdp, addr); + pte = READ_ONCE(*ptep); + if (pte_none(pte)) + continue; + + WARN_ON(!pte_present(pte)); + pte_clear(&init_mm, addr, ptep); + flush_tlb_kernel_range(addr, addr + PAGE_SIZE); + if (free_mapped) + free_hotplug_page_range(pte_page(pte), PAGE_SIZE); + } while (addr += PAGE_SIZE, addr < end); +} + +static void unmap_hotplug_pmd_range(pud_t *pudp, unsigned long addr, + unsigned long end, bool free_mapped) +{ + unsigned long next; + pmd_t *pmdp, pmd; + + do { + next = pmd_addr_end(addr, end); + pmdp = pmd_offset(pudp, addr); + pmd = READ_ONCE(*pmdp); + if (pmd_none(pmd)) + continue; + + WARN_ON(!pmd_present(pmd)); + if (pmd_sect(pmd)) { + pmd_clear(pmdp); + + /* + * One TLBI should be sufficient here as the PMD_SIZE + * range is mapped with a single block entry. + */ + flush_tlb_kernel_range(addr, addr + PAGE_SIZE); + if (free_mapped) + free_hotplug_page_range(pmd_page(pmd), + PMD_SIZE); + continue; + } + WARN_ON(!pmd_table(pmd)); + unmap_hotplug_pte_range(pmdp, addr, next, free_mapped); + } while (addr = next, addr < end); +} + +static void unmap_hotplug_pud_range(p4d_t *p4dp, unsigned long addr, + unsigned long end, bool free_mapped) +{ + unsigned long next; + pud_t *pudp, pud; + + do { + next = pud_addr_end(addr, end); + pudp = pud_offset(p4dp, addr); + pud = READ_ONCE(*pudp); + if (pud_none(pud)) + continue; + + WARN_ON(!pud_present(pud)); + if (pud_sect(pud)) { + pud_clear(pudp); + + /* + * One TLBI should be sufficient here as the PUD_SIZE + * range is mapped with a single block entry. + */ + flush_tlb_kernel_range(addr, addr + PAGE_SIZE); + if (free_mapped) + free_hotplug_page_range(pud_page(pud), + PUD_SIZE); + continue; + } + WARN_ON(!pud_table(pud)); + unmap_hotplug_pmd_range(pudp, addr, next, free_mapped); + } while (addr = next, addr < end); +} + +static void unmap_hotplug_p4d_range(pgd_t *pgdp, unsigned long addr, + unsigned long end, bool free_mapped) +{ + unsigned long next; + p4d_t *p4dp, p4d; + + do { + next = p4d_addr_end(addr, end); + p4dp = p4d_offset(pgdp, addr); + p4d = READ_ONCE(*p4dp); + if (p4d_none(p4d)) + continue; + + WARN_ON(!p4d_present(p4d)); + unmap_hotplug_pud_range(p4dp, addr, next, free_mapped); + } while (addr = next, addr < end); +} + +static void unmap_hotplug_range(unsigned long addr, unsigned long end, + bool free_mapped) +{ + unsigned long next; + pgd_t *pgdp, pgd; + + do { + next = pgd_addr_end(addr, end); + pgdp = pgd_offset_k(addr); + pgd = READ_ONCE(*pgdp); + if (pgd_none(pgd)) + continue; + + WARN_ON(!pgd_present(pgd)); + unmap_hotplug_p4d_range(pgdp, addr, next, free_mapped); + } while (addr = next, addr < end); +} + +static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr, + unsigned long end, unsigned long floor, + unsigned long ceiling) +{ + pte_t *ptep, pte; + unsigned long i, start = addr; + + do { + ptep = pte_offset_kernel(pmdp, addr); + pte = READ_ONCE(*ptep); + + /* + * This is just a sanity check here which verifies that + * pte clearing has been done by earlier unmap loops. + */ + WARN_ON(!pte_none(pte)); + } while (addr += PAGE_SIZE, addr < end); + + if (!pgtable_range_aligned(start, end, floor, ceiling, PMD_MASK)) + return; + + /* + * Check whether we can free the pte page if the rest of the + * entries are empty. Overlaps with other regions have been + * handled by the floor/ceiling check.
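+ * + * As a sketch with made-up addresses: if this pte page spans a 2M region that extends below 'floor', some of its entries may still be in use by a neighbouring mapping that this call does not own; the PMD_MASK alignment test against floor/ceiling above rejects such a page even when every entry inspected here happens to be empty.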
+ */ + ptep = pte_offset_kernel(pmdp, 0UL); + for (i = 0; i < PTRS_PER_PTE; i++) { + if (!pte_none(READ_ONCE(ptep[i]))) + return; + } + + pmd_clear(pmdp); + __flush_tlb_kernel_pgtable(start); + free_hotplug_pgtable_page(virt_to_page(ptep)); +} + +static void free_empty_pmd_table(pud_t *pudp, unsigned long addr, + unsigned long end, unsigned long floor, + unsigned long ceiling) +{ + pmd_t *pmdp, pmd; + unsigned long i, next, start = addr; + + do { + next = pmd_addr_end(addr, end); + pmdp = pmd_offset(pudp, addr); + pmd = READ_ONCE(*pmdp); + if (pmd_none(pmd)) + continue; + + WARN_ON(!pmd_present(pmd) || !pmd_table(pmd) || pmd_sect(pmd)); + free_empty_pte_table(pmdp, addr, next, floor, ceiling); + } while (addr = next, addr < end); + + if (CONFIG_PGTABLE_LEVELS <= 2) + return; + + if (!pgtable_range_aligned(start, end, floor, ceiling, PUD_MASK)) + return; + + /* + * Check whether we can free the pmd page if the rest of the + * entries are empty. Overlaps with other regions have been + * handled by the floor/ceiling check. + */ + pmdp = pmd_offset(pudp, 0UL); + for (i = 0; i < PTRS_PER_PMD; i++) { + if (!pmd_none(READ_ONCE(pmdp[i]))) + return; + } + + pud_clear(pudp); + __flush_tlb_kernel_pgtable(start); + free_hotplug_pgtable_page(virt_to_page(pmdp)); +} + +static void free_empty_pud_table(p4d_t *p4dp, unsigned long addr, + unsigned long end, unsigned long floor, + unsigned long ceiling) +{ + pud_t *pudp, pud; + unsigned long i, next, start = addr; + + do { + next = pud_addr_end(addr, end); + pudp = pud_offset(p4dp, addr); + pud = READ_ONCE(*pudp); + if (pud_none(pud)) + continue; + + WARN_ON(!pud_present(pud) || !pud_table(pud) || pud_sect(pud)); + free_empty_pmd_table(pudp, addr, next, floor, ceiling); + } while (addr = next, addr < end); + + if (CONFIG_PGTABLE_LEVELS <= 3) + return; + + if (!pgtable_range_aligned(start, end, floor, ceiling, PGDIR_MASK)) + return; + + /* + * Check whether we can free the pud page if the rest of the + * entries are empty. Overlaps with other regions have been + * handled by the floor/ceiling check.
+ */ + pudp = pud_offset(p4dp, 0UL); + for (i = 0; i < PTRS_PER_PUD; i++) { + if (!pud_none(READ_ONCE(pudp[i]))) + return; + } + + p4d_clear(p4dp); + __flush_tlb_kernel_pgtable(start); + free_hotplug_pgtable_page(virt_to_page(pudp)); +} + +static void free_empty_p4d_table(pgd_t *pgdp, unsigned long addr, + unsigned long end, unsigned long floor, + unsigned long ceiling) +{ + unsigned long next; + p4d_t *p4dp, p4d; + + do { + next = p4d_addr_end(addr, end); + p4dp = p4d_offset(pgdp, addr); + p4d = READ_ONCE(*p4dp); + if (p4d_none(p4d)) + continue; + + WARN_ON(!p4d_present(p4d)); + free_empty_pud_table(p4dp, addr, next, floor, ceiling); + } while (addr = next, addr < end); +} + +static void free_empty_tables(unsigned long addr, unsigned long end, + unsigned long floor, unsigned long ceiling) +{ + unsigned long next; + pgd_t *pgdp, pgd; + + do { + next = pgd_addr_end(addr, end); + pgdp = pgd_offset_k(addr); + pgd = READ_ONCE(*pgdp); + if (pgd_none(pgd)) + continue; + + WARN_ON(!pgd_present(pgd)); + free_empty_p4d_table(pgdp, addr, next, floor, ceiling); + } while (addr = next, addr < end); +} +#endif + #ifdef CONFIG_SPARSEMEM_VMEMMAP #if !ARM64_SWAPPER_USES_SECTION_MAPS int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, @@ -771,6 +1078,12 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, void vmemmap_free(unsigned long start, unsigned long end, struct vmem_altmap *altmap) { +#ifdef CONFIG_MEMORY_HOTPLUG + WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END)); + + unmap_hotplug_range(start, end, true); + free_empty_tables(start, end, VMEMMAP_START, VMEMMAP_END); +#endif } #endif /* CONFIG_SPARSEMEM_VMEMMAP */ @@ -1049,10 +1362,21 @@ int p4d_free_pud_page(p4d_t *p4d, unsigned long addr) } #ifdef CONFIG_MEMORY_HOTPLUG +static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) +{ + unsigned long end = start + size; + + WARN_ON(pgdir != init_mm.pgd); + WARN_ON((start < PAGE_OFFSET) || (end > PAGE_END)); + + unmap_hotplug_range(start, end, false); + free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); +} + int arch_add_memory(int nid, u64 start, u64 size, struct mhp_restrictions *restrictions) { - int flags = 0; + int ret, flags = 0; if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; @@ -1062,22 +1386,59 @@ int arch_add_memory(int nid, u64 start, u64 size, memblock_clear_nomap(start, size); - return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT, + ret = __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT, restrictions); + if (ret) + __remove_pgd_mapping(swapper_pg_dir, + __phys_to_virt(start), size); + return ret; } + void arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; - /* - * FIXME: Cleanup page tables (also in arch_add_memory() in case - * adding fails). Until then, this function should only be used - * during memory hotplug (adding memory), not for memory - * unplug. ARCH_ENABLE_MEMORY_HOTREMOVE must not be - * unlocked yet. - */ __remove_pages(start_pfn, nr_pages, altmap); + __remove_pgd_mapping(swapper_pg_dir, __phys_to_virt(start), size); +} + +/* + * This memory hotplug notifier helps prevent boot memory from being + * inadvertently removed as it blocks pfn range offlining process in + * __offline_pages(). Hence this prevents both offlining as well as + * removal process for boot memory which is initially always online. 
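+ * For example, attempting to offline such a section from userspace (echo 0 > /sys/devices/system/memory/memoryN/online, with N standing for any early section) is now refused: the notifier returns NOTIFY_BAD for early sections, making __offline_pages() back out before anything is torn down.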
+ * In future if and when boot memory could be removed, this notifier + * should be dropped and free_hotplug_page_range() should handle any + * reserved pages allocated during boot. + */ +static int prevent_bootmem_remove_notifier(struct notifier_block *nb, + unsigned long action, void *data) +{ + struct mem_section *ms; + struct memory_notify *arg = data; + unsigned long end_pfn = arg->start_pfn + arg->nr_pages; + unsigned long pfn = arg->start_pfn; + + if (action != MEM_GOING_OFFLINE) + return NOTIFY_OK; + + for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) { + ms = __pfn_to_section(pfn); + if (early_section(ms)) + return NOTIFY_BAD; + } + return NOTIFY_OK; +} + +static struct notifier_block prevent_bootmem_remove_nb = { + .notifier_call = prevent_bootmem_remove_notifier, +}; + +static int __init prevent_bootmem_remove_init(void) +{ + return register_memory_notifier(&prevent_bootmem_remove_nb); } +device_initcall(prevent_bootmem_remove_init); #endif diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S index aafed6902411..197a9ba2d5ea 100644 --- a/arch/arm64/mm/proc.S +++ b/arch/arm64/mm/proc.S @@ -11,11 +11,13 @@ #include <linux/linkage.h> #include <asm/assembler.h> #include <asm/asm-offsets.h> +#include <asm/asm_pointer_auth.h> #include <asm/hwcap.h> #include <asm/pgtable.h> #include <asm/pgtable-hwdef.h> #include <asm/cpufeature.h> #include <asm/alternative.h> +#include <asm/smp.h> #ifdef CONFIG_ARM64_64K_PAGES #define TCR_TG_FLAGS TCR_TG0_64K | TCR_TG1_64K @@ -131,45 +133,19 @@ alternative_endif ubfx x11, x11, #1, #1 msr oslar_el1, x11 reset_pmuserenr_el0 x0 // Disable PMU access from EL0 + reset_amuserenr_el0 x0 // Disable AMU access from EL0 alternative_if ARM64_HAS_RAS_EXTN msr_s SYS_DISR_EL1, xzr alternative_else_nop_endif + ptrauth_keys_install_kernel x14, 0, x1, x2, x3 isb ret SYM_FUNC_END(cpu_do_resume) .popsection #endif -/* - * cpu_do_switch_mm(pgd_phys, tsk) - * - * Set the translation table base pointer to be pgd_phys. - * - * - pgd_phys - physical address of new TTB - */ -SYM_FUNC_START(cpu_do_switch_mm) - mrs x2, ttbr1_el1 - mmid x1, x1 // get mm->context.id - phys_to_ttbr x3, x0 - -alternative_if ARM64_HAS_CNP - cbz x1, 1f // skip CNP for reserved ASID - orr x3, x3, #TTBR_CNP_BIT -1: -alternative_else_nop_endif -#ifdef CONFIG_ARM64_SW_TTBR0_PAN - bfi x3, x1, #48, #16 // set the ASID field in TTBR0 -#endif - bfi x2, x1, #48, #16 // set the ASID - msr ttbr1_el1, x2 // in TTBR1 (since TCR.A1 is set) - isb - msr ttbr0_el1, x3 // now update TTBR0 - isb - b post_ttbr_update_workaround // Back to C code... -SYM_FUNC_END(cpu_do_switch_mm) - .pushsection ".idmap.text", "awx" .macro __idmap_cpu_set_reserved_ttbr1, tmp1, tmp2 @@ -408,35 +384,37 @@ SYM_FUNC_END(idmap_kpti_install_ng_mappings) /* * __cpu_setup * - * Initialise the processor for turning the MMU on. Return in x0 the - * value of the SCTLR_EL1 register. + * Initialise the processor for turning the MMU on. + * + * Input: + * x0 with a flag ARM64_CPU_BOOT_PRIMARY/ARM64_CPU_BOOT_SECONDARY/ARM64_CPU_RUNTIME. + * Output: + * Return in x0 the value of the SCTLR_EL1 register. 
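+ * + * The flag is expected to be set up by the caller: head.S passes ARM64_CPU_BOOT_PRIMARY or ARM64_CPU_BOOT_SECONDARY on the boot paths, and cpu_resume passes ARM64_CPU_RUNTIME, which is what the ptrauth setup below keys off.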
*/ .pushsection ".idmap.text", "awx" SYM_FUNC_START(__cpu_setup) tlbi vmalle1 // Invalidate local TLB dsb nsh - mov x0, #3 << 20 - msr cpacr_el1, x0 // Enable FP/ASIMD - mov x0, #1 << 12 // Reset mdscr_el1 and disable - msr mdscr_el1, x0 // access to the DCC from EL0 + mov x1, #3 << 20 + msr cpacr_el1, x1 // Enable FP/ASIMD + mov x1, #1 << 12 // Reset mdscr_el1 and disable + msr mdscr_el1, x1 // access to the DCC from EL0 isb // Unmask debug exceptions now, enable_dbg // since this is per-cpu - reset_pmuserenr_el0 x0 // Disable PMU access from EL0 + reset_pmuserenr_el0 x1 // Disable PMU access from EL0 + reset_amuserenr_el0 x1 // Disable AMU access from EL0 + /* * Memory region attributes */ mov_q x5, MAIR_EL1_SET msr mair_el1, x5 /* - * Prepare SCTLR - */ - mov_q x0, SCTLR_EL1_SET - /* * Set/prepare TCR and TTBR. We use 512GB (39-bit) address range for * both user and kernel. */ - ldr x10, =TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \ + mov_q x10, TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \ TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \ TCR_TBI0 | TCR_A1 | TCR_KASAN_FLAGS tcr_clear_errata_bits x10, x9, x5 @@ -468,5 +446,51 @@ SYM_FUNC_START(__cpu_setup) 1: #endif /* CONFIG_ARM64_HW_AFDBM */ msr tcr_el1, x10 + mov x1, x0 + /* + * Prepare SCTLR + */ + mov_q x0, SCTLR_EL1_SET + +#ifdef CONFIG_ARM64_PTR_AUTH + /* No ptrauth setup for run time cpus */ + cmp x1, #ARM64_CPU_RUNTIME + b.eq 3f + + /* Check if the CPU supports ptrauth */ + mrs x2, id_aa64isar1_el1 + ubfx x2, x2, #ID_AA64ISAR1_APA_SHIFT, #8 + cbz x2, 3f + + /* + * The primary cpu keys are reset here and can be + * re-initialised with some proper values later. + */ + msr_s SYS_APIAKEYLO_EL1, xzr + msr_s SYS_APIAKEYHI_EL1, xzr + + /* Just enable ptrauth for primary cpu */ + cmp x1, #ARM64_CPU_BOOT_PRIMARY + b.eq 2f + + /* if !system_supports_address_auth() then skip enable */ +alternative_if_not ARM64_HAS_ADDRESS_AUTH + b 3f +alternative_else_nop_endif + + /* Install ptrauth key for secondary cpus */ + adr_l x2, secondary_data + ldr x3, [x2, #CPU_BOOT_TASK] // get secondary_data.task + cbz x3, 2f // check for slow booting cpus + ldp x3, x4, [x2, #CPU_BOOT_PTRAUTH_KEY] + msr_s SYS_APIAKEYLO_EL1, x3 + msr_s SYS_APIAKEYHI_EL1, x4 + +2: /* Enable ptrauth instructions */ + ldr x2, =SCTLR_ELx_ENIA | SCTLR_ELx_ENIB | \ + SCTLR_ELx_ENDA | SCTLR_ELx_ENDB + orr x0, x0, x2 +3: +#endif ret // return to head.S SYM_FUNC_END(__cpu_setup) diff --git a/arch/arm64/mm/ptdump_debugfs.c b/arch/arm64/mm/ptdump_debugfs.c index 1f2eae3e988b..d29d722ec3ec 100644 --- a/arch/arm64/mm/ptdump_debugfs.c +++ b/arch/arm64/mm/ptdump_debugfs.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 #include <linux/debugfs.h> +#include <linux/memory_hotplug.h> #include <linux/seq_file.h> #include <asm/ptdump.h> @@ -7,7 +8,10 @@ static int ptdump_show(struct seq_file *m, void *v) { struct ptdump_info *info = m->private; + + get_online_mems(); ptdump_walk(m, info); + put_online_mems(); return 0; } DEFINE_SHOW_ATTRIBUTE(ptdump); diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index e5d691cf824c..4d0a0038b476 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -21,6 +21,10 @@ #include <linux/sched.h> #include <linux/smp.h> +__weak bool arch_freq_counters_available(struct cpumask *cpus) +{ + return false; +} DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE; void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq, @@ -29,6 +33,14 @@ void arch_set_freq_scale(struct cpumask 
*cpus, unsigned long cur_freq, unsigned long scale; int i; + /* + * If the use of counters for FIE is enabled, just return as we don't + * want to update the scale factor with information from CPUFREQ. + * Instead the scale factor will be updated from arch_scale_freq_tick. + */ + if (arch_freq_counters_available(cpus)) + return; + scale = (cur_freq << SCHED_CAPACITY_SHIFT) / max_freq; for_each_cpu(i, cpus) diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index d53f4c7ccaae..2204a444e801 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -889,6 +889,17 @@ static int arch_timer_starting_cpu(unsigned int cpu) return 0; } +static int validate_timer_rate(void) +{ + if (!arch_timer_rate) + return -EINVAL; + + /* Arch timer frequency < 1MHz can cause trouble */ + WARN_ON(arch_timer_rate < 1000000); + + return 0; +} + /* * For historical reasons, when probing with DT we use whichever (non-zero) * rate was probed first, and don't verify that others match. If the first node @@ -904,7 +915,7 @@ static void arch_timer_of_configure_rate(u32 rate, struct device_node *np) arch_timer_rate = rate; /* Check the timer frequency. */ - if (arch_timer_rate == 0) + if (validate_timer_rate()) pr_warn("frequency not available\n"); } @@ -1598,9 +1609,10 @@ static int __init arch_timer_acpi_init(struct acpi_table_header *table) * CNTFRQ value. This *must* be correct. */ arch_timer_rate = arch_timer_get_cntfrq(); - if (!arch_timer_rate) { + ret = validate_timer_rate(); + if (ret) { pr_err(FW_BUG "frequency not available.\n"); - return -EINVAL; + return ret; } arch_timer_uses_ppi = arch_timer_select_ppi(); diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 808874bccf4a..045f9fe157ce 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1733,6 +1733,26 @@ unsigned int cpufreq_quick_get_max(unsigned int cpu) } EXPORT_SYMBOL(cpufreq_quick_get_max); +/** + * cpufreq_get_hw_max_freq - get the max hardware frequency of the CPU + * @cpu: CPU number + * + * The default return value is the max_freq field of cpuinfo. 
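+ * + * The definition below is deliberately __weak: an architecture with a better source for the hardware maximum (for example one where boost frequencies are not reflected in cpuinfo) can override it.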
+ */ +__weak unsigned int cpufreq_get_hw_max_freq(unsigned int cpu) +{ + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); + unsigned int ret_freq = 0; + + if (policy) { + ret_freq = policy->cpuinfo.max_freq; + cpufreq_cpu_put(policy); + } + + return ret_freq; +} +EXPORT_SYMBOL(cpufreq_get_hw_max_freq); + static unsigned int __cpufreq_get(struct cpufreq_policy *policy) { if (unlikely(policy_is_inactive(policy))) diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c index a479023fa036..334c8be0c11f 100644 --- a/drivers/firmware/arm_sdei.c +++ b/drivers/firmware/arm_sdei.c @@ -267,26 +267,19 @@ static struct sdei_event *sdei_event_create(u32 event_num, event->private_registered = regs; } - if (sdei_event_find(event_num)) { - kfree(event->registered); - kfree(event); - event = ERR_PTR(-EBUSY); - } else { - spin_lock(&sdei_list_lock); - list_add(&event->list, &sdei_list); - spin_unlock(&sdei_list_lock); - } + spin_lock(&sdei_list_lock); + list_add(&event->list, &sdei_list); + spin_unlock(&sdei_list_lock); return event; } -static void sdei_event_destroy(struct sdei_event *event) +static void sdei_event_destroy_llocked(struct sdei_event *event) { lockdep_assert_held(&sdei_events_lock); + lockdep_assert_held(&sdei_list_lock); - spin_lock(&sdei_list_lock); list_del(&event->list); - spin_unlock(&sdei_list_lock); if (event->type == SDEI_EVENT_TYPE_SHARED) kfree(event->registered); @@ -296,6 +289,13 @@ static void sdei_event_destroy(struct sdei_event *event) kfree(event); } +static void sdei_event_destroy(struct sdei_event *event) +{ + spin_lock(&sdei_list_lock); + sdei_event_destroy_llocked(event); + spin_unlock(&sdei_list_lock); +} + static int sdei_api_get_version(u64 *version) { return invoke_sdei_fn(SDEI_1_0_FN_SDEI_VERSION, 0, 0, 0, 0, 0, version); @@ -412,14 +412,19 @@ int sdei_event_enable(u32 event_num) return -ENOENT; } - spin_lock(&sdei_list_lock); - event->reenable = true; - spin_unlock(&sdei_list_lock); + cpus_read_lock(); if (event->type == SDEI_EVENT_TYPE_SHARED) err = sdei_api_event_enable(event->event_num); else err = sdei_do_cross_call(_local_event_enable, event); + + if (!err) { + spin_lock(&sdei_list_lock); + event->reenable = true; + spin_unlock(&sdei_list_lock); + } + cpus_read_unlock(); mutex_unlock(&sdei_events_lock); return err; @@ -491,11 +496,6 @@ static int _sdei_event_unregister(struct sdei_event *event) { lockdep_assert_held(&sdei_events_lock); - spin_lock(&sdei_list_lock); - event->reregister = false; - event->reenable = false; - spin_unlock(&sdei_list_lock); - if (event->type == SDEI_EVENT_TYPE_SHARED) return sdei_api_event_unregister(event->event_num); @@ -518,6 +518,11 @@ int sdei_event_unregister(u32 event_num) break; } + spin_lock(&sdei_list_lock); + event->reregister = false; + event->reenable = false; + spin_unlock(&sdei_list_lock); + err = _sdei_event_unregister(event); if (err) break; @@ -585,26 +590,15 @@ static int _sdei_event_register(struct sdei_event *event) lockdep_assert_held(&sdei_events_lock); - spin_lock(&sdei_list_lock); - event->reregister = true; - spin_unlock(&sdei_list_lock); - if (event->type == SDEI_EVENT_TYPE_SHARED) return sdei_api_event_register(event->event_num, sdei_entry_point, event->registered, SDEI_EVENT_REGISTER_RM_ANY, 0); - err = sdei_do_cross_call(_local_event_register, event); - if (err) { - spin_lock(&sdei_list_lock); - event->reregister = false; - event->reenable = false; - spin_unlock(&sdei_list_lock); - + if (err) sdei_do_cross_call(_local_event_unregister, event); - } return err; } @@ -632,12 +626,18 @@ 
int sdei_event_register(u32 event_num, sdei_event_callback *cb, void *arg) break; } + cpus_read_lock(); err = _sdei_event_register(event); if (err) { sdei_event_destroy(event); pr_warn("Failed to register event %u: %d\n", event_num, err); + } else { + spin_lock(&sdei_list_lock); + event->reregister = true; + spin_unlock(&sdei_list_lock); } + cpus_read_unlock(); } while (0); mutex_unlock(&sdei_events_lock); @@ -645,16 +645,17 @@ int sdei_event_register(u32 event_num, sdei_event_callback *cb, void *arg) } EXPORT_SYMBOL(sdei_event_register); -static int sdei_reregister_event(struct sdei_event *event) +static int sdei_reregister_event_llocked(struct sdei_event *event) { int err; lockdep_assert_held(&sdei_events_lock); + lockdep_assert_held(&sdei_list_lock); err = _sdei_event_register(event); if (err) { pr_err("Failed to re-register event %u\n", event->event_num); - sdei_event_destroy(event); + sdei_event_destroy_llocked(event); return err; } @@ -683,7 +684,7 @@ static int sdei_reregister_shared(void) continue; if (event->reregister) { - err = sdei_reregister_event(event); + err = sdei_reregister_event_llocked(event); if (err) break; } diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c index de87693cf557..cc92bc3ed820 100644 --- a/drivers/misc/lkdtm/bugs.c +++ b/drivers/misc/lkdtm/bugs.c @@ -378,3 +378,39 @@ void lkdtm_DOUBLE_FAULT(void) pr_err("XFAIL: this test is ia32-only\n"); #endif } + +#ifdef CONFIG_ARM64_PTR_AUTH +static noinline void change_pac_parameters(void) +{ + /* Reset the keys of current task */ + ptrauth_thread_init_kernel(current); + ptrauth_thread_switch_kernel(current); +} + +#define CORRUPT_PAC_ITERATE 10 +noinline void lkdtm_CORRUPT_PAC(void) +{ + int i; + + if (!system_supports_address_auth()) { + pr_err("FAIL: arm64 pointer authentication feature not present\n"); + return; + } + + pr_info("Change the PAC parameters to force function return failure\n"); + /* + * The PAC is a hash value computed from the input keys, the return + * address and the stack pointer. As the PAC has only a few bits, there + * is a chance of collision, so iterate a few times to reduce the + * collision probability. + */ + for (i = 0; i < CORRUPT_PAC_ITERATE; i++) + change_pac_parameters(); + + pr_err("FAIL: %s test failed.
Kernel may be unstable from here\n", __func__); +} +#else /* !CONFIG_ARM64_PTR_AUTH */ +noinline void lkdtm_CORRUPT_PAC(void) +{ + pr_err("FAIL: arm64 pointer authentication config disabled\n"); +} +#endif diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c index ee0d6e721441..5ce4ac8c06fc 100644 --- a/drivers/misc/lkdtm/core.c +++ b/drivers/misc/lkdtm/core.c @@ -116,6 +116,7 @@ static const struct crashtype crashtypes[] = { CRASHTYPE(STACK_GUARD_PAGE_LEADING), CRASHTYPE(STACK_GUARD_PAGE_TRAILING), CRASHTYPE(UNSET_SMEP), + CRASHTYPE(CORRUPT_PAC), CRASHTYPE(UNALIGNED_LOAD_STORE_WRITE), CRASHTYPE(OVERWRITE_ALLOCATION), CRASHTYPE(WRITE_AFTER_FREE), diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h index c56d23e37643..8d13d0176624 100644 --- a/drivers/misc/lkdtm/lkdtm.h +++ b/drivers/misc/lkdtm/lkdtm.h @@ -31,6 +31,7 @@ void lkdtm_UNSET_SMEP(void); #ifdef CONFIG_X86_32 void lkdtm_DOUBLE_FAULT(void); #endif +void lkdtm_CORRUPT_PAC(void); /* lkdtm_heap.c */ void __init lkdtm_heap_init(void); diff --git a/drivers/perf/arm-ccn.c b/drivers/perf/arm-ccn.c index fea354d6fb29..d50edef91f59 100644 --- a/drivers/perf/arm-ccn.c +++ b/drivers/perf/arm-ccn.c @@ -328,15 +328,15 @@ static ssize_t arm_ccn_pmu_event_show(struct device *dev, struct arm_ccn_pmu_event, attr); ssize_t res; - res = snprintf(buf, PAGE_SIZE, "type=0x%x", event->type); + res = scnprintf(buf, PAGE_SIZE, "type=0x%x", event->type); if (event->event) - res += snprintf(buf + res, PAGE_SIZE - res, ",event=0x%x", + res += scnprintf(buf + res, PAGE_SIZE - res, ",event=0x%x", event->event); if (event->def) - res += snprintf(buf + res, PAGE_SIZE - res, ",%s", + res += scnprintf(buf + res, PAGE_SIZE - res, ",%s", event->def); if (event->mask) - res += snprintf(buf + res, PAGE_SIZE - res, ",mask=0x%x", + res += scnprintf(buf + res, PAGE_SIZE - res, ",mask=0x%x", event->mask); /* Arguments required by an event */ @@ -344,25 +344,25 @@ static ssize_t arm_ccn_pmu_event_show(struct device *dev, case CCN_TYPE_CYCLES: break; case CCN_TYPE_XP: - res += snprintf(buf + res, PAGE_SIZE - res, + res += scnprintf(buf + res, PAGE_SIZE - res, ",xp=?,vc=?"); if (event->event == CCN_EVENT_WATCHPOINT) - res += snprintf(buf + res, PAGE_SIZE - res, + res += scnprintf(buf + res, PAGE_SIZE - res, ",port=?,dir=?,cmp_l=?,cmp_h=?,mask=?"); else - res += snprintf(buf + res, PAGE_SIZE - res, + res += scnprintf(buf + res, PAGE_SIZE - res, ",bus=?"); break; case CCN_TYPE_MN: - res += snprintf(buf + res, PAGE_SIZE - res, ",node=%d", ccn->mn_id); + res += scnprintf(buf + res, PAGE_SIZE - res, ",node=%d", ccn->mn_id); break; default: - res += snprintf(buf + res, PAGE_SIZE - res, ",node=?"); + res += scnprintf(buf + res, PAGE_SIZE - res, ",node=?"); break; } - res += snprintf(buf + res, PAGE_SIZE - res, "\n"); + res += scnprintf(buf + res, PAGE_SIZE - res, "\n"); return res; } diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c index 4e4984a55cd1..b72c04852599 100644 --- a/drivers/perf/arm_spe_pmu.c +++ b/drivers/perf/arm_spe_pmu.c @@ -831,7 +831,7 @@ static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages, * parts and give userspace a fighting chance of getting some * useful data out of it. 
*/ - if (!nr_pages || (snapshot && (nr_pages & 1))) + if (snapshot && (nr_pages & 1)) return NULL; if (cpu == -1) diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h index 8dd9aeb6cdeb..0566cb3314ef 100644 --- a/include/linux/arch_topology.h +++ b/include/linux/arch_topology.h @@ -30,6 +30,8 @@ static inline unsigned long topology_get_freq_scale(int cpu) return per_cpu(freq_scale, cpu); } +bool arch_freq_counters_available(struct cpumask *cpus); + DECLARE_PER_CPU(unsigned long, thermal_pressure); static inline unsigned long topology_get_thermal_pressure(int cpu) diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 0fb561d1b524..f7240251a949 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -205,6 +205,7 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy) unsigned int cpufreq_get(unsigned int cpu); unsigned int cpufreq_quick_get(unsigned int cpu); unsigned int cpufreq_quick_get_max(unsigned int cpu); +unsigned int cpufreq_get_hw_max_freq(unsigned int cpu); void disable_cpufreq(void); u64 get_cpu_idle_time(unsigned int cpu, u64 *wall, int io_busy); @@ -232,6 +233,10 @@ static inline unsigned int cpufreq_quick_get_max(unsigned int cpu) { return 0; } +static inline unsigned int cpufreq_get_hw_max_freq(unsigned int cpu) +{ + return 0; +} static inline void disable_cpufreq(void) { } #endif diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h index 71f525a35ac2..5b616dde9a4c 100644 --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -80,6 +80,7 @@ struct arm_pmu { struct pmu pmu; cpumask_t supported_cpus; char *name; + int pmuver; irqreturn_t (*handle_irq)(struct arm_pmu *pmu); void (*enable)(struct perf_event *event); void (*disable)(struct perf_event *event); diff --git a/include/linux/stackprotector.h b/include/linux/stackprotector.h index 6b792d080eee..4c678c4fec58 100644 --- a/include/linux/stackprotector.h +++ b/include/linux/stackprotector.h @@ -6,7 +6,7 @@ #include <linux/sched.h> #include <linux/random.h> -#ifdef CONFIG_STACKPROTECTOR +#if defined(CONFIG_STACKPROTECTOR) || defined(CONFIG_ARM64_PTR_AUTH) # include <asm/stackprotector.h> #else static inline void boot_init_stack_canary(void) diff --git a/mm/mremap.c b/mm/mremap.c index af363063ea23..d28f08a36b96 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -606,6 +606,16 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, LIST_HEAD(uf_unmap_early); LIST_HEAD(uf_unmap); + /* + * There is a deliberate asymmetry here: we strip the pointer tag + * from the old address but leave the new address alone. This is + * for consistency with mmap(), where we prevent the creation of + * aliasing mappings in userspace by leaving the tag bits of the + * mapping address intact. A non-zero tag will cause the subsequent + * range checks to reject the address as invalid. + * + * See Documentation/arm64/tagged-address-abi.rst for more information. + */ addr = untagged_addr(addr); if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE)) diff --git a/scripts/Kconfig.include b/scripts/Kconfig.include index 496d11c92c97..8074f14d9d0d 100644 --- a/scripts/Kconfig.include +++ b/scripts/Kconfig.include @@ -31,6 +31,12 @@ cc-option = $(success,$(CC) -Werror $(CLANG_FLAGS) $(1) -S -x c /dev/null -o /de # Return y if the linker supports <flag>, n otherwise ld-option = $(success,$(LD) -v $(1)) +# $(as-option,<flag>) +# /dev/zero is used as output instead of /dev/null because some assemblers complain when +# both input and output are the same file.
Both also have the same write behaviour, so + one can be substituted for the other. +as-option = $(success, $(CC) $(CLANG_FLAGS) $(1) -c -x assembler /dev/null -o /dev/zero) + # $(as-instr,<instr>) # Return y if the assembler supports <instr>, n otherwise as-instr = $(success,printf "%b\n" "$(1)" | $(CC) $(CLANG_FLAGS) -c -x assembler -o /dev/null -)
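As context for the new as-option helper, a minimal usage sketch follows (the config symbol and flag below are hypothetical, chosen purely for illustration; $(comma) is one of the convenience variables already defined at the top of Kconfig.include):

config AS_HAS_ARMV8_3
	def_bool $(as-option,-Wa$(comma)-march=armv8.3-a)

A symbol like this evaluates to y only when the assembler accepts the flag, so other Kconfig symbols and Makefiles can depend on assembler support without risking a build failure.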