summaryrefslogtreecommitdiffstats
path: root/arch/arm64/include
AgeCommit message (Collapse)AuthorFilesLines
2020-08-07asm-generic: pgalloc: provide generic pgd_free()Mike Rapoport1-0/+1
Most architectures define pgd_free() as a wrapper for free_page(). Provide a generic version in asm-generic/pgalloc.h and enable its use for most architectures. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200627143453.31835-7-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07asm-generic: pgalloc: provide generic pud_alloc_one() and pud_free_one()Mike Rapoport1-11/+0
Several architectures define pud_alloc_one() as a wrapper for __get_free_page() and pud_free() as a wrapper for free_page(). Provide a generic implementation in asm-generic/pgalloc.h and use it where appropriate. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200627143453.31835-6-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07asm-generic: pgalloc: provide generic pmd_alloc_one() and pmd_free_one()Mike Rapoport1-26/+1
For most architectures that support >2 levels of page tables, pmd_alloc_one() is a wrapper for __get_free_pages(), sometimes with __GFP_ZERO and sometimes followed by memset(0) instead. More elaborate versions on arm64 and x86 account memory for the user page tables and call to pgtable_pmd_page_ctor() as the part of PMD page initialization. Move the arm64 version to include/asm-generic/pgalloc.h and use the generic version on several architectures. The pgtable_pmd_page_ctor() is a NOP when ARCH_ENABLE_SPLIT_PMD_PTLOCK is not enabled, so there is no functional change for most architectures except of the addition of __GFP_ACCOUNT for allocation of user page tables. The pmd_free() is a wrapper for free_page() in all the cases, so no functional change here. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20200627143453.31835-5-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-06Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds4-25/+19
Pull KVM updates from Paolo Bonzini: "s390: - implement diag318 x86: - Report last CPU for debugging - Emulate smaller MAXPHYADDR in the guest than in the host - .noinstr and tracing fixes from Thomas - nested SVM page table switching optimization and fixes Generic: - Unify shadow MMU cache data structures across architectures" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits) KVM: SVM: Fix sev_pin_memory() error handling KVM: LAPIC: Set the TDCR settable bits KVM: x86: Specify max TDP level via kvm_configure_mmu() KVM: x86/mmu: Rename max_page_level to max_huge_page_level KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch KVM: x86: Pull the PGD's level from the MMU instead of recalculating it KVM: VMX: Make vmx_load_mmu_pgd() static KVM: x86/mmu: Add separate helper for shadow NPT root page role calc KVM: VMX: Drop a duplicate declaration of construct_eptp() KVM: nSVM: Correctly set the shadow NPT root level in its MMU role KVM: Using macros instead of magic values MIPS: KVM: Fix build error caused by 'kvm_run' cleanup KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support KVM: VMX: optimize #PF injection when MAXPHYADDR does not match KVM: VMX: Add guest physical address check in EPT violation and misconfig KVM: VMX: introduce vmx_need_pf_intercept KVM: x86: update exception bitmap on CPUID changes KVM: x86: rename update_bp_intercept to update_exception_bitmap ...
2020-08-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds2-2/+14
Pull networking updates from David Miller: 1) Support 6Ghz band in ath11k driver, from Rajkumar Manoharan. 2) Support UDP segmentation in code TSO code, from Eric Dumazet. 3) Allow flashing different flash images in cxgb4 driver, from Vishal Kulkarni. 4) Add drop frames counter and flow status to tc flower offloading, from Po Liu. 5) Support n-tuple filters in cxgb4, from Vishal Kulkarni. 6) Various new indirect call avoidance, from Eric Dumazet and Brian Vazquez. 7) Fix BPF verifier failures on 32-bit pointer arithmetic, from Yonghong Song. 8) Support querying and setting hardware address of a port function via devlink, use this in mlx5, from Parav Pandit. 9) Support hw ipsec offload on bonding slaves, from Jarod Wilson. 10) Switch qca8k driver over to phylink, from Jonathan McDowell. 11) In bpftool, show list of processes holding BPF FD references to maps, programs, links, and btf objects. From Andrii Nakryiko. 12) Several conversions over to generic power management, from Vaibhav Gupta. 13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry Yakunin. 14) Various https url conversions, from Alexander A. Klimov. 15) Timestamping and PHC support for mscc PHY driver, from Antoine Tenart. 16) Support bpf iterating over tcp and udp sockets, from Yonghong Song. 17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov. 18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan. 19) Fix the ->ndo_start_xmit() return type to be netdev_tx_t in several drivers. From Luc Van Oostenryck. 20) XDP support for xen-netfront, from Denis Kirjanov. 21) Support receive buffer autotuning in MPTCP, from Florian Westphal. 22) Support EF100 chip in sfc driver, from Edward Cree. 23) Add XDP support to mvpp2 driver, from Matteo Croce. 24) Support MPTCP in sock_diag, from Paolo Abeni. 25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic infrastructure, from Jakub Kicinski. 26) Several pci_ --> dma_ API conversions, from Christophe JAILLET. 27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel. 28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki. 29) Refactor a lot of networking socket option handling code in order to avoid set_fs() calls, from Christoph Hellwig. 30) Add rfc4884 support to icmp code, from Willem de Bruijn. 31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei. 32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin. 33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin. 34) Support TCP syncookies in MPTCP, from Flowian Westphal. 35) Fix several tricky cases of PMTU handling wrt. briding, from Stefano Brivio. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits) net: thunderx: initialize VF's mailbox mutex before first usage usb: hso: remove bogus check for EINPROGRESS usb: hso: no complaint about kmalloc failure hso: fix bailout in error case of probe ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM selftests/net: relax cpu affinity requirement in msg_zerocopy test mptcp: be careful on subflow creation selftests: rtnetlink: make kci_test_encap() return sub-test result selftests: rtnetlink: correct the final return value for the test net: dsa: sja1105: use detected device id instead of DT one on mismatch tipc: set ub->ifindex for local ipv6 address ipv6: add ipv6_dev_find() net: openvswitch: silence suspicious RCU usage warning Revert "vxlan: fix tos value before xmit" ptp: only allow phase values lower than 1 period farsync: switch from 'pci_' to 'dma_' API wan: wanxl: switch from 'pci_' to 'dma_' API hv_netvsc: do not use VF device if link is down dpaa2-eth: Fix passing zero to 'PTR_ERR' warning net: macb: Properly handle phylink on at91sam9x ...
2020-08-05random: random.h should include archrandom.h, not the other way aroundLinus Torvalds1-1/+0
This is hopefully the final piece of the crazy puzzle with random.h dependencies. And by "hopefully" I obviously mean "Linus is a hopeless optimist". Reported-and-tested-by: Daniel Díaz <daniel.diaz@linaro.org> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-04Merge tag 'close-range-v5.9' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull close_range() implementation from Christian Brauner: "This adds the close_range() syscall. It allows to efficiently close a range of file descriptors up to all file descriptors of a calling task. This is coordinated with the FreeBSD folks which have copied our version of this syscall and in the meantime have already merged it in April 2019: https://reviews.freebsd.org/D21627 https://svnweb.freebsd.org/base?view=revision&revision=359836 The syscall originally came up in a discussion around the new mount API and making new file descriptor types cloexec by default. During this discussion, Al suggested the close_range() syscall. First, it helps to close all file descriptors of an exec()ing task. This can be done safely via (quoting Al's example from [1] verbatim): /* that exec is sensitive */ unshare(CLONE_FILES); /* we don't want anything past stderr here */ close_range(3, ~0U); execve(....); The code snippet above is one way of working around the problem that file descriptors are not cloexec by default. This is aggravated by the fact that we can't just switch them over without massively regressing userspace. For a whole class of programs having an in-kernel method of closing all file descriptors is very helpful (e.g. demons, service managers, programming language standard libraries, container managers etc.). Second, it allows userspace to avoid implementing closing all file descriptors by parsing through /proc/<pid>/fd/* and calling close() on each file descriptor and other hacks. From looking at various large(ish) userspace code bases this or similar patterns are very common in service managers, container runtimes, and programming language runtimes/standard libraries such as Python or Rust. In addition, the syscall will also work for tasks that do not have procfs mounted and on kernels that do not have procfs support compiled in. In such situations the only way to make sure that all file descriptors are closed is to call close() on each file descriptor up to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery. Based on Linus' suggestion close_range() also comes with a new flag CLOSE_RANGE_UNSHARE to more elegantly handle file descriptor dropping right before exec. This would usually be expressed in the sequence: unshare(CLONE_FILES); close_range(3, ~0U); as pointed out by Linus it might be desirable to have this be a part of close_range() itself under a new flag CLOSE_RANGE_UNSHARE which gets especially handy when we're closing all file descriptors above a certain threshold. Test-suite as always included" * tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLOSE_RANGE_UNSHARE tests close_range: add CLOSE_RANGE_UNSHARE tests: add close_range() tests arch: wire-up close_range() open: add close_range()
2020-08-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller1-0/+12
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-08-04 The following pull-request contains BPF updates for your *net-next* tree. We've added 73 non-merge commits during the last 9 day(s) which contain a total of 135 files changed, 4603 insertions(+), 1013 deletions(-). The main changes are: 1) Implement bpf_link support for XDP. Also add LINK_DETACH operation for the BPF syscall allowing processes with BPF link FD to force-detach, from Andrii Nakryiko. 2) Add BPF iterator for map elements and to iterate all BPF programs for efficient in-kernel inspection, from Yonghong Song and Alexei Starovoitov. 3) Separate bpf_get_{stack,stackid}() helpers for perf events in BPF to avoid unwinder errors, from Song Liu. 4) Allow cgroup local storage map to be shared between programs on the same cgroup. Also extend BPF selftests with coverage, from YiFei Zhu. 5) Add BPF exception tables to ARM64 JIT in order to be able to JIT BPF_PROBE_MEM load instructions, from Jean-Philippe Brucker. 6) Follow-up fixes on BPF socket lookup in combination with reuseport group handling. Also add related BPF selftests, from Jakub Sitnicki. 7) Allow to use socket storage in BPF_PROG_TYPE_CGROUP_SOCK-typed programs for socket create/release as well as bind functions, from Stanislav Fomichev. 8) Fix an info leak in xsk_getsockopt() when retrieving XDP stats via old struct xdp_statistics, from Peilin Ye. 9) Fix PT_REGS_RC{,_CORE}() macros in libbpf for MIPS arch, from Jerry Crunchtime. 10) Extend BPF kernel test infra with skb->family and skb->{local,remote}_ip{4,6} fields and allow user space to specify skb->dev via ifindex, from Dmitry Yakunin. 11) Fix a bpftool segfault due to missing program type name and make it more robust to prevent them in future gaps, from Quentin Monnet. 12) Consolidate cgroup helper functions across selftests and fix a v6 localhost resolver issue, from John Fastabend. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03Merge tag 'sched-core-2020-08-03' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Improve uclamp performance by using a static key for the fast path - Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for better power efficiency of RT tasks on battery powered devices. (The default is to maximize performance & reduce RT latencies.) - Improve utime and stime tracking accuracy, which had a fixed boundary of error, which created larger and larger relative errors as the values become larger. This is now replaced with more precise arithmetics, using the new mul_u64_u64_div_u64() helper in math64.h. - Improve the deadline scheduler, such as making it capacity aware - Improve frequency-invariant scheduling - Misc cleanups in energy/power aware scheduling - Add sched_update_nr_running tracepoint to track changes to nr_running - Documentation additions and updates - Misc cleanups and smaller fixes * tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits) sched/doc: Factorize bits between sched-energy.rst & sched-capacity.rst sched/doc: Document capacity aware scheduling sched: Document arch_scale_*_capacity() arm, arm64: Fix selection of CONFIG_SCHED_THERMAL_PRESSURE Documentation/sysctl: Document uclamp sysctl knobs sched/uclamp: Add a new sysctl to control RT default boost value sched/uclamp: Fix a deadlock when enabling uclamp static key sched: Remove duplicated tick_nohz_full_enabled() check sched: Fix a typo in a comment sched/uclamp: Remove unnecessary mutex_init() arm, arm64: Select CONFIG_SCHED_THERMAL_PRESSURE sched: Cleanup SCHED_THERMAL_PRESSURE kconfig entry arch_topology, sched/core: Cleanup thermal pressure definition trace/events/sched.h: fix duplicated word linux/sched/mm.h: drop duplicated words in comments smp: Fix a potential usage of stale nr_cpus sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal sched: nohz: stop passing around unused "ticks" parameter. sched: Better document ttwu() sched: Add a tracepoint to track rq->nr_running ...
2020-08-03Merge tag 'locking-core-2020-08-03' of ↵Linus Torvalds1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - LKMM updates: mostly documentation changes, but also some new litmus tests for atomic ops. - KCSAN updates: the most important change is that GCC 11 now has all fixes in place to support KCSAN, so GCC support can be enabled again. Also more annotations. - futex updates: minor cleanups and simplifications - seqlock updates: merge preparatory changes/cleanups for the 'associated locks' facilities. - lockdep updates: - simplify IRQ trace event handling - add various new debug checks - simplify header dependencies, split out <linux/lockdep_types.h>, decouple lockdep from other low level headers some more - fix NMI handling - misc cleanups and smaller fixes * tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) kcsan: Improve IRQ state trace reporting lockdep: Refactor IRQ trace events fields into struct seqlock: lockdep assert non-preemptibility on seqcount_t write lockdep: Add preemption enabled/disabled assertion APIs seqlock: Implement raw_seqcount_begin() in terms of raw_read_seqcount() seqlock: Add kernel-doc for seqcount_t and seqlock_t APIs seqlock: Reorder seqcount_t and seqlock_t API definitions seqlock: seqcount_t latch: End read sections with read_seqcount_retry() seqlock: Properly format kernel-doc code samples Documentation: locking: Describe seqlock design and usage locking/qspinlock: Do not include atomic.h from qspinlock_types.h locking/atomic: Move ATOMIC_INIT into linux/types.h lockdep: Move list.h inclusion into lockdep.h locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs futex: Remove unused or redundant includes futex: Consistently use fshared as boolean futex: Remove needless goto's futex: Remove put_futex_key() rwsem: fix commas in initialisation docs: locking: Replace HTTP links with HTTPS ones ...
2020-08-03Merge tag 'arm64-upstream' of ↵Linus Torvalds22-55/+348
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 and cross-arch updates from Catalin Marinas: "Here's a slightly wider-spread set of updates for 5.9. Going outside the usual arch/arm64/ area is the removal of read_barrier_depends() series from Will and the MSI/IOMMU ID translation series from Lorenzo. The notable arm64 updates include ARMv8.4 TLBI range operations and translation level hint, time namespace support, and perf. Summary: - Removal of the tremendously unpopular read_barrier_depends() barrier, which is a NOP on all architectures apart from Alpha, in favour of allowing architectures to override READ_ONCE() and do whatever dance they need to do to ensure address dependencies provide LOAD -> LOAD/STORE ordering. This work also offers a potential solution if compilers are shown to convert LOAD -> LOAD address dependencies into control dependencies (e.g. under LTO), as weakly ordered architectures will effectively be able to upgrade READ_ONCE() to smp_load_acquire(). The latter case is not used yet, but will be discussed further at LPC. - Make the MSI/IOMMU input/output ID translation PCI agnostic, augment the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID bus-specific parameter and apply the resulting changes to the device ID space provided by the Freescale FSL bus. - arm64 support for TLBI range operations and translation table level hints (part of the ARMv8.4 architecture version). - Time namespace support for arm64. - Export the virtual and physical address sizes in vmcoreinfo for makedumpfile and crash utilities. - CPU feature handling cleanups and checks for programmer errors (overlapping bit-fields). - ACPI updates for arm64: disallow AML accesses to EFI code regions and kernel memory. - perf updates for arm64. - Miscellaneous fixes and cleanups, most notably PLT counting optimisation for module loading, recordmcount fix to ignore relocations other than R_AARCH64_CALL26, CMA areas reserved for gigantic pages on 16K and 64K configurations. - Trivial typos, duplicate words" Link: http://lkml.kernel.org/r/20200710165203.31284-1-will@kernel.org Link: http://lkml.kernel.org/r/20200619082013.13661-1-lorenzo.pieralisi@arm.com * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (82 commits) arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack arm64/mm: save memory access in check_and_switch_context() fast switch path arm64: sigcontext.h: delete duplicated word arm64: ptrace.h: delete duplicated word arm64: pgtable-hwdef.h: delete duplicated words bus: fsl-mc: Add ACPI support for fsl-mc bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver of/irq: Make of_msi_map_rid() PCI bus agnostic of/irq: make of_msi_map_get_device_domain() bus agnostic dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus of/device: Add input id to of_dma_configure() of/iommu: Make of_map_rid() PCI agnostic ACPI/IORT: Add an input ID to acpi_dma_configure() ACPI/IORT: Remove useless PCI bus walk ACPI/IORT: Make iort_msi_map_rid() PCI agnostic ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC arm64: enable time namespace support arm64/vdso: Restrict splitting VVAR VMA arm64/vdso: Handle faults on timens page ...
2020-08-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-3/+8
Pull KVM fixes from Paolo Bonzini: "Bugfixes and strengthening the validity checks on inputs from new userspace APIs. Now I know why I shouldn't prepare pull requests on the weekend, it's hard to concentrate if your son is shouting about his latest Minecraft builds in your ear. Fortunately all the patches were ready and I just had to check the test results..." * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SVM: Fix disable pause loop exit/pause filtering capability on SVM KVM: LAPIC: Prevent setting the tscdeadline timer if the lapic is hw disabled KVM: arm64: Don't inherit exec permission across page-table levels KVM: arm64: Prevent vcpu_has_ptrauth from generating OOL functions KVM: nVMX: check for invalid hdr.vmx.flags KVM: nVMX: check for required but missing VMCS12 in KVM_SET_NESTED_STATE selftests: kvm: do not set guest mode flag
2020-08-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller3-5/+5
Resolved kernel/bpf/btf.c using instructions from merge commit 69138b34a7248d2396ab85c8652e20c0c39beaba Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31Merge branch 'for-next/read-barrier-depends' into for-next/coreCatalin Marinas5-6/+10
* for-next/read-barrier-depends: : Allow architectures to override __READ_ONCE() arm64: Reduce the number of header files pulled into vmlinux.lds.S compiler.h: Move compiletime_assert() macros into compiler_types.h checkpatch: Remove checks relating to [smp_]read_barrier_depends() include/linux: Remove smp_read_barrier_depends() from comments tools/memory-model: Remove smp_read_barrier_depends() from informal doc Documentation/barriers/kokr: Remove references to [smp_]read_barrier_depends() Documentation/barriers: Remove references to [smp_]read_barrier_depends() locking/barriers: Remove definitions for [smp_]read_barrier_depends() alpha: Replace smp_read_barrier_depends() usage with smp_[r]mb() vhost: Remove redundant use of read_barrier_depends() barrier asm/rwonce: Don't pull <asm/barrier.h> into 'asm-generic/rwonce.h' asm/rwonce: Remove smp_read_barrier_depends() invocation alpha: Override READ_ONCE() with barriered implementation asm/rwonce: Allow __READ_ONCE to be overridden by the architecture compiler.h: Split {READ,WRITE}_ONCE definitions out into rwonce.h tools: bpf: Use local copy of headers including uapi/linux/filter.h
2020-07-31Merge branch 'for-next/tlbi' into for-next/coreCatalin Marinas8-17/+223
* for-next/tlbi: : Support for TTL (translation table level) hint in the TLB operations arm64: tlb: Use the TLBI RANGE feature in arm64 arm64: enable tlbi range instructions arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature arm64: tlb: don't set the ttl value in flush_tlb_page_nosync arm64: Shift the __tlbi_level() indentation left arm64: tlb: Set the TTL field in flush_*_tlb_range arm64: tlb: Set the TTL field in flush_tlb_range tlb: mmu_gather: add tlb_flush_*_range APIs arm64: Add tlbi_user_level TLB invalidation helper arm64: Add level-hinted TLB invalidation helper arm64: Document SW reserved PTE/PMD bits in Stage-2 descriptors arm64: Detect the ARMv8.4 TTL feature
2020-07-31Merge branches 'for-next/misc', 'for-next/vmcoreinfo', ↵Catalin Marinas9-18/+97
'for-next/cpufeature', 'for-next/acpi', 'for-next/perf', 'for-next/timens', 'for-next/msi-iommu' and 'for-next/trivial' into for-next/core * for-next/misc: : Miscellaneous fixes and cleanups arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack arm64/mm: save memory access in check_and_switch_context() fast switch path recordmcount: only record relocation of type R_AARCH64_CALL26 on arm64. arm64: Reserve HWCAP2_MTE as (1 << 18) arm64/entry: deduplicate SW PAN entry/exit routines arm64: s/AMEVTYPE/AMEVTYPER arm64/hugetlb: Reserve CMA areas for gigantic pages on 16K and 64K configs arm64: stacktrace: Move export for save_stack_trace_tsk() smccc: Make constants available to assembly arm64/mm: Redefine CONT_{PTE, PMD}_SHIFT arm64/defconfig: Enable CONFIG_KEXEC_FILE arm64: Document sysctls for emulated deprecated instructions arm64/panic: Unify all three existing notifier blocks arm64/module: Optimize module load time by optimizing PLT counting * for-next/vmcoreinfo: : Export the virtual and physical address sizes in vmcoreinfo arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo * for-next/cpufeature: : CPU feature handling cleanups arm64/cpufeature: Validate feature bits spacing in arm64_ftr_regs[] arm64/cpufeature: Replace all open bits shift encodings with macros arm64/cpufeature: Add remaining feature bits in ID_AA64MMFR2 register arm64/cpufeature: Add remaining feature bits in ID_AA64MMFR1 register arm64/cpufeature: Add remaining feature bits in ID_AA64MMFR0 register * for-next/acpi: : ACPI updates for arm64 arm64/acpi: disallow writeable AML opregion mapping for EFI code regions arm64/acpi: disallow AML memory opregions to access kernel memory * for-next/perf: : perf updates for arm64 arm64: perf: Expose some new events via sysfs tools headers UAPI: Update tools's copy of linux/perf_event.h arm64: perf: Add cap_user_time_short perf: Add perf_event_mmap_page::cap_user_time_short ABI arm64: perf: Only advertise cap_user_time for arch_timer arm64: perf: Implement correct cap_user_time time/sched_clock: Use raw_read_seqcount_latch() sched_clock: Expose struct clock_read_data arm64: perf: Correct the event index in sysfs perf/smmuv3: To simplify code for ioremap page in pmcg * for-next/timens: : Time namespace support for arm64 arm64: enable time namespace support arm64/vdso: Restrict splitting VVAR VMA arm64/vdso: Handle faults on timens page arm64/vdso: Add time namespace page arm64/vdso: Zap vvar pages when switching to a time namespace arm64/vdso: use the fault callback to map vvar pages * for-next/msi-iommu: : Make the MSI/IOMMU input/output ID translation PCI agnostic, augment the : MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID bus-specific parameter : and apply the resulting changes to the device ID space provided by the : Freescale FSL bus bus: fsl-mc: Add ACPI support for fsl-mc bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver of/irq: Make of_msi_map_rid() PCI bus agnostic of/irq: make of_msi_map_get_device_domain() bus agnostic dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus of/device: Add input id to of_dma_configure() of/iommu: Make of_map_rid() PCI agnostic ACPI/IORT: Add an input ID to acpi_dma_configure() ACPI/IORT: Remove useless PCI bus walk ACPI/IORT: Make iort_msi_map_rid() PCI agnostic ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC * for-next/trivial: : Trivial fixes arm64: sigcontext.h: delete duplicated word arm64: ptrace.h: delete duplicated word arm64: pgtable-hwdef.h: delete duplicated words
2020-07-31Merge tag 'arm64-fixes' of ↵Linus Torvalds3-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The main one is to fix the build after Willy's per-cpu entropy changes this week. Although that was already resolved elsewhere, the arm64 fix here is useful cleanup anyway. Other than that, we've got a fix for building with Clang's integrated assembler and a fix to make our IPv4 checksumming robust against invalid header lengths (this only seems to be triggerable by injected errors). - Fix build breakage due to circular headers - Fix build regression when using Clang's integrated assembler - Fix IPv4 header checksum code to deal with invalid length field - Fix broken path for Arm PMU entry in MAINTAINERS" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: MAINTAINERS: Include drivers subdirs for ARM PMU PROFILING AND DEBUGGING entry arm64: csum: Fix handling of bad packets arm64: Drop unnecessary include from asm/smp.h arm64/alternatives: move length validation inside the subsection
2020-07-31bpf, arm64: Add BPF exception tablesJean-Philippe Brucker1-0/+12
When a tracing BPF program attempts to read memory without using the bpf_probe_read() helper, the verifier marks the load instruction with the BPF_PROBE_MEM flag. Since the arm64 JIT does not currently recognize this flag it falls back to the interpreter. Add support for BPF_PROBE_MEM, by appending an exception table to the BPF program. If the load instruction causes a data abort, the fixup infrastructure finds the exception table and fixes up the fault, by clearing the destination register and jumping over the faulting instruction. To keep the compact exception table entry format, inspect the pc in fixup_exception(). A more generic solution would add a "handler" field to the table entry, like on x86 and s390. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200728152122.1292756-2-jean-philippe@linaro.org
2020-07-30arm64: csum: Fix handling of bad packetsRobin Murphy1-2/+3
Although iph is expected to point to at least 20 bytes of valid memory, ihl may be bogus, for example on reception of a corrupt packet. If it happens to be less than 5, we really don't want to run away and dereference 16GB worth of memory until it wraps back to exactly zero... Fixes: 0e455d8e80aa ("arm64: Implement optimised IP checksum helpers") Reported-by: guodeqing <geffrey.guo@huawei.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-30arm64: Drop unnecessary include from asm/smp.hMarc Zyngier1-1/+0
asm/pointer_auth.h is not needed anymore in asm/smp.h, as 62a679cb2825 ("arm64: simplify ptrauth initialization") removed the keys from the secondary_data structure. This also cures a compilation issue introduced by f227e3ec3b5c ("random32: update the net random state on interrupt and activity"). Fixes: 62a679cb2825 ("arm64: simplify ptrauth initialization") Fixes: f227e3ec3b5c ("random32: update the net random state on interrupt and activity") Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-30arm64/alternatives: move length validation inside the subsectionSami Tolvanen1-2/+2
Commit f7b93d42945c ("arm64/alternatives: use subsections for replacement sequences") breaks LLVM's integrated assembler, because due to its one-pass design, it cannot compute instruction sequence lengths before the layout for the subsection has been finalized. This change fixes the build by moving the .org directives inside the subsection, so they are processed after the subsection layout is known. Fixes: f7b93d42945c ("arm64/alternatives: use subsections for replacement sequences") Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1078 Link: https://lore.kernel.org/r/20200730153701.3892953-1-samitolvanen@google.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-30arm64/mm: save memory access in check_and_switch_context() fast switch pathPingfan Liu1-4/+2
On arm64, smp_processor_id() reads a per-cpu `cpu_number` variable, using the per-cpu offset stored in the tpidr_el1 system register. In some cases we generate a per-cpu address with a sequence like: cpu_ptr = &per_cpu(ptr, smp_processor_id()); Which potentially incurs a cache miss for both `cpu_number` and the in-memory `__per_cpu_offset` array. This can be written more optimally as: cpu_ptr = this_cpu_ptr(ptr); Which only needs the offset from tpidr_el1, and does not need to load from memory. The following two test cases show a small performance improvement measured on a 46-cpus qualcomm machine with 5.8.0-rc4 kernel. Test 1: (about 0.3% improvement) #cat b.sh make clean && make all -j138 #perf stat --repeat 10 --null --sync sh b.sh - before this patch Performance counter stats for 'sh b.sh' (10 runs): 298.62 +- 1.86 seconds time elapsed ( +- 0.62% ) - after this patch Performance counter stats for 'sh b.sh' (10 runs): 297.734 +- 0.954 seconds time elapsed ( +- 0.32% ) Test 2: (about 1.69% improvement) 'perf stat -r 10 perf bench sched messaging' Then sum the total time of 'sched/messaging' by manual. - before this patch total 0.707 sec for 10 times - after this patch totol 0.695 sec for 10 times Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Vladimir Murzin <vladimir.murzin@arm.com> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/1594389852-19949-1-git-send-email-kernelfans@gmail.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-30arm64: sigcontext.h: delete duplicated wordRandy Dunlap1-1/+1
Drop the repeated word "the". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20200726003207.20253-4-rdunlap@infradead.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-30arm64: ptrace.h: delete duplicated wordRandy Dunlap1-1/+1
Drop the repeated word "the". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20200726003207.20253-3-rdunlap@infradead.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-30arm64: pgtable-hwdef.h: delete duplicated wordsRandy Dunlap1-2/+2
Drop the repeated words "at" and "the". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20200726003207.20253-2-rdunlap@infradead.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-29Merge branch 'locking/header'Peter Zijlstra1-2/+0
2020-07-29locking/atomic: Move ATOMIC_INIT into linux/types.hHerbert Xu1-2/+0
This patch moves ATOMIC_INIT from asm/atomic.h into linux/types.h. This allows users of atomic_t to use ATOMIC_INIT without having to include atomic.h as that way may lead to header loops. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/20200729123105.GB7047@gondor.apana.org.au
2020-07-28KVM: arm64: Prevent vcpu_has_ptrauth from generating OOL functionsMarc Zyngier1-3/+8
So far, vcpu_has_ptrauth() is implemented in terms of system_supports_*_auth() calls, which are declared "inline". In some specific conditions (clang and SCS), the "inline" very much turns into an "out of line", which leads to a fireworks when this predicate is evaluated on a non-VHE system (right at the beginning of __hyp_handle_ptrauth). Instead, make sure vcpu_has_ptrauth gets expanded inline by directly using the cpus_have_final_cap() helpers, which are __always_inline, generate much better code, and are the only thing that make sense when running at EL2 on a nVHE system. Fixes: 29eb5a3c57f7 ("KVM: arm64: Handle PtrAuth traps early") Reported-by: Nathan Chancellor <natechancellor@gmail.com> Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://lore.kernel.org/r/20200722162231.3689767-1-maz@kernel.org
2020-07-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller3-1/+14
The UDP reuseport conflict was a little bit tricky. The net-next code, via bpf-next, extracted the reuseport handling into a helper so that the BPF sk lookup code could invoke it. At the same time, the logic for reuseport handling of unconnected sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace which changed the logic to carry on the reuseport result into the rest of the lookup loop if we do not return immediately. This requires moving the reuseport_has_conns() logic into the callers. While we are here, get rid of inline directives as they do not belong in foo.c files. The other changes were cases of more straightforward overlapping modifications. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24arm64/vdso: Add time namespace pageAndrei Vagin3-0/+22
Allocate the time namespace page among VVAR pages. Provide __arch_get_timens_vdso_data() helper for VDSO code to get the code-relative position of VVARs on that special page. If a task belongs to a time namespace then the VVAR page which contains the system wide VDSO data is replaced with a namespace specific page which has the same layout as the VVAR page. That page has vdso_data->seq set to 1 to enforce the slow path and vdso_data->clock_mode set to VCLOCK_TIMENS to enforce the time namespace handling path. The extra check in the case that vdso_data->seq is odd, e.g. a concurrent update of the VDSO data is in progress, is not really affecting regular tasks which are not part of a time namespace as the task is spin waiting for the update to finish and vdso_data->seq to become even again. If a time namespace task hits that code path, it invokes the corresponding time getter function which retrieves the real VVAR page, reads host time and then adds the offset for the requested clock which is stored in the special VVAR page. The time-namespace page isn't allocated on !CONFIG_TIME_NAMESPACE, but vma is the same size, which simplifies criu/vdso migration between different kernel configs. Signed-off-by: Andrei Vagin <avagin@gmail.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Dmitry Safonov <dima@arista.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200624083321.144975-4-avagin@gmail.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-24arm64: Reserve HWCAP2_MTE as (1 << 18)Catalin Marinas2-0/+2
While MTE is not supported in the upstream kernel yet, add a comment that HWCAP2_MTE as (1 << 18) is reserved. Glibc makes use of it for the resolving (ifunc) of the MTE-safe string routines. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-22arm64: s/AMEVTYPE/AMEVTYPERVladimir Murzin1-2/+2
Activity Monitor Event Type Registers are named as AMEVTYPER{0,1}<n> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200721091259.102756-1-vladimir.murzin@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-22arch_topology, sched/core: Cleanup thermal pressure definitionValentin Schneider1-1/+2
The following commit: 14533a16c46d ("thermal/cpu-cooling, sched/core: Move the arch_set_thermal_pressure() API to generic scheduler code") moved the definition of arch_set_thermal_pressure() to sched/core.c, but kept its declaration in linux/arch_topology.h. When building e.g. an x86 kernel with CONFIG_SCHED_THERMAL_PRESSURE=y, cpufreq_cooling.c ends up getting the declaration of arch_set_thermal_pressure() from include/linux/arch_topology.h, which is somewhat awkward. On top of this, sched/core.c unconditionally defines o The thermal_pressure percpu variable o arch_set_thermal_pressure() while arch_scale_thermal_pressure() does nothing unless redefined by the architecture. arch_*() functions are meant to be defined by architectures, so revert the aforementioned commit and re-implement it in a way that keeps arch_set_thermal_pressure() architecture-definable, and doesn't define the thermal pressure percpu variable for kernels that don't need it (CONFIG_SCHED_THERMAL_PRESSURE=n). Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200712165917.9168-2-valentin.schneider@arm.com
2020-07-21arm64: perf: Expose some new events via sysfsShaokun Zhang1-0/+27
Some new PMU events can been detected by PMCEID1_EL0, but it can't be listed, Let's expose these through sysfs. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1595328573-12751-2-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-21arm64: Reduce the number of header files pulled into vmlinux.lds.SWill Deacon3-6/+8
Although vmlinux.lds.S smells like an assembly file and is compiled with __ASSEMBLY__ defined, it's actually just fed to the preprocessor to create our linker script. This means that any assembly macros defined by headers that it includes will result in a helpful link error: | aarch64-linux-gnu-ld:./arch/arm64/kernel/vmlinux.lds:1: syntax error In preparation for an arm64-private asm/rwonce.h implementation, which will end up pulling assembly macros into linux/compiler.h, reduce the number of headers we include directly and transitively in vmlinux.lds.S Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-21asm/rwonce: Don't pull <asm/barrier.h> into 'asm-generic/rwonce.h'Will Deacon2-0/+2
Now that 'smp_read_barrier_depends()' has gone the way of the Norwegian Blue, drop the inclusion of <asm/barrier.h> in 'asm-generic/rwonce.h'. This requires fixups to some architecture vdso headers which were previously relying on 'asm/barrier.h' coming in via 'linux/compiler.h'. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-19net: remove compat_sys_{get,set}sockoptChristoph Hellwig1-2/+2
Now that the ->compat_{get,set}sockopt proto_ops methods are gone there is no good reason left to keep the compat syscalls separate. This fixes the odd use of unsigned int for the compat_setsockopt optlen and the missing sock_use_custom_sol_socket. It would also easily allow running the eBPF hooks for the compat syscalls, but such a large change in behavior does not belong into a consolidation patch like this one. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17Merge tag 'arm64-fixes' of ↵Linus Torvalds3-1/+14
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into master Pull arm64 fixes from Will Deacon: "A batch of arm64 fixes. Although the diffstat is a bit larger than we'd usually have at this stage, a decent amount of it is the addition of comments describing our syscall tracing behaviour, and also a sweep across all the modular arm64 PMU drivers to make them rebust against unloading and unbinding. There are a couple of minor things kicking around at the moment (CPU errata and module PLTs for very large modules), but I'm not expecting any significant changes now for us in 5.8. - Fix kernel text addresses for relocatable images booting using EFI and with KASLR disabled so that they match the vmlinux ELF binary. - Fix unloading and unbinding of PMU driver modules. - Fix generic mmiowb() when writeX() is called from preemptible context (reported by the riscv folks). - Fix ptrace hardware single-step interactions with signal handlers, system calls and reverse debugging. - Fix reporting of 64-bit x0 register for 32-bit tasks via 'perf_regs'. - Add comments describing syscall entry/exit tracing ABI" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: drivers/perf: Prevent forced unbinding of PMU drivers asm-generic/mmiowb: Allow mmiowb_set_pending() when preemptible() arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEP arm64: ptrace: Use NO_SYSCALL instead of -1 in syscall_trace_enter() arm64: syscall: Expand the comment about ptrace and syscall(-1) arm64: ptrace: Add a comment describing our syscall entry/exit trap ABI arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return arm64: ptrace: Override SPSR.SS when single-stepping is enabled arm64: ptrace: Consistently use pseudo-singlestep exceptions drivers/perf: Fix kernel panic when rmmod PMU modules during perf sampling efi/libstub/arm64: Retain 2MB kernel Image alignment if !KASLR
2020-07-16arm64: compat: Ensure upper 32 bits of x0 are zero on syscall returnWill Deacon1-1/+11
Although we zero the upper bits of x0 on entry to the kernel from an AArch32 task, we do not clear them on the exception return path and can therefore expose 64-bit sign extended syscall return values to userspace via interfaces such as the 'perf_regs' ABI, which deal exclusively with 64-bit registers. Explicitly clear the upper 32 bits of x0 on return from a compat system call. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Cc: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: ptrace: Override SPSR.SS when single-stepping is enabledWill Deacon1-0/+2
Luis reports that, when reverse debugging with GDB, single-step does not function as expected on arm64: | I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP | request by GDB won't execute the underlying instruction. As a consequence, | the PC doesn't move, but we return a SIGTRAP just like we would for a | regular successful PTRACE_SINGLESTEP request. The underlying problem is that when the CPU register state is restored as part of a reverse step, the SPSR.SS bit is cleared and so the hardware single-step state can transition to the "active-pending" state, causing an unexpected step exception to be taken immediately if a step operation is attempted. In hindsight, we probably shouldn't have exposed SPSR.SS in the pstate accessible by the GPR regset, but it's a bit late for that now. Instead, simply prevent userspace from configuring the bit to a value which is inconsistent with the TIF_SINGLESTEP state for the task being traced. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Link: https://lore.kernel.org/r/1eed6d69-d53d-9657-1fc9-c089be07f98c@linaro.org Reported-by: Luis Machado <luis.machado@linaro.org> Tested-by: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: ptrace: Consistently use pseudo-singlestep exceptionsWill Deacon1-0/+1
Although the arm64 single-step state machine can be fast-forwarded in cases where we wish to generate a SIGTRAP without actually executing an instruction, this has two major limitations outside of simply skipping an instruction due to emulation. 1. Stepping out of a ptrace signal stop into a signal handler where SIGTRAP is blocked. Fast-forwarding the stepping state machine in this case will result in a forced SIGTRAP, with the handler reset to SIG_DFL. 2. The hardware implicitly fast-forwards the state machine when executing an SVC instruction for issuing a system call. This can interact badly with subsequent ptrace stops signalled during the execution of the system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt the stepping state by updating the PSTATE for the tracee. Resolve both of these issues by injecting a pseudo-singlestep exception on entry to a signal handler and also on return to userspace following a system call. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Tested-by: Luis Machado <luis.machado@linaro.org> Reported-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-15arm64: tlb: Use the TLBI RANGE feature in arm64Zhenyu Ye2-29/+131
Add __TLBI_VADDR_RANGE macro and rewrite __flush_tlb_range(). When cpu supports TLBI feature, the minimum range granularity is decided by 'scale', so we can not flush all pages by one instruction in some cases. For example, when the pages = 0xe81a, let's start 'scale' from maximum, and find right 'num' for each 'scale': 1. scale = 3, we can flush no pages because the minimum range is 2^(5*3 + 1) = 0x10000. 2. scale = 2, the minimum range is 2^(5*2 + 1) = 0x800, we can flush 0xe800 pages this time, the num = 0xe800/0x800 - 1 = 0x1c. Remaining pages is 0x1a; 3. scale = 1, the minimum range is 2^(5*1 + 1) = 0x40, no page can be flushed. 4. scale = 0, we flush the remaining 0x1a pages, the num = 0x1a/0x2 - 1 = 0xd. However, in most scenarios, the pages = 1 when flush_tlb_range() is called. Start from scale = 3 or other proper value (such as scale = ilog2(pages)), will incur extra overhead. So increase 'scale' from 0 to maximum, the flush order is exactly opposite to the example. Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com> Link: https://lore.kernel.org/r/20200715071945.897-4-yezhenyu2@huawei.com [catalin.marinas@arm.com: removed unnecessary masks in __TLBI_VADDR_RANGE] [catalin.marinas@arm.com: __TLB_RANGE_NUM subtracts 1] [catalin.marinas@arm.com: minor adjustments to the comments] [catalin.marinas@arm.com: introduce system_supports_tlb_range()] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-15arm64: tlb: Detect the ARMv8.4 TLBI RANGE featureZhenyu Ye2-1/+5
ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a range of input addresses. This patch detect this feature. Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com> Link: https://lore.kernel.org/r/20200715071945.897-2-yezhenyu2@huawei.com [catalin.marinas@arm.com: some renaming for consistency] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-15arm64/hugetlb: Reserve CMA areas for gigantic pages on 16K and 64K configsAnshuman Khandual1-0/+2
Currently 'hugetlb_cma=' command line argument does not create CMA area on ARM64_16K_PAGES and ARM64_64K_PAGES based platforms. Instead, it just ends up with the following warning message. Reason being, hugetlb_cma_reserve() never gets called for these huge page sizes. [ 64.255669] hugetlb_cma: the option isn't supported by current arch This enables CMA areas reservation on ARM64_16K_PAGES and ARM64_64K_PAGES configs by defining an unified arm64_hugetlb_cma_reseve() that is wrapped in CONFIG_CMA. Call site for arm64_hugetlb_cma_reserve() is also protected as <asm/hugetlb.h> is conditionally included and hence cannot contain stub for the inverse config i.e !(CONFIG_HUGETLB_PAGE && CONFIG_CMA). Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Barry Song <song.bao.hua@hisilicon.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1593578521-24672-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-14arm64/acpi: disallow AML memory opregions to access kernel memoryArd Biesheuvel1-14/+1
AML uses SystemMemory opregions to allow AML handlers to access MMIO registers of, e.g., GPIO controllers, or access reserved regions of memory that are owned by the firmware. Currently, we also allow AML access to memory that is owned by the kernel and mapped via the linear region, which does not seem to be supported by a valid use case, and exposes the kernel's internal state to AML methods that may be buggy and exploitable. On arm64, ACPI support requires booting in EFI mode, and so we can cross reference the requested region against the EFI memory map, rather than just do a minimal check on the first page. So let's only permit regions to be remapped by the ACPI core if - they don't appear in the EFI memory map at all (which is the case for most MMIO), or - they are covered by a single region in the EFI memory map, which is not of a type that describes memory that is given to the kernel at boot. Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Link: https://lore.kernel.org/r/20200626155832.2323789-2-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-10Merge tag 'arm64-fixes' of ↵Linus Torvalds4-4/+14
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "An unfortunately large collection of arm64 fixes for -rc5. Some of this is absolutely trivial, but the alternatives, vDSO and CPU errata workaround fixes are significant. At least people are finding and fixing these things, I suppose. - Fix workaround for CPU erratum #1418040 to disable the compat vDSO - Fix Oops when single-stepping with KGDB - Fix memory attributes for hypervisor device mappings at EL2 - Fix memory leak in PSCI and remove useless variable assignment - Fix up some comments and asm labels in our entry code - Fix broken register table formatting in our generated html docs - Fix missing NULL sentinel in CPU errata workaround list - Fix patching of branches in alternative instruction sections" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/alternatives: don't patch up internal branches arm64: Add missing sentinel to erratum_1463225 arm64: Documentation: Fix broken table in generated HTML arm64: kgdb: Fix single-step exception handling oops arm64: entry: Tidy up block comments and label numbers arm64: Rework ARM_ERRATUM_1414080 handling arm64: arch_timer: Disable the compat vdso for cores affected by ARM64_WORKAROUND_1418040 arm64: arch_timer: Allow an workaround descriptor to disable compat vdso arm64: Introduce a way to disable the 32bit vdso arm64: entry: Fix the typo in the comment of el1_dbg() drivers/firmware/psci: Assign @err directly in hotplug_tests() drivers/firmware/psci: Fix memory leakage in alloc_init_cpu_groups() KVM: arm64: Fix definition of PAGE_HYP_DEVICE
2020-07-10arm64: tlb: don't set the ttl value in flush_tlb_page_nosyncZhenyu Ye1-3/+2
flush_tlb_page_nosync() may be called from pmd level, so we can not set the ttl = 3 here. The callstack is as follows: pmdp_set_access_flags ptep_set_access_flags flush_tlb_fix_spurious_fault flush_tlb_page flush_tlb_page_nosync Fixes: e735b98a5fe0 ("arm64: Add tlbi_user_level TLB invalidation helper") Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com> Link: https://lore.kernel.org/r/20200710094158.468-1-yezhenyu2@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-10KVM: arm64: clean up redundant 'kvm_run' parametersTianjia Zhang3-14/+11
In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu' structure. For historical reasons, many kvm-related function parameters retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This patch does a unified cleanup of these remaining redundant parameters. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200623131418.31473-3-tianjia.zhang@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09KVM: arm64: Use common KVM implementation of MMU memory cachesSean Christopherson3-13/+8
Move to the common MMU memory cache implementation now that the common code and arm64's existing code are semantically compatible. No functional change intended. Cc: Marc Zyngier <maz@kernel.org> Suggested-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-19-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09KVM: arm64: Use common code's approach for __GFP_ZERO with memory cachesSean Christopherson1-0/+1
Add a "gfp_zero" member to arm64's 'struct kvm_mmu_memory_cache' to make the struct and its usage compatible with the common 'struct kvm_mmu_memory_cache' in linux/kvm_host.h. This will minimize code churn when arm64 moves to the common implementation in a future patch, at the cost of temporarily having somewhat silly code. Note, GFP_PGTABLE_USER is equivalent to GFP_KERNEL_ACCOUNT | GFP_ZERO: #define GFP_PGTABLE_USER (GFP_PGTABLE_KERNEL | __GFP_ACCOUNT) | -> #define GFP_PGTABLE_KERNEL (GFP_KERNEL | __GFP_ZERO) == GFP_KERNEL | __GFP_ACCOUNT | __GFP_ZERO versus #define GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT) with __GFP_ZERO explicitly OR'd in == GFP_KERNEL | __GFP_ACCOUNT | __GFP_ZERO No functional change intended. Tested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-18-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>