summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/admin-guide/perf/arm-cmn.rst65
-rw-r--r--Documentation/admin-guide/perf/index.rst1
-rw-r--r--Documentation/arm64/cpu-feature-registers.rst2
-rw-r--r--Documentation/arm64/elf_hwcaps.rst4
-rw-r--r--Documentation/arm64/index.rst1
-rw-r--r--Documentation/arm64/memory-tagging-extension.rst305
-rw-r--r--Documentation/devicetree/bindings/perf/arm,cmn.yaml57
-rw-r--r--Documentation/virt/kvm/arm/hyp-abi.rst6
8 files changed, 438 insertions, 3 deletions
diff --git a/Documentation/admin-guide/perf/arm-cmn.rst b/Documentation/admin-guide/perf/arm-cmn.rst
new file mode 100644
index 000000000000..0e4809346014
--- /dev/null
+++ b/Documentation/admin-guide/perf/arm-cmn.rst
@@ -0,0 +1,65 @@
+=============================
+Arm Coherent Mesh Network PMU
+=============================
+
+CMN-600 is a configurable mesh interconnect consisting of a rectangular
+grid of crosspoints (XPs), with each crosspoint supporting up to two
+device ports to which various AMBA CHI agents are attached.
+
+CMN implements a distributed PMU design as part of its debug and trace
+functionality. This consists of a local monitor (DTM) at every XP, which
+counts up to 4 event signals from the connected device nodes and/or the
+XP itself. Overflow from these local counters is accumulated in up to 8
+global counters implemented by the main controller (DTC), which provides
+overall PMU control and interrupts for global counter overflow.
+
+PMU events
+----------
+
+The PMU driver registers a single PMU device for the whole interconnect,
+see /sys/bus/event_source/devices/arm_cmn. Multi-chip systems may link
+more than one CMN together via external CCIX links - in this situation,
+each mesh counts its own events entirely independently, and additional
+PMU devices will be named arm_cmn_{1..n}.
+
+Most events are specified in a format based directly on the TRM
+definitions - "type" selects the respective node type, and "eventid" the
+event number. Some events require an additional occupancy ID, which is
+specified by "occupid".
+
+* Since RN-D nodes do not have any distinct events from RN-I nodes, they
+ are treated as the same type (0xa), and the common event templates are
+ named "rnid_*".
+
+* The cycle counter is treated as a synthetic event belonging to the DTC
+ node ("type" == 0x3, "eventid" is ignored).
+
+* XP events also encode the port and channel in the "eventid" field, to
+ match the underlying pmu_event0_id encoding for the pmu_event_sel
+ register. The event templates are named with prefixes to cover all
+ permutations.
+
+By default each event provides an aggregate count over all nodes of the
+given type. To target a specific node, "bynodeid" must be set to 1 and
+"nodeid" to the appropriate value derived from the CMN configuration
+(as defined in the "Node ID Mapping" section of the TRM).
+
+Watchpoints
+-----------
+
+The PMU can also count watchpoint events to monitor specific flit
+traffic. Watchpoints are treated as a synthetic event type, and like PMU
+events can be global or targeted with a particular XP's "nodeid" value.
+Since the watchpoint direction is otherwise implicit in the underlying
+register selection, separate events are provided for flit uploads and
+downloads.
+
+The flit match value and mask are passed in config1 and config2 ("val"
+and "mask" respectively). "wp_dev_sel", "wp_chn_sel", "wp_grp" and
+"wp_exclusive" are specified per the TRM definitions for dtm_wp_config0.
+Where a watchpoint needs to match fields from both match groups on the
+REQ or SNP channel, it can be specified as two events - one for each
+group - with the same nonzero "combine" value. The count for such a
+pair of combined events will be attributed to the primary match.
+Watchpoint events with a "combine" value of 0 are considered independent
+and will count individually.
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 47c99f40cc16..5a8f2529a033 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -12,6 +12,7 @@ Performance monitor support
qcom_l2_pmu
qcom_l3_pmu
arm-ccn
+ arm-cmn
xgene-pmu
arm_dsu_pmu
thunderx2-pmu
diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
index f28853f80089..328e0c454fbd 100644
--- a/Documentation/arm64/cpu-feature-registers.rst
+++ b/Documentation/arm64/cpu-feature-registers.rst
@@ -175,6 +175,8 @@ infrastructure:
+------------------------------+---------+---------+
| Name | bits | visible |
+------------------------------+---------+---------+
+ | MTE | [11-8] | y |
+ +------------------------------+---------+---------+
| SSBS | [7-4] | y |
+------------------------------+---------+---------+
| BT | [3-0] | y |
diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
index 84a9fd2d41b4..bbd9cf54db6c 100644
--- a/Documentation/arm64/elf_hwcaps.rst
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -240,6 +240,10 @@ HWCAP2_BTI
Functionality implied by ID_AA64PFR0_EL1.BT == 0b0001.
+HWCAP2_MTE
+
+ Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0010, as described
+ by Documentation/arm64/memory-tagging-extension.rst.
4. Unused AT_HWCAP bits
-----------------------
diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
index d9665d83c53a..43b0939d384e 100644
--- a/Documentation/arm64/index.rst
+++ b/Documentation/arm64/index.rst
@@ -14,6 +14,7 @@ ARM64 Architecture
hugetlbpage
legacy_instructions
memory
+ memory-tagging-extension
perf
pointer-authentication
silicon-errata
diff --git a/Documentation/arm64/memory-tagging-extension.rst b/Documentation/arm64/memory-tagging-extension.rst
new file mode 100644
index 000000000000..034d37c605e8
--- /dev/null
+++ b/Documentation/arm64/memory-tagging-extension.rst
@@ -0,0 +1,305 @@
+===============================================
+Memory Tagging Extension (MTE) in AArch64 Linux
+===============================================
+
+Authors: Vincenzo Frascino <vincenzo.frascino@arm.com>
+ Catalin Marinas <catalin.marinas@arm.com>
+
+Date: 2020-02-25
+
+This document describes the provision of the Memory Tagging Extension
+functionality in AArch64 Linux.
+
+Introduction
+============
+
+ARMv8.5 based processors introduce the Memory Tagging Extension (MTE)
+feature. MTE is built on top of the ARMv8.0 virtual address tagging TBI
+(Top Byte Ignore) feature and allows software to access a 4-bit
+allocation tag for each 16-byte granule in the physical address space.
+Such memory range must be mapped with the Normal-Tagged memory
+attribute. A logical tag is derived from bits 59-56 of the virtual
+address used for the memory access. A CPU with MTE enabled will compare
+the logical tag against the allocation tag and potentially raise an
+exception on mismatch, subject to system registers configuration.
+
+Userspace Support
+=================
+
+When ``CONFIG_ARM64_MTE`` is selected and Memory Tagging Extension is
+supported by the hardware, the kernel advertises the feature to
+userspace via ``HWCAP2_MTE``.
+
+PROT_MTE
+--------
+
+To access the allocation tags, a user process must enable the Tagged
+memory attribute on an address range using a new ``prot`` flag for
+``mmap()`` and ``mprotect()``:
+
+``PROT_MTE`` - Pages allow access to the MTE allocation tags.
+
+The allocation tag is set to 0 when such pages are first mapped in the
+user address space and preserved on copy-on-write. ``MAP_SHARED`` is
+supported and the allocation tags can be shared between processes.
+
+**Note**: ``PROT_MTE`` is only supported on ``MAP_ANONYMOUS`` and
+RAM-based file mappings (``tmpfs``, ``memfd``). Passing it to other
+types of mapping will result in ``-EINVAL`` returned by these system
+calls.
+
+**Note**: The ``PROT_MTE`` flag (and corresponding memory type) cannot
+be cleared by ``mprotect()``.
+
+**Note**: ``madvise()`` memory ranges with ``MADV_DONTNEED`` and
+``MADV_FREE`` may have the allocation tags cleared (set to 0) at any
+point after the system call.
+
+Tag Check Faults
+----------------
+
+When ``PROT_MTE`` is enabled on an address range and a mismatch between
+the logical and allocation tags occurs on access, there are three
+configurable behaviours:
+
+- *Ignore* - This is the default mode. The CPU (and kernel) ignores the
+ tag check fault.
+
+- *Synchronous* - The kernel raises a ``SIGSEGV`` synchronously, with
+ ``.si_code = SEGV_MTESERR`` and ``.si_addr = <fault-address>``. The
+ memory access is not performed. If ``SIGSEGV`` is ignored or blocked
+ by the offending thread, the containing process is terminated with a
+ ``coredump``.
+
+- *Asynchronous* - The kernel raises a ``SIGSEGV``, in the offending
+ thread, asynchronously following one or multiple tag check faults,
+ with ``.si_code = SEGV_MTEAERR`` and ``.si_addr = 0`` (the faulting
+ address is unknown).
+
+The user can select the above modes, per thread, using the
+``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
+``flags`` contain one of the following values in the ``PR_MTE_TCF_MASK``
+bit-field:
+
+- ``PR_MTE_TCF_NONE`` - *Ignore* tag check faults
+- ``PR_MTE_TCF_SYNC`` - *Synchronous* tag check fault mode
+- ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode
+
+The current tag check fault mode can be read using the
+``prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)`` system call.
+
+Tag checking can also be disabled for a user thread by setting the
+``PSTATE.TCO`` bit with ``MSR TCO, #1``.
+
+**Note**: Signal handlers are always invoked with ``PSTATE.TCO = 0``,
+irrespective of the interrupted context. ``PSTATE.TCO`` is restored on
+``sigreturn()``.
+
+**Note**: There are no *match-all* logical tags available for user
+applications.
+
+**Note**: Kernel accesses to the user address space (e.g. ``read()``
+system call) are not checked if the user thread tag checking mode is
+``PR_MTE_TCF_NONE`` or ``PR_MTE_TCF_ASYNC``. If the tag checking mode is
+``PR_MTE_TCF_SYNC``, the kernel makes a best effort to check its user
+address accesses, however it cannot always guarantee it.
+
+Excluding Tags in the ``IRG``, ``ADDG`` and ``SUBG`` instructions
+-----------------------------------------------------------------
+
+The architecture allows excluding certain tags to be randomly generated
+via the ``GCR_EL1.Exclude`` register bit-field. By default, Linux
+excludes all tags other than 0. A user thread can enable specific tags
+in the randomly generated set using the ``prctl(PR_SET_TAGGED_ADDR_CTRL,
+flags, 0, 0, 0)`` system call where ``flags`` contains the tags bitmap
+in the ``PR_MTE_TAG_MASK`` bit-field.
+
+**Note**: The hardware uses an exclude mask but the ``prctl()``
+interface provides an include mask. An include mask of ``0`` (exclusion
+mask ``0xffff``) results in the CPU always generating tag ``0``.
+
+Initial process state
+---------------------
+
+On ``execve()``, the new process has the following configuration:
+
+- ``PR_TAGGED_ADDR_ENABLE`` set to 0 (disabled)
+- Tag checking mode set to ``PR_MTE_TCF_NONE``
+- ``PR_MTE_TAG_MASK`` set to 0 (all tags excluded)
+- ``PSTATE.TCO`` set to 0
+- ``PROT_MTE`` not set on any of the initial memory maps
+
+On ``fork()``, the new process inherits the parent's configuration and
+memory map attributes with the exception of the ``madvise()`` ranges
+with ``MADV_WIPEONFORK`` which will have the data and tags cleared (set
+to 0).
+
+The ``ptrace()`` interface
+--------------------------
+
+``PTRACE_PEEKMTETAGS`` and ``PTRACE_POKEMTETAGS`` allow a tracer to read
+the tags from or set the tags to a tracee's address space. The
+``ptrace()`` system call is invoked as ``ptrace(request, pid, addr,
+data)`` where:
+
+- ``request`` - one of ``PTRACE_PEEKMTETAGS`` or ``PTRACE_POKEMTETAGS``.
+- ``pid`` - the tracee's PID.
+- ``addr`` - address in the tracee's address space.
+- ``data`` - pointer to a ``struct iovec`` where ``iov_base`` points to
+ a buffer of ``iov_len`` length in the tracer's address space.
+
+The tags in the tracer's ``iov_base`` buffer are represented as one
+4-bit tag per byte and correspond to a 16-byte MTE tag granule in the
+tracee's address space.
+
+**Note**: If ``addr`` is not aligned to a 16-byte granule, the kernel
+will use the corresponding aligned address.
+
+``ptrace()`` return value:
+
+- 0 - tags were copied, the tracer's ``iov_len`` was updated to the
+ number of tags transferred. This may be smaller than the requested
+ ``iov_len`` if the requested address range in the tracee's or the
+ tracer's space cannot be accessed or does not have valid tags.
+- ``-EPERM`` - the specified process cannot be traced.
+- ``-EIO`` - the tracee's address range cannot be accessed (e.g. invalid
+ address) and no tags copied. ``iov_len`` not updated.
+- ``-EFAULT`` - fault on accessing the tracer's memory (``struct iovec``
+ or ``iov_base`` buffer) and no tags copied. ``iov_len`` not updated.
+- ``-EOPNOTSUPP`` - the tracee's address does not have valid tags (never
+ mapped with the ``PROT_MTE`` flag). ``iov_len`` not updated.
+
+**Note**: There are no transient errors for the requests above, so user
+programs should not retry in case of a non-zero system call return.
+
+``PTRACE_GETREGSET`` and ``PTRACE_SETREGSET`` with ``addr ==
+``NT_ARM_TAGGED_ADDR_CTRL`` allow ``ptrace()`` access to the tagged
+address ABI control and MTE configuration of a process as per the
+``prctl()`` options described in
+Documentation/arm64/tagged-address-abi.rst and above. The corresponding
+``regset`` is 1 element of 8 bytes (``sizeof(long))``).
+
+Example of correct usage
+========================
+
+*MTE Example code*
+
+.. code-block:: c
+
+ /*
+ * To be compiled with -march=armv8.5-a+memtag
+ */
+ #include <errno.h>
+ #include <stdint.h>
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <unistd.h>
+ #include <sys/auxv.h>
+ #include <sys/mman.h>
+ #include <sys/prctl.h>
+
+ /*
+ * From arch/arm64/include/uapi/asm/hwcap.h
+ */
+ #define HWCAP2_MTE (1 << 18)
+
+ /*
+ * From arch/arm64/include/uapi/asm/mman.h
+ */
+ #define PROT_MTE 0x20
+
+ /*
+ * From include/uapi/linux/prctl.h
+ */
+ #define PR_SET_TAGGED_ADDR_CTRL 55
+ #define PR_GET_TAGGED_ADDR_CTRL 56
+ # define PR_TAGGED_ADDR_ENABLE (1UL << 0)
+ # define PR_MTE_TCF_SHIFT 1
+ # define PR_MTE_TCF_NONE (0UL << PR_MTE_TCF_SHIFT)
+ # define PR_MTE_TCF_SYNC (1UL << PR_MTE_TCF_SHIFT)
+ # define PR_MTE_TCF_ASYNC (2UL << PR_MTE_TCF_SHIFT)
+ # define PR_MTE_TCF_MASK (3UL << PR_MTE_TCF_SHIFT)
+ # define PR_MTE_TAG_SHIFT 3
+ # define PR_MTE_TAG_MASK (0xffffUL << PR_MTE_TAG_SHIFT)
+
+ /*
+ * Insert a random logical tag into the given pointer.
+ */
+ #define insert_random_tag(ptr) ({ \
+ uint64_t __val; \
+ asm("irg %0, %1" : "=r" (__val) : "r" (ptr)); \
+ __val; \
+ })
+
+ /*
+ * Set the allocation tag on the destination address.
+ */
+ #define set_tag(tagged_addr) do { \
+ asm volatile("stg %0, [%0]" : : "r" (tagged_addr) : "memory"); \
+ } while (0)
+
+ int main()
+ {
+ unsigned char *a;
+ unsigned long page_sz = sysconf(_SC_PAGESIZE);
+ unsigned long hwcap2 = getauxval(AT_HWCAP2);
+
+ /* check if MTE is present */
+ if (!(hwcap2 & HWCAP2_MTE))
+ return EXIT_FAILURE;
+
+ /*
+ * Enable the tagged address ABI, synchronous MTE tag check faults and
+ * allow all non-zero tags in the randomly generated set.
+ */
+ if (prctl(PR_SET_TAGGED_ADDR_CTRL,
+ PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | (0xfffe << PR_MTE_TAG_SHIFT),
+ 0, 0, 0)) {
+ perror("prctl() failed");
+ return EXIT_FAILURE;
+ }
+
+ a = mmap(0, page_sz, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ if (a == MAP_FAILED) {
+ perror("mmap() failed");
+ return EXIT_FAILURE;
+ }
+
+ /*
+ * Enable MTE on the above anonymous mmap. The flag could be passed
+ * directly to mmap() and skip this step.
+ */
+ if (mprotect(a, page_sz, PROT_READ | PROT_WRITE | PROT_MTE)) {
+ perror("mprotect() failed");
+ return EXIT_FAILURE;
+ }
+
+ /* access with the default tag (0) */
+ a[0] = 1;
+ a[1] = 2;
+
+ printf("a[0] = %hhu a[1] = %hhu\n", a[0], a[1]);
+
+ /* set the logical and allocation tags */
+ a = (unsigned char *)insert_random_tag(a);
+ set_tag(a);
+
+ printf("%p\n", a);
+
+ /* non-zero tag access */
+ a[0] = 3;
+ printf("a[0] = %hhu a[1] = %hhu\n", a[0], a[1]);
+
+ /*
+ * If MTE is enabled correctly the next instruction will generate an
+ * exception.
+ */
+ printf("Expecting SIGSEGV...\n");
+ a[16] = 0xdd;
+
+ /* this should not be printed in the PR_MTE_TCF_SYNC mode */
+ printf("...haven't got one\n");
+
+ return EXIT_FAILURE;
+ }
diff --git a/Documentation/devicetree/bindings/perf/arm,cmn.yaml b/Documentation/devicetree/bindings/perf/arm,cmn.yaml
new file mode 100644
index 000000000000..e4fcc0de25e2
--- /dev/null
+++ b/Documentation/devicetree/bindings/perf/arm,cmn.yaml
@@ -0,0 +1,57 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+# Copyright 2020 Arm Ltd.
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/perf/arm,cmn.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Arm CMN (Coherent Mesh Network) Performance Monitors
+
+maintainers:
+ - Robin Murphy <robin.murphy@arm.com>
+
+properties:
+ compatible:
+ const: arm,cmn-600
+
+ reg:
+ items:
+ - description: Physical address of the base (PERIPHBASE) and
+ size (up to 64MB) of the configuration address space.
+
+ interrupts:
+ minItems: 1
+ maxItems: 4
+ items:
+ - description: Overflow interrupt for DTC0
+ - description: Overflow interrupt for DTC1
+ - description: Overflow interrupt for DTC2
+ - description: Overflow interrupt for DTC3
+ description: One interrupt for each DTC domain implemented must
+ be specified, in order. DTC0 is always present.
+
+ arm,root-node:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: Offset from PERIPHBASE of the configuration
+ discovery node (see TRM definition of ROOTNODEBASE).
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - arm,root-node
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/interrupt-controller/irq.h>
+ pmu@50000000 {
+ compatible = "arm,cmn-600";
+ reg = <0x50000000 0x4000000>;
+ /* 4x2 mesh with one DTC, and CFG node at 0,1,1,0 */
+ interrupts = <GIC_SPI 46 IRQ_TYPE_LEVEL_HIGH>;
+ arm,root-node = <0x104000>;
+ };
+...
diff --git a/Documentation/virt/kvm/arm/hyp-abi.rst b/Documentation/virt/kvm/arm/hyp-abi.rst
index d9eba93aa364..83cadd8186fa 100644
--- a/Documentation/virt/kvm/arm/hyp-abi.rst
+++ b/Documentation/virt/kvm/arm/hyp-abi.rst
@@ -54,9 +54,9 @@ these functions (see arch/arm{,64}/include/asm/virt.h):
x3 = x1's value when entering the next payload (arm64)
x4 = x2's value when entering the next payload (arm64)
- Mask all exceptions, disable the MMU, move the arguments into place
- (arm64 only), and jump to the restart address while at HYP/EL2. This
- hypercall is not expected to return to its caller.
+ Mask all exceptions, disable the MMU, clear I+D bits, move the arguments
+ into place (arm64 only), and jump to the restart address while at HYP/EL2.
+ This hypercall is not expected to return to its caller.
Any other value of r0/x0 triggers a hypervisor-specific handling,
which is not documented here.