diff options
Diffstat (limited to 'Documentation')
72 files changed, 2058 insertions, 509 deletions
diff --git a/Documentation/ABI/obsolete/sysfs-class-dax b/Documentation/ABI/obsolete/sysfs-class-dax new file mode 100644 index 000000000000..2cb9fc5e8bd1 --- /dev/null +++ b/Documentation/ABI/obsolete/sysfs-class-dax @@ -0,0 +1,22 @@ +What: /sys/class/dax/ +Date: May, 2016 +KernelVersion: v4.7 +Contact: linux-nvdimm@lists.01.org +Description: Device DAX is the device-centric analogue of Filesystem + DAX (CONFIG_FS_DAX). It allows memory ranges to be + allocated and mapped without need of an intervening file + system. Device DAX is strict, precise and predictable. + Specifically this interface: + + 1/ Guarantees fault granularity with respect to a given + page size (pte, pmd, or pud) set at configuration time. + + 2/ Enforces deterministic behavior by being strict about + what fault scenarios are supported. + + The /sys/class/dax/ interface enumerates all the + device-dax instances in the system. The ABI is + deprecated and will be removed after 2020. It is + replaced with the DAX bus interface /sys/bus/dax/ where + device-dax instances can be found under + /sys/bus/dax/devices/ diff --git a/Documentation/ABI/testing/debugfs-wilco-ec b/Documentation/ABI/testing/debugfs-wilco-ec new file mode 100644 index 000000000000..f814f112e213 --- /dev/null +++ b/Documentation/ABI/testing/debugfs-wilco-ec @@ -0,0 +1,23 @@ +What: /sys/kernel/debug/wilco_ec/raw +Date: January 2019 +KernelVersion: 5.1 +Description: + Write and read raw mailbox commands to the EC. + + For writing: + Bytes 0-1 indicate the message type: + 00 F0 = Execute Legacy Command + 00 F2 = Read/Write NVRAM Property + Byte 2 provides the command code + Bytes 3+ consist of the data passed in the request + + At least three bytes are required, for the msg type and command, + with additional bytes optional for additional data. + + Example: + // Request EC info type 3 (EC firmware build date) + $ echo 00 f0 38 00 03 00 > raw + // View the result. The decoded ASCII result "12/21/18" is + // included after the raw hex. + $ cat raw + 00 31 32 2f 32 31 2f 31 38 00 38 00 01 00 2f 00 .12/21/18.8... diff --git a/Documentation/ABI/testing/sysfs-class-watchdog b/Documentation/ABI/testing/sysfs-class-watchdog index 736046b33040..6317ade5ad19 100644 --- a/Documentation/ABI/testing/sysfs-class-watchdog +++ b/Documentation/ABI/testing/sysfs-class-watchdog @@ -49,3 +49,26 @@ Contact: Wim Van Sebroeck <wim@iguana.be> Description: It is a read only file. It is read to know about current value of timeout programmed. + +What: /sys/class/watchdog/watchdogn/pretimeout +Date: December 2016 +Contact: Wim Van Sebroeck <wim@iguana.be> +Description: + It is a read only file. It specifies the time in seconds before + timeout when the pretimeout interrupt is delivered. Pretimeout + is an optional feature. + +What: /sys/class/watchdog/watchdogn/pretimeout_avaialable_governors +Date: February 2017 +Contact: Wim Van Sebroeck <wim@iguana.be> +Description: + It is a read only file. It shows the pretimeout governors + available for this watchdog. + +What: /sys/class/watchdog/watchdogn/pretimeout_governor +Date: February 2017 +Contact: Wim Van Sebroeck <wim@iguana.be> +Description: + It is a read/write file. When read, the currently assigned + pretimeout governor is returned. When written, it sets + the pretimeout governor. diff --git a/Documentation/ABI/testing/sysfs-fs-ext4 b/Documentation/ABI/testing/sysfs-fs-ext4 index c631253cf85c..78604db56279 100644 --- a/Documentation/ABI/testing/sysfs-fs-ext4 +++ b/Documentation/ABI/testing/sysfs-fs-ext4 @@ -109,3 +109,10 @@ Description: write operation (since a 4k random write might turn into a much larger write due to the zeroout operation). + +What: /sys/fs/ext4/<disk>/journal_task +Date: February 2019 +Contact: "Theodore Ts'o" <tytso@mit.edu> +Description: + This file is read-only and shows the pid of journal thread in + current pid-namespace or 0 if task is unreachable. diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index a7ce33199457..91822ce25831 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -86,6 +86,13 @@ Description: The unit size is one block, now only support configuring in range of [1, 512]. +What: /sys/fs/f2fs/<disk>/umount_discard_timeout +Date: January 2019 +Contact: "Jaegeuk Kim" <jaegeuk@kernel.org> +Description: + Set timeout to issue discard commands during umount. + Default: 5 secs + What: /sys/fs/f2fs/<disk>/max_victim_search Date: January 2014 Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> diff --git a/Documentation/acpi/aml-debugger.txt b/Documentation/acpi/aml-debugger.txt index e851cc5de63f..75ebeb64ab29 100644 --- a/Documentation/acpi/aml-debugger.txt +++ b/Documentation/acpi/aml-debugger.txt @@ -23,7 +23,7 @@ kernel. The resultant userspace tool binary is then located at: - tools/acpi/power/acpi/acpidbg/acpidbg + tools/power/acpi/acpidbg It can be installed to system directories by running "make install" (as a sufficiently privileged user). @@ -35,7 +35,7 @@ kernel. # mount -t debugfs none /sys/kernel/debug # modprobe acpi_dbg - # tools/acpi/power/acpi/acpidbg/acpidbg + # tools/power/acpi/acpidbg That spawns the interactive AML debugger environment where you can execute debugger commands. diff --git a/Documentation/admin-guide/md.rst b/Documentation/admin-guide/md.rst index 84de718f24a4..3c51084ffd37 100644 --- a/Documentation/admin-guide/md.rst +++ b/Documentation/admin-guide/md.rst @@ -756,3 +756,6 @@ These currently include: The cache mode for raid5. raid5 could include an extra disk for caching. The mode can be "write-throuth" and "write-back". The default is "write-through". + + ppl_write_hint + NVMe stream ID to be set for each PPL write request. diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt index 525452726d31..b9e060c5b61e 100644 --- a/Documentation/arm/kernel_mode_neon.txt +++ b/Documentation/arm/kernel_mode_neon.txt @@ -6,7 +6,7 @@ TL;DR summary * Use only NEON instructions, or VFP instructions that don't rely on support code * Isolate your NEON code in a separate compilation unit, and compile it with - '-mfpu=neon -mfloat-abi=softfp' + '-march=armv7-a -mfpu=neon -mfloat-abi=softfp' * Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your NEON code * Don't sleep in your NEON code, and be aware that it will be executed with @@ -87,7 +87,7 @@ instructions appearing in unexpected places if no special care is taken. Therefore, the recommended and only supported way of using NEON/VFP in the kernel is by adhering to the following rules: * isolate the NEON code in a separate compilation unit and compile it with - '-mfpu=neon -mfloat-abi=softfp'; + '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'; * issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls into the unit containing the NEON code from a compilation unit which is *not* built with the GCC flag '-mfpu=neon' set. diff --git a/Documentation/core-api/flexible-arrays.rst b/Documentation/core-api/flexible-arrays.rst deleted file mode 100644 index b6b85a1b518e..000000000000 --- a/Documentation/core-api/flexible-arrays.rst +++ /dev/null @@ -1,130 +0,0 @@ - -=================================== -Using flexible arrays in the kernel -=================================== - -Large contiguous memory allocations can be unreliable in the Linux kernel. -Kernel programmers will sometimes respond to this problem by allocating -pages with :c:func:`vmalloc()`. This solution not ideal, though. On 32-bit -systems, memory from vmalloc() must be mapped into a relatively small address -space; it's easy to run out. On SMP systems, the page table changes required -by vmalloc() allocations can require expensive cross-processor interrupts on -all CPUs. And, on all systems, use of space in the vmalloc() range increases -pressure on the translation lookaside buffer (TLB), reducing the performance -of the system. - -In many cases, the need for memory from vmalloc() can be eliminated by piecing -together an array from smaller parts; the flexible array library exists to make -this task easier. - -A flexible array holds an arbitrary (within limits) number of fixed-sized -objects, accessed via an integer index. Sparse arrays are handled -reasonably well. Only single-page allocations are made, so memory -allocation failures should be relatively rare. The down sides are that the -arrays cannot be indexed directly, individual object size cannot exceed the -system page size, and putting data into a flexible array requires a copy -operation. It's also worth noting that flexible arrays do no internal -locking at all; if concurrent access to an array is possible, then the -caller must arrange for appropriate mutual exclusion. - -The creation of a flexible array is done with :c:func:`flex_array_alloc()`:: - - #include <linux/flex_array.h> - - struct flex_array *flex_array_alloc(int element_size, - unsigned int total, - gfp_t flags); - -The individual object size is provided by ``element_size``, while total is the -maximum number of objects which can be stored in the array. The flags -argument is passed directly to the internal memory allocation calls. With -the current code, using flags to ask for high memory is likely to lead to -notably unpleasant side effects. - -It is also possible to define flexible arrays at compile time with:: - - DEFINE_FLEX_ARRAY(name, element_size, total); - -This macro will result in a definition of an array with the given name; the -element size and total will be checked for validity at compile time. - -Storing data into a flexible array is accomplished with a call to -:c:func:`flex_array_put()`:: - - int flex_array_put(struct flex_array *array, unsigned int element_nr, - void *src, gfp_t flags); - -This call will copy the data from src into the array, in the position -indicated by ``element_nr`` (which must be less than the maximum specified when -the array was created). If any memory allocations must be performed, flags -will be used. The return value is zero on success, a negative error code -otherwise. - -There might possibly be a need to store data into a flexible array while -running in some sort of atomic context; in this situation, sleeping in the -memory allocator would be a bad thing. That can be avoided by using -``GFP_ATOMIC`` for the flags value, but, often, there is a better way. The -trick is to ensure that any needed memory allocations are done before -entering atomic context, using :c:func:`flex_array_prealloc()`:: - - int flex_array_prealloc(struct flex_array *array, unsigned int start, - unsigned int nr_elements, gfp_t flags); - -This function will ensure that memory for the elements indexed in the range -defined by ``start`` and ``nr_elements`` has been allocated. Thereafter, a -``flex_array_put()`` call on an element in that range is guaranteed not to -block. - -Getting data back out of the array is done with :c:func:`flex_array_get()`:: - - void *flex_array_get(struct flex_array *fa, unsigned int element_nr); - -The return value is a pointer to the data element, or NULL if that -particular element has never been allocated. - -Note that it is possible to get back a valid pointer for an element which -has never been stored in the array. Memory for array elements is allocated -one page at a time; a single allocation could provide memory for several -adjacent elements. Flexible array elements are normally initialized to the -value ``FLEX_ARRAY_FREE`` (defined as 0x6c in <linux/poison.h>), so errors -involving that number probably result from use of unstored array entries. -Note that, if array elements are allocated with ``__GFP_ZERO``, they will be -initialized to zero and this poisoning will not happen. - -Individual elements in the array can be cleared with -:c:func:`flex_array_clear()`:: - - int flex_array_clear(struct flex_array *array, unsigned int element_nr); - -This function will set the given element to ``FLEX_ARRAY_FREE`` and return -zero. If storage for the indicated element is not allocated for the array, -``flex_array_clear()`` will return ``-EINVAL`` instead. Note that clearing an -element does not release the storage associated with it; to reduce the -allocated size of an array, call :c:func:`flex_array_shrink()`:: - - int flex_array_shrink(struct flex_array *array); - -The return value will be the number of pages of memory actually freed. -This function works by scanning the array for pages containing nothing but -``FLEX_ARRAY_FREE`` bytes, so (1) it can be expensive, and (2) it will not work -if the array's pages are allocated with ``__GFP_ZERO``. - -It is possible to remove all elements of an array with a call to -:c:func:`flex_array_free_parts()`:: - - void flex_array_free_parts(struct flex_array *array); - -This call frees all elements, but leaves the array itself in place. -Freeing the entire array is done with :c:func:`flex_array_free()`:: - - void flex_array_free(struct flex_array *array); - -As of this writing, there are no users of flexible arrays in the mainline -kernel. The functions described here are also not exported to modules; -that will probably be fixed when somebody comes up with a need for it. - - -Flexible array functions ------------------------- - -.. kernel-doc:: include/linux/flex_array.h diff --git a/Documentation/core-api/generic-radix-tree.rst b/Documentation/core-api/generic-radix-tree.rst new file mode 100644 index 000000000000..ed42839ae42f --- /dev/null +++ b/Documentation/core-api/generic-radix-tree.rst @@ -0,0 +1,12 @@ +================================= +Generic radix trees/sparse arrays +================================= + +.. kernel-doc:: include/linux/generic-radix-tree.h + :doc: Generic radix trees/sparse arrays + +generic radix tree functions +---------------------------- + +.. kernel-doc:: include/linux/generic-radix-tree.h + :functions: diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst index 3adee82be311..6870baffef82 100644 --- a/Documentation/core-api/index.rst +++ b/Documentation/core-api/index.rst @@ -28,6 +28,7 @@ Core utilities errseq printk-formats circular-buffers + generic-radix-tree memory-allocation mm-api gfp_mask-from-fs-io diff --git a/Documentation/core-api/xarray.rst b/Documentation/core-api/xarray.rst index 5d54b27c6eba..ef6f9f98f595 100644 --- a/Documentation/core-api/xarray.rst +++ b/Documentation/core-api/xarray.rst @@ -85,7 +85,7 @@ which was at that index; if it returns the same entry which was passed as If you want to only store a new entry to an index if the current entry at that index is ``NULL``, you can use :c:func:`xa_insert` which -returns ``-EEXIST`` if the entry is not empty. +returns ``-EBUSY`` if the entry is not empty. You can enquire whether a mark is set on an entry by using :c:func:`xa_get_mark`. If the entry is not ``NULL``, you can set a mark @@ -131,17 +131,23 @@ If you use :c:func:`DEFINE_XARRAY_ALLOC` to define the XArray, or initialise it by passing ``XA_FLAGS_ALLOC`` to :c:func:`xa_init_flags`, the XArray changes to track whether entries are in use or not. -You can call :c:func:`xa_alloc` to store the entry at any unused index +You can call :c:func:`xa_alloc` to store the entry at an unused index in the XArray. If you need to modify the array from interrupt context, you can use :c:func:`xa_alloc_bh` or :c:func:`xa_alloc_irq` to disable interrupts while allocating the ID. -Using :c:func:`xa_store`, :c:func:`xa_cmpxchg` or :c:func:`xa_insert` -will mark the entry as being allocated. Unlike a normal XArray, storing +Using :c:func:`xa_store`, :c:func:`xa_cmpxchg` or :c:func:`xa_insert` will +also mark the entry as being allocated. Unlike a normal XArray, storing ``NULL`` will mark the entry as being in use, like :c:func:`xa_reserve`. To free an entry, use :c:func:`xa_erase` (or :c:func:`xa_release` if you only want to free the entry if it's ``NULL``). +By default, the lowest free entry is allocated starting from 0. If you +want to allocate entries starting at 1, it is more efficient to use +:c:func:`DEFINE_XARRAY_ALLOC1` or ``XA_FLAGS_ALLOC1``. If you want to +allocate IDs up to a maximum, then wrap back around to the lowest free +ID, you can use :c:func:`xa_alloc_cyclic`. + You cannot use ``XA_MARK_0`` with an allocating XArray as this mark is used to track whether an entry is free or not. The other marks are available for your use. @@ -209,7 +215,6 @@ Assumes xa_lock held on entry: * :c:func:`__xa_erase` * :c:func:`__xa_cmpxchg` * :c:func:`__xa_alloc` - * :c:func:`__xa_reserve` * :c:func:`__xa_set_mark` * :c:func:`__xa_clear_mark` diff --git a/Documentation/devicetree/bindings/Makefile b/Documentation/devicetree/bindings/Makefile index 50daa0b3b032..63b139f9ae28 100644 --- a/Documentation/devicetree/bindings/Makefile +++ b/Documentation/devicetree/bindings/Makefile @@ -15,7 +15,7 @@ DT_TMP_SCHEMA := processed-schema.yaml extra-y += $(DT_TMP_SCHEMA) quiet_cmd_mk_schema = SCHEMA $@ - cmd_mk_schema = $(DT_MK_SCHEMA) $(DT_MK_SCHEMA_FLAGS) -o $@ $(filter-out FORCE, $^) + cmd_mk_schema = $(DT_MK_SCHEMA) $(DT_MK_SCHEMA_FLAGS) -o $@ $(real-prereqs) DT_DOCS = $(shell \ cd $(srctree)/$(src) && \ diff --git a/Documentation/devicetree/bindings/clock/actions,owl-cmu.txt b/Documentation/devicetree/bindings/clock/actions,owl-cmu.txt index 2ef86ae96df8..d19885b7c73f 100644 --- a/Documentation/devicetree/bindings/clock/actions,owl-cmu.txt +++ b/Documentation/devicetree/bindings/clock/actions,owl-cmu.txt @@ -2,13 +2,14 @@ The Actions Semi Owl Clock Management Unit generates and supplies clock to various controllers within the SoC. The clock binding described here is -applicable to S900 and S700 SoC's. +applicable to S900, S700 and S500 SoC's. Required Properties: - compatible: should be one of the following, "actions,s900-cmu" "actions,s700-cmu" + "actions,s500-cmu" - reg: physical base address of the controller and length of memory mapped region. - clocks: Reference to the parent clocks ("hosc", "losc") @@ -19,8 +20,8 @@ Each clock is assigned an identifier, and client nodes can use this identifier to specify the clock which they consume. All available clocks are defined as preprocessor macros in corresponding -dt-bindings/clock/actions,s900-cmu.h or actions,s700-cmu.h header and can be -used in device tree sources. +dt-bindings/clock/actions,s900-cmu.h or actions,s700-cmu.h or +actions,s500-cmu.h header and can be used in device tree sources. External clocks: diff --git a/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt b/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt index 79511d7bb321..c41f0be5d438 100644 --- a/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt +++ b/Documentation/devicetree/bindings/clock/amlogic,gxbb-aoclkc.txt @@ -10,6 +10,7 @@ Required Properties: - GXL (S905X, S905D) : "amlogic,meson-gxl-aoclkc" - GXM (S912) : "amlogic,meson-gxm-aoclkc" - AXG (A113D, A113X) : "amlogic,meson-axg-aoclkc" + - G12A (S905X2, S905D2, S905Y2) : "amlogic,meson-g12a-aoclkc" followed by the common "amlogic,meson-gx-aoclkc" - clocks: list of clock phandle, one for each entry clock-names. - clock-names: should contain the following: diff --git a/Documentation/devicetree/bindings/clock/amlogic,gxbb-clkc.txt b/Documentation/devicetree/bindings/clock/amlogic,gxbb-clkc.txt index a6871953bf04..5c8b105be4d6 100644 --- a/Documentation/devicetree/bindings/clock/amlogic,gxbb-clkc.txt +++ b/Documentation/devicetree/bindings/clock/amlogic,gxbb-clkc.txt @@ -9,6 +9,7 @@ Required Properties: "amlogic,gxbb-clkc" for GXBB SoC, "amlogic,gxl-clkc" for GXL and GXM SoC, "amlogic,axg-clkc" for AXG SoC. + "amlogic,g12a-clkc" for G12A SoC. - clocks : list of clock phandle, one for each entry clock-names. - clock-names : should contain the following: * "xtal": the platform xtal diff --git a/Documentation/devicetree/bindings/clock/exynos5433-clock.txt b/Documentation/devicetree/bindings/clock/exynos5433-clock.txt index 50d5897c9849..183c327a7d6b 100644 --- a/Documentation/devicetree/bindings/clock/exynos5433-clock.txt +++ b/Documentation/devicetree/bindings/clock/exynos5433-clock.txt @@ -50,6 +50,8 @@ Required Properties: IPs. - "samsung,exynos5433-cmu-cam1" - clock controller compatible for CMU_CAM1 which generates clocks for Cortex-A5/MIPI_CSIS2/FIMC-LITE_C/FIMC-FD IPs. + - "samsung,exynos5433-cmu-imem" - clock controller compatible for CMU_IMEM + which generates clocks for SSS (Security SubSystem) and SlimSSS IPs. - reg: physical base address of the controller and length of memory mapped region. @@ -168,6 +170,12 @@ Required Properties: - aclk_cam1_400 - aclk_cam1_552 + Input clocks for imem clock controller: + - oscclk + - aclk_imem_sssx_266 + - aclk_imem_266 + - aclk_imem_200 + Optional properties: - power-domains: a phandle to respective power domain node as described by generic PM domain bindings (see power/power_domain.txt for more @@ -469,6 +477,21 @@ Example 2: Examples of clock controller nodes are listed below. power-domains = <&pd_cam1>; }; + cmu_imem: clock-controller@11060000 { + compatible = "samsung,exynos5433-cmu-imem"; + reg = <0x11060000 0x1000>; + #clock-cells = <1>; + + clock-names = "oscclk", + "aclk_imem_sssx_266", + "aclk_imem_266", + "aclk_imem_200"; + clocks = <&xxti>, + <&cmu_top CLK_DIV_ACLK_IMEM_SSSX_266>, + <&cmu_top CLK_DIV_ACLK_IMEM_266>, + <&cmu_top CLK_DIV_ACLK_IMEM_200>; + }; + Example 3: UART controller node that consumes the clock generated by the clock controller. diff --git a/Documentation/devicetree/bindings/clock/fixed-clock.txt b/Documentation/devicetree/bindings/clock/fixed-clock.txt deleted file mode 100644 index 0641a663ad69..000000000000 --- a/Documentation/devicetree/bindings/clock/fixed-clock.txt +++ /dev/null @@ -1,23 +0,0 @@ -Binding for simple fixed-rate clock sources. - -This binding uses the common clock binding[1]. - -[1] Documentation/devicetree/bindings/clock/clock-bindings.txt - -Required properties: -- compatible : shall be "fixed-clock". -- #clock-cells : from common clock binding; shall be set to 0. -- clock-frequency : frequency of clock in Hz. Should be a single cell. - -Optional properties: -- clock-accuracy : accuracy of clock in ppb (parts per billion). - Should be a single cell. -- clock-output-names : From common clock binding. - -Example: - clock { - compatible = "fixed-clock"; - #clock-cells = <0>; - clock-frequency = <1000000000>; - clock-accuracy = <100>; - }; diff --git a/Documentation/devicetree/bindings/clock/fixed-clock.yaml b/Documentation/devicetree/bindings/clock/fixed-clock.yaml new file mode 100644 index 000000000000..b657ecd0ef1c --- /dev/null +++ b/Documentation/devicetree/bindings/clock/fixed-clock.yaml @@ -0,0 +1,44 @@ +# SPDX-License-Identifier: GPL-2.0 +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/clock/fixed-clock.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Binding for simple fixed-rate clock sources + +maintainers: + - Michael Turquette <mturquette@baylibre.com> + - Stephen Boyd <sboyd@kernel.org> + +properties: + compatible: + const: fixed-clock + + "#clock-cells": + const: 0 + + clock-frequency: true + + clock-accuracy: + description: accuracy of clock in ppb (parts per billion). + $ref: /schemas/types.yaml#/definitions/uint32 + + clock-output-names: + maxItems: 1 + +required: + - compatible + - "#clock-cells" + - clock-frequency + +additionalProperties: false + +examples: + - | + clock { + compatible = "fixed-clock"; + #clock-cells = <0>; + clock-frequency = <1000000000>; + clock-accuracy = <100>; + }; +... diff --git a/Documentation/devicetree/bindings/clock/fixed-factor-clock.txt b/Documentation/devicetree/bindings/clock/fixed-factor-clock.txt deleted file mode 100644 index 189467a7188a..000000000000 --- a/Documentation/devicetree/bindings/clock/fixed-factor-clock.txt +++ /dev/null @@ -1,28 +0,0 @@ -Binding for simple fixed factor rate clock sources. - -This binding uses the common clock binding[1]. - -[1] Documentation/devicetree/bindings/clock/clock-bindings.txt - -Required properties: -- compatible : shall be "fixed-factor-clock". -- #clock-cells : from common clock binding; shall be set to 0. -- clock-div: fixed divider. -- clock-mult: fixed multiplier. -- clocks: parent clock. - -Optional properties: -- clock-output-names : From common clock binding. - -Some clocks that require special treatments are also handled by that -driver, with the compatibles: - - allwinner,sun4i-a10-pll3-2x-clk - -Example: - clock { - compatible = "fixed-factor-clock"; - clocks = <&parentclk>; - #clock-cells = <0>; - clock-div = <2>; - clock-mult = <1>; - }; diff --git a/Documentation/devicetree/bindings/clock/fixed-factor-clock.yaml b/Documentation/devicetree/bindings/clock/fixed-factor-clock.yaml new file mode 100644 index 000000000000..b567f8092f8c --- /dev/null +++ b/Documentation/devicetree/bindings/clock/fixed-factor-clock.yaml @@ -0,0 +1,56 @@ +# SPDX-License-Identifier: GPL-2.0 +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/clock/fixed-factor-clock.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Binding for simple fixed factor rate clock sources + +maintainers: + - Michael Turquette <mturquette@baylibre.com> + - Stephen Boyd <sboyd@kernel.org> + +properties: + compatible: + enum: + - allwinner,sun4i-a10-pll3-2x-clk + - fixed-factor-clock + + "#clock-cells": + const: 0 + + clocks: + maxItems: 1 + + clock-div: + description: Fixed divider + allOf: + - $ref: /schemas/types.yaml#/definitions/uint32 + - minimum: 1 + + clock-mult: + description: Fixed multiplier + $ref: /schemas/types.yaml#/definitions/uint32 + + clock-output-names: + maxItems: 1 + +required: + - compatible + - clocks + - "#clock-cells" + - clock-div + - clock-mult + +additionalProperties: false + +examples: + - | + clock { + compatible = "fixed-factor-clock"; + clocks = <&parentclk>; + #clock-cells = <0>; + clock-div = <2>; + clock-mult = <1>; + }; +... diff --git a/Documentation/devicetree/bindings/clock/fixed-mmio-clock.txt b/Documentation/devicetree/bindings/clock/fixed-mmio-clock.txt new file mode 100644 index 000000000000..c359367fd1a9 --- /dev/null +++ b/Documentation/devicetree/bindings/clock/fixed-mmio-clock.txt @@ -0,0 +1,24 @@ +Binding for simple memory mapped io fixed-rate clock sources. +The driver reads a clock frequency value from a single 32-bit memory mapped +I/O register and registers it as a fixed rate clock. + +It was designed for test systems, like FPGA, not for complete, finished SoCs. + +This binding uses the common clock binding[1]. + +[1] Documentation/devicetree/bindings/clock/clock-bindings.txt + +Required properties: +- compatible : shall be "fixed-mmio-clock". +- #clock-cells : from common clock binding; shall be set to 0. +- reg : Address and length of the clock value register set. + +Optional properties: +- clock-output-names : From common clock binding. + +Example: +sysclock: sysclock@fd020004 { + #clock-cells = <0>; + compatible = "fixed-mmio-clock"; + reg = <0xfd020004 0x4>; +}; diff --git a/Documentation/devicetree/bindings/clock/imx8mm-clock.txt b/Documentation/devicetree/bindings/clock/imx8mm-clock.txt new file mode 100644 index 000000000000..8e4ab9e619a1 --- /dev/null +++ b/Documentation/devicetree/bindings/clock/imx8mm-clock.txt @@ -0,0 +1,29 @@ +* Clock bindings for NXP i.MX8M Mini + +Required properties: +- compatible: Should be "fsl,imx8mm-ccm" +- reg: Address and length of the register set +- #clock-cells: Should be <1> +- clocks: list of clock specifiers, must contain an entry for each required + entry in clock-names +- clock-names: should include the following entries: + - "osc_32k" + - "osc_24m" + - "clk_ext1" + - "clk_ext2" + - "clk_ext3" + - "clk_ext4" + +clk: clock-controller@30380000 { + compatible = "fsl,imx8mm-ccm"; + reg = <0x0 0x30380000 0x0 0x10000>; + #clock-cells = <1>; + clocks = <&osc_32k>, <&osc_24m>, <&clk_ext1>, <&clk_ext2>, + <&clk_ext3>, <&clk_ext4>; + clock-names = "osc_32k", "osc_24m", "clk_ext1", "clk_ext2", + "clk_ext3", "clk_ext4"; +}; + +The clock consumer should specify the desired clock by having the clock +ID in its "clocks" phandle cell. See include/dt-bindings/clock/imx8mm-clock.h +for the full list of i.MX8M Mini clock IDs. diff --git a/Documentation/devicetree/bindings/clock/qcom,rpmcc.txt b/Documentation/devicetree/bindings/clock/qcom,rpmcc.txt index 87b4949e9bc8..944719bd586f 100644 --- a/Documentation/devicetree/bindings/clock/qcom,rpmcc.txt +++ b/Documentation/devicetree/bindings/clock/qcom,rpmcc.txt @@ -16,6 +16,7 @@ Required properties : "qcom,rpmcc-msm8974", "qcom,rpmcc" "qcom,rpmcc-apq8064", "qcom,rpmcc" "qcom,rpmcc-msm8996", "qcom,rpmcc" + "qcom,rpmcc-msm8998", "qcom,rpmcc" "qcom,rpmcc-qcs404", "qcom,rpmcc" - #clock-cells : shall contain 1 diff --git a/Documentation/devicetree/bindings/display/ssd1307fb.txt b/Documentation/devicetree/bindings/display/ssd1307fb.txt index 209d931ef16c..b67f8caa212c 100644 --- a/Documentation/devicetree/bindings/display/ssd1307fb.txt +++ b/Documentation/devicetree/bindings/display/ssd1307fb.txt @@ -36,7 +36,6 @@ ssd1307: oled@3c { reg = <0x3c>; pwms = <&pwm 4 3000>; reset-gpios = <&gpio2 7>; - reset-active-low; }; ssd1306: oled@3c { @@ -44,7 +43,6 @@ ssd1306: oled@3c { reg = <0x3c>; pwms = <&pwm 4 3000>; reset-gpios = <&gpio2 7>; - reset-active-low; solomon,com-lrremap; solomon,com-invdir; solomon,com-offset = <32>; diff --git a/Documentation/devicetree/bindings/dma/dma.txt b/Documentation/devicetree/bindings/dma/dma.txt index 6312fb00ce8d..eeb4e4d1771e 100644 --- a/Documentation/devicetree/bindings/dma/dma.txt +++ b/Documentation/devicetree/bindings/dma/dma.txt @@ -16,6 +16,9 @@ Optional properties: - dma-channels: Number of DMA channels supported by the controller. - dma-requests: Number of DMA request signals supported by the controller. +- dma-channel-mask: Bitmask of available DMA channels in ascending order + that are not reserved by firmware and are available to + the kernel. i.e. first channel corresponds to LSB. Example: @@ -29,6 +32,7 @@ Example: #dma-cells = <1>; dma-channels = <32>; dma-requests = <127>; + dma-channel-mask = <0xfffe> }; * DMA router diff --git a/Documentation/devicetree/bindings/dma/fsl-qdma.txt b/Documentation/devicetree/bindings/dma/fsl-qdma.txt new file mode 100644 index 000000000000..6a0ff9059e72 --- /dev/null +++ b/Documentation/devicetree/bindings/dma/fsl-qdma.txt @@ -0,0 +1,57 @@ +NXP Layerscape SoC qDMA Controller +================================== + +This device follows the generic DMA bindings defined in dma/dma.txt. + +Required properties: + +- compatible: Must be one of + "fsl,ls1021a-qdma": for LS1021A Board + "fsl,ls1043a-qdma": for ls1043A Board + "fsl,ls1046a-qdma": for ls1046A Board +- reg: Should contain the register's base address and length. +- interrupts: Should contain a reference to the interrupt used by this + device. +- interrupt-names: Should contain interrupt names: + "qdma-queue0": the block0 interrupt + "qdma-queue1": the block1 interrupt + "qdma-queue2": the block2 interrupt + "qdma-queue3": the block3 interrupt + "qdma-error": the error interrupt +- fsl,dma-queues: Should contain number of queues supported. +- dma-channels: Number of DMA channels supported +- block-number: the virtual block number +- block-offset: the offset of different virtual block +- status-sizes: status queue size of per virtual block +- queue-sizes: command queue size of per virtual block, the size number + based on queues + +Optional properties: + +- dma-channels: Number of DMA channels supported by the controller. +- big-endian: If present registers and hardware scatter/gather descriptors + of the qDMA are implemented in big endian mode, otherwise in little + mode. + +Examples: + + qdma: dma-controller@8390000 { + compatible = "fsl,ls1021a-qdma"; + reg = <0x0 0x8388000 0x0 0x1000>, /* Controller regs */ + <0x0 0x8389000 0x0 0x1000>, /* Status regs */ + <0x0 0x838a000 0x0 0x2000>; /* Block regs */ + interrupts = <GIC_SPI 185 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 76 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>; + interrupt-names = "qdma-error", + "qdma-queue0", "qdma-queue1"; + dma-channels = <8>; + block-number = <2>; + block-offset = <0x1000>; + fsl,dma-queues = <2>; + status-sizes = <64>; + queue-sizes = <64 64>; + big-endian; + }; + +DMA clients must use the format described in dma/dma.txt file. diff --git a/Documentation/devicetree/bindings/dma/k3dma.txt b/Documentation/devicetree/bindings/dma/k3dma.txt index 4945aeac4dc4..10a2f15b08a3 100644 --- a/Documentation/devicetree/bindings/dma/k3dma.txt +++ b/Documentation/devicetree/bindings/dma/k3dma.txt @@ -3,7 +3,9 @@ See dma.txt first Required properties: -- compatible: Should be "hisilicon,k3-dma-1.0" +- compatible: Must be one of +- "hisilicon,k3-dma-1.0" +- "hisilicon,hisi-pcm-asp-dma-1.0" - reg: Should contain DMA registers location and length. - interrupts: Should contain one interrupt shared by all channel - #dma-cells: see dma.txt, should be 1, para number diff --git a/Documentation/devicetree/bindings/dma/snps-dma.txt b/Documentation/devicetree/bindings/dma/snps-dma.txt index db757df7057d..0bedceed1963 100644 --- a/Documentation/devicetree/bindings/dma/snps-dma.txt +++ b/Documentation/devicetree/bindings/dma/snps-dma.txt @@ -23,8 +23,6 @@ Deprecated properties: Optional properties: -- is_private: The device channels should be marked as private and not for by the - general purpose DMA channel allocator. False if not passed. - multi-block: Multi block transfers supported by hardware. Array property with one cell per channel. 0: not supported, 1 (default): supported. - snps,dma-protection-control: AHB HPROT[3:1] protection setting. diff --git a/Documentation/devicetree/bindings/dma/sprd-dma.txt b/Documentation/devicetree/bindings/dma/sprd-dma.txt index 7a10fea2e51b..adccea9941f1 100644 --- a/Documentation/devicetree/bindings/dma/sprd-dma.txt +++ b/Documentation/devicetree/bindings/dma/sprd-dma.txt @@ -31,7 +31,7 @@ DMA clients connected to the Spreadtrum DMA controller must use the format described in the dma.txt file, using a two-cell specifier for each channel. The two cells in order are: 1. A phandle pointing to the DMA controller. -2. The channel id. +2. The slave id. spi0: spi@70a00000{ ... diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt index 174af2c45e77..93b6d961dd4f 100644 --- a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt @@ -37,10 +37,11 @@ Required properties: Required properties for VDMA: - xlnx,num-fstores: Should be the number of framebuffers as configured in h/w. -Optional properties: -- xlnx,include-sg: Tells configured for Scatter-mode in - the hardware. Optional properties for AXI DMA: +- xlnx,sg-length-width: Should be set to the width in bits of the length + register as configured in h/w. Takes values {8...26}. If the property + is missing or invalid then the default value 23 is used. This is the + maximum value that is supported by all IP versions. - xlnx,mcdma: Tells whether configured for multi-channel mode in the hardware. Optional properties for VDMA: - xlnx,flush-fsync: Tells which channel to Flush on Frame sync. diff --git a/Documentation/devicetree/bindings/input/cypress,tm2-touchkey.txt b/Documentation/devicetree/bindings/input/cypress,tm2-touchkey.txt index 0c252d9306da..ef2ae729718f 100644 --- a/Documentation/devicetree/bindings/input/cypress,tm2-touchkey.txt +++ b/Documentation/devicetree/bindings/input/cypress,tm2-touchkey.txt @@ -1,13 +1,19 @@ Samsung tm2-touchkey Required properties: -- compatible: must be "cypress,tm2-touchkey" +- compatible: + * "cypress,tm2-touchkey" - for the touchkey found on the tm2 board + * "cypress,midas-touchkey" - for the touchkey found on midas boards + * "cypress,aries-touchkey" - for the touchkey found on aries boards - reg: I2C address of the chip. - interrupts: interrupt to which the chip is connected (see interrupt binding[0]). - vcc-supply : internal regulator output. 1.8V - vdd-supply : power supply for IC 3.3V +Optional properties: +- linux,keycodes: array of keycodes (max 4), default KEY_PHONE and KEY_BACK + [0]: Documentation/devicetree/bindings/interrupt-controller/interrupts.txt Example: @@ -21,5 +27,6 @@ Example: interrupts = <2 IRQ_TYPE_EDGE_FALLING>; vcc-supply=<&ldo32_reg>; vdd-supply=<&ldo33_reg>; + linux,keycodes = <KEY_PHONE KEY_BACK>; }; }; diff --git a/Documentation/devicetree/bindings/input/ilitek,ili2xxx.txt b/Documentation/devicetree/bindings/input/ilitek,ili2xxx.txt new file mode 100644 index 000000000000..b2a76301e632 --- /dev/null +++ b/Documentation/devicetree/bindings/input/ilitek,ili2xxx.txt @@ -0,0 +1,25 @@ +Ilitek ILI210x/ILI251x touchscreen controller + +Required properties: +- compatible: + ilitek,ili210x for ILI210x + ilitek,ili251x for ILI251x + +- reg: The I2C address of the device + +- interrupts: The sink for the touchscreen's IRQ output + See ../interrupt-controller/interrupts.txt + +Optional properties for main touchpad device: + +- reset-gpios: GPIO specifier for the touchscreen's reset pin (active low) + +Example: + + touchscreen@41 { + compatible = "ilitek,ili251x"; + reg = <0x41>; + interrupt-parent = <&gpio4>; + interrupts = <7 IRQ_TYPE_EDGE_FALLING>; + reset-gpios = <&gpio5 21 GPIO_ACTIVE_LOW>; + }; diff --git a/Documentation/devicetree/bindings/input/msm-vibrator.txt b/Documentation/devicetree/bindings/input/msm-vibrator.txt new file mode 100644 index 000000000000..8dcf014ef2e5 --- /dev/null +++ b/Documentation/devicetree/bindings/input/msm-vibrator.txt @@ -0,0 +1,36 @@ +* Device tree bindings for the Qualcomm MSM vibrator + +Required properties: + + - compatible: Should be one of + "qcom,msm8226-vibrator" + "qcom,msm8974-vibrator" + - reg: the base address and length of the IO memory for the registers. + - pinctrl-names: set to default. + - pinctrl-0: phandles pointing to pin configuration nodes. See + Documentation/devicetree/bindings/pinctrl/pinctrl-bindings.txt + - clock-names: set to pwm + - clocks: phandle of the clock. See + Documentation/devicetree/bindings/clock/clock-bindings.txt + - enable-gpios: GPIO that enables the vibrator. + +Optional properties: + + - vcc-supply: phandle to the regulator that provides power to the sensor. + +Example from a LG Nexus 5 (hammerhead) phone: + +vibrator@fd8c3450 { + reg = <0xfd8c3450 0x400>; + compatible = "qcom,msm8974-vibrator"; + + vcc-supply = <&pm8941_l19>; + + clocks = <&mmcc CAMSS_GP1_CLK>; + clock-names = "pwm"; + + enable-gpios = <&msmgpio 60 GPIO_ACTIVE_HIGH>; + + pinctrl-names = "default"; + pinctrl-0 = <&vibrator_pin>; +}; diff --git a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt index da2dc5d6c98b..870b8c5cce9b 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt +++ b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt @@ -1,11 +1,12 @@ FocalTech EDT-FT5x06 Polytouch driver ===================================== -There are 3 variants of the chip for various touch panel sizes +There are 5 variants of the chip for various touch panel sizes FT5206GE1 2.8" .. 3.8" FT5306DE4 4.3" .. 7" FT5406EE8 7" .. 8.9" FT5506EEG 7" .. 8.9" +FT5726NEI 5.7” .. 11.6" The software interface is identical for all those chips, so that currently there is no need for the driver to distinguish between the @@ -19,6 +20,7 @@ Required properties: or: "edt,edt-ft5306" or: "edt,edt-ft5406" or: "edt,edt-ft5506" + or: "evervision,ev-ft5726" or: "focaltech,ft6236" - reg: I2C slave address of the chip (0x38) @@ -42,6 +44,15 @@ Optional properties: - offset: allows setting the edge compensation in the range from 0 to 31. + + - offset-x: Same as offset, but applies only to the horizontal position. + Range from 0 to 80, only supported by evervision,ev-ft5726 + devices. + + - offset-y: Same as offset, but applies only to the vertical position. + Range from 0 to 80, only supported by evervision,ev-ft5726 + devices. + - touchscreen-size-x : See touchscreen.txt - touchscreen-size-y : See touchscreen.txt - touchscreen-fuzz-x : See touchscreen.txt diff --git a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt index f7e95c52f3c7..8cf0b4d38a7e 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt +++ b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt @@ -3,6 +3,7 @@ Device tree bindings for Goodix GT9xx series touchscreen controller Required properties: - compatible : Should be "goodix,gt1151" + or "goodix,gt5688" or "goodix,gt911" or "goodix,gt9110" or "goodix,gt912" @@ -18,11 +19,14 @@ Optional properties: - irq-gpios : GPIO pin used for IRQ. The driver uses the interrupt gpio pin as output to reset the device. - reset-gpios : GPIO pin used for reset - - - touchscreen-inverted-x : X axis is inverted (boolean) - - touchscreen-inverted-y : Y axis is inverted (boolean) - - touchscreen-swapped-x-y : X and Y axis are swapped (boolean) - (swapping is done after inverting the axis) + - touchscreen-inverted-x + - touchscreen-inverted-y + - touchscreen-size-x + - touchscreen-size-y + - touchscreen-swapped-x-y + +The touchscreen-* properties are documented in touchscreen.txt in this +directory. Example: diff --git a/Documentation/devicetree/bindings/input/touchscreen/sitronix-st1232.txt b/Documentation/devicetree/bindings/input/touchscreen/sitronix-st1232.txt index 64ad48b824a2..019373253b28 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/sitronix-st1232.txt +++ b/Documentation/devicetree/bindings/input/touchscreen/sitronix-st1232.txt @@ -1,13 +1,17 @@ -* Sitronix st1232 touchscreen controller +* Sitronix st1232 or st1633 touchscreen controller Required properties: -- compatible: must be "sitronix,st1232" +- compatible: must contain one of + * "sitronix,st1232" + * "sitronix,st1633" - reg: I2C address of the chip - interrupts: interrupt to which the chip is connected Optional properties: - gpios: a phandle to the reset GPIO +For additional optional properties see: touchscreen.txt + Example: i2c@00000000 { diff --git a/Documentation/devicetree/bindings/input/touchscreen/sx8654.txt b/Documentation/devicetree/bindings/input/touchscreen/sx8654.txt index 4886c4aa2906..0ebe6dd043c7 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/sx8654.txt +++ b/Documentation/devicetree/bindings/input/touchscreen/sx8654.txt @@ -1,10 +1,17 @@ * Semtech SX8654 I2C Touchscreen Controller Required properties: -- compatible: must be "semtech,sx8654" +- compatible: must be one of the following, depending on the model: + "semtech,sx8650" + "semtech,sx8654" + "semtech,sx8655" + "semtech,sx8656" - reg: i2c slave address - interrupts: touch controller interrupt +Optional properties: + - reset-gpios: GPIO specification for the NRST input + Example: sx8654@48 { @@ -12,4 +19,5 @@ Example: reg = <0x48>; interrupt-parent = <&gpio6>; interrupts = <3 IRQ_TYPE_EDGE_FALLING>; + reset-gpios = <&gpio4 2 GPIO_ACTIVE_LOW>; }; diff --git a/Documentation/devicetree/bindings/mailbox/xlnx,zynqmp-ipi-mailbox.txt b/Documentation/devicetree/bindings/mailbox/xlnx,zynqmp-ipi-mailbox.txt new file mode 100644 index 000000000000..4438432bfe9b --- /dev/null +++ b/Documentation/devicetree/bindings/mailbox/xlnx,zynqmp-ipi-mailbox.txt @@ -0,0 +1,127 @@ +Xilinx IPI Mailbox Controller +======================================== + +The Xilinx IPI(Inter Processor Interrupt) mailbox controller is to manage +messaging between two Xilinx Zynq UltraScale+ MPSoC IPI agents. Each IPI +agent owns registers used for notification and buffers for message. + + +-------------------------------------+ + | Xilinx ZynqMP IPI Controller | + +-------------------------------------+ + +--------------------------------------------------+ +ATF | | + | | + | | + +--------------------------+ | + | | + | | + +--------------------------------------------------+ + +------------------------------------------+ + | +----------------+ +----------------+ | +Hardware | | IPI Agent | | IPI Buffers | | + | | Registers | | | | + | | | | | | + | +----------------+ +----------------+ | + | | + | Xilinx IPI Agent Block | + +------------------------------------------+ + + +Controller Device Node: +=========================== +Required properties: +-------------------- +IPI agent node: +- compatible: Shall be: "xlnx,zynqmp-ipi-mailbox" +- interrupt-parent: Phandle for the interrupt controller +- interrupts: Interrupt information corresponding to the + interrupt-names property. +- xlnx,ipi-id: local Xilinx IPI agent ID +- #address-cells: number of address cells of internal IPI mailbox nodes +- #size-cells: number of size cells of internal IPI mailbox nodes + +Internal IPI mailbox node: +- reg: IPI buffers address ranges +- reg-names: Names of the reg resources. It should have: + * local_request_region + - IPI request msg buffer written by local and read + by remote + * local_response_region + - IPI response msg buffer written by local and read + by remote + * remote_request_region + - IPI request msg buffer written by remote and read + by local + * remote_response_region + - IPI response msg buffer written by remote and read + by local +- #mbox-cells: Shall be 1. It contains: + * tx(0) or rx(1) channel +- xlnx,ipi-id: remote Xilinx IPI agent ID of which the mailbox is + connected to. + +Optional properties: +-------------------- +- method: The method of accessing the IPI agent registers. + Permitted values are: "smc" and "hvc". Default is + "smc". + +Client Device Node: +=========================== +Required properties: +-------------------- +- mboxes: Standard property to specify a mailbox + (See ./mailbox.txt) +- mbox-names: List of identifier strings for each mailbox + channel. + +Example: +=========================== + zynqmp_ipi { + compatible = "xlnx,zynqmp-ipi-mailbox"; + interrupt-parent = <&gic>; + interrupts = <0 29 4>; + xlnx,ipi-id = <0>; + #address-cells = <1>; + #size-cells = <1>; + ranges; + + /* APU<->RPU0 IPI mailbox controller */ + ipi_mailbox_rpu0: mailbox@ff90400 { + reg = <0xff990400 0x20>, + <0xff990420 0x20>, + <0xff990080 0x20>, + <0xff9900a0 0x20>; + reg-names = "local_request_region", + "local_response_region", + "remote_request_region", + "remote_response_region"; + #mbox-cells = <1>; + xlnx,ipi-id = <1>; + }; + /* APU<->RPU1 IPI mailbox controller */ + ipi_mailbox_rpu1: mailbox@ff990440 { + reg = <0xff990440 0x20>, + <0xff990460 0x20>, + <0xff990280 0x20>, + <0xff9902a0 0x20>; + reg-names = "local_request_region", + "local_response_region", + "remote_request_region", + "remote_response_region"; + #mbox-cells = <1>; + xlnx,ipi-id = <2>; + }; + }; + rpu0 { + ... + mboxes = <&ipi_mailbox_rpu0 0>, + <&ipi_mailbox_rpu0 1>; + mbox-names = "tx", "rx"; + }; + rpu1 { + ... + mboxes = <&ipi_mailbox_rpu1 0>, + <&ipi_mailbox_rpu1 1>; + mbox-names = "tx", "rx"; + }; diff --git a/Documentation/devicetree/bindings/net/dsa/dsa.txt b/Documentation/devicetree/bindings/net/dsa/dsa.txt index 35694c0c376b..d66a5292b9d3 100644 --- a/Documentation/devicetree/bindings/net/dsa/dsa.txt +++ b/Documentation/devicetree/bindings/net/dsa/dsa.txt @@ -71,6 +71,10 @@ properties, described in binding documents: Documentation/devicetree/bindings/net/fixed-link.txt for details. +- local-mac-address : See + Documentation/devicetree/bindings/net/ethernet.txt + for details. + Example The following example shows three switches on three MDIO busses, @@ -97,6 +101,7 @@ linked into one DSA cluster. port@1 { reg = <1>; label = "lan1"; + local-mac-address = [00 00 00 00 00 00]; }; port@2 { diff --git a/Documentation/devicetree/bindings/net/stm32-dwmac.txt b/Documentation/devicetree/bindings/net/stm32-dwmac.txt index 1341012722aa..a90eef11dc46 100644 --- a/Documentation/devicetree/bindings/net/stm32-dwmac.txt +++ b/Documentation/devicetree/bindings/net/stm32-dwmac.txt @@ -14,8 +14,7 @@ Required properties: - clock-names: Should be "stmmaceth" for the host clock. Should be "mac-clk-tx" for the MAC TX clock. Should be "mac-clk-rx" for the MAC RX clock. - For MPU family need to add also "ethstp" for power mode clock and, - "syscfg-clk" for SYSCFG clock. + For MPU family need to add also "ethstp" for power mode clock - interrupt-names: Should contain a list of interrupt names corresponding to the interrupts in the interrupts property, if available. Should be "macirq" for the main MAC IRQ @@ -24,9 +23,9 @@ Required properties: encompases the glue register, and the offset of the control register. Optional properties: -- clock-names: For MPU family "mac-clk-ck" for PHY without quartz -- st,int-phyclk (boolean) : valid only where PHY do not have quartz and need to be clock - by RCC +- clock-names: For MPU family "eth-ck" for PHY without quartz +- st,eth-clk-sel (boolean) : set this property in RGMII PHY when you want to select RCC clock instead of ETH_CLK125. +- st,eth-ref-clk-sel (boolean) : set this property in RMII mode when you have PHY without crystal 50MHz and want to select RCC clock instead of ETH_REF_CLK. Example: diff --git a/Documentation/devicetree/bindings/pinctrl/atmel,at91-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/atmel,at91-pinctrl.txt index 3e23fece99da..eb39f5051159 100644 --- a/Documentation/devicetree/bindings/pinctrl/atmel,at91-pinctrl.txt +++ b/Documentation/devicetree/bindings/pinctrl/atmel,at91-pinctrl.txt @@ -19,7 +19,7 @@ such as pull-up, multi drive, etc. Required properties for iomux controller: - compatible: "atmel,at91rm9200-pinctrl" or "atmel,at91sam9x5-pinctrl" - or "atmel,sama5d3-pinctrl" + or "atmel,sama5d3-pinctrl" or "microchip,sam9x60-pinctrl" - atmel,mux-mask: array of mask (periph per bank) to describe if a pin can be configured in this periph mode. All the periph and bank need to be describe. @@ -100,6 +100,7 @@ DRIVE_STRENGTH (3 << 5): indicate the drive strength of the pin using the 11 - High OUTPUT (1 << 7): indicate this pin need to be configured as an output. OUTPUT_VAL (1 << 8): output val (1 = high, 0 = low) +SLEWRATE (1 << 9): slew rate of the pin: 0 = disable, 1 = enable DEBOUNCE (1 << 16): indicate this pin needs debounce. DEBOUNCE_VAL (0x3fff << 17): debounce value. @@ -116,6 +117,19 @@ Some requirements for using atmel,at91rm9200-pinctrl binding: configurations by referring to the phandle of that pin configuration node. 4. The gpio controller must be describe in the pinctrl simple-bus. +For each bank the required properties are: +- compatible: "atmel,at91sam9x5-gpio" or "atmel,at91rm9200-gpio" or + "microchip,sam9x60-gpio" +- reg: physical base address and length of the controller's registers +- interrupts: interrupt outputs from the controller +- interrupt-controller: marks the device node as an interrupt controller +- #interrupt-cells: should be 2; refer to ../interrupt-controller/interrupts.txt + for more details. +- gpio-controller +- #gpio-cells: should be 2; the first cell is the GPIO number and the second + cell specifies GPIO flags as defined in <dt-bindings/gpio/gpio.h>. +- clocks: bank clock + Examples: pinctrl@fffff400 { @@ -125,6 +139,17 @@ pinctrl@fffff400 { compatible = "atmel,at91rm9200-pinctrl", "simple-bus"; reg = <0xfffff400 0x600>; + pioA: gpio@fffff400 { + compatible = "atmel,at91sam9x5-gpio"; + reg = <0xfffff400 0x200>; + interrupts = <2 IRQ_TYPE_LEVEL_HIGH 1>; + #gpio-cells = <2>; + gpio-controller; + interrupt-controller; + #interrupt-cells = <2>; + clocks = <&pmc PMC_TYPE_PERIPHERAL 2>; + }; + atmel,mux-mask = < /* A B */ 0xffffffff 0xffc00c3b /* pioA */ diff --git a/Documentation/devicetree/bindings/pinctrl/fsl,imx50-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/fsl,imx50-pinctrl.txt new file mode 100644 index 000000000000..6da01d619d33 --- /dev/null +++ b/Documentation/devicetree/bindings/pinctrl/fsl,imx50-pinctrl.txt @@ -0,0 +1,32 @@ +* Freescale IMX50 IOMUX Controller + +Please refer to fsl,imx-pinctrl.txt in this directory for common binding part +and usage. + +Required properties: +- compatible: "fsl,imx50-iomuxc" +- fsl,pins: two integers array, represents a group of pins mux and config + setting. The format is fsl,pins = <PIN_FUNC_ID CONFIG>, PIN_FUNC_ID is a + pin working on a specific function, CONFIG is the pad setting value like + pull-up for this pin. Please refer to imx50 datasheet for the valid pad + config settings. + +CONFIG bits definition: +PAD_CTL_HVE (1 << 13) +PAD_CTL_HYS (1 << 8) +PAD_CTL_PKE (1 << 7) +PAD_CTL_PUE (1 << 6) +PAD_CTL_PUS_100K_DOWN (0 << 4) +PAD_CTL_PUS_47K_UP (1 << 4) +PAD_CTL_PUS_100K_UP (2 << 4) +PAD_CTL_PUS_22K_UP (3 << 4) +PAD_CTL_ODE (1 << 3) +PAD_CTL_DSE_LOW (0 << 1) +PAD_CTL_DSE_MED (1 << 1) +PAD_CTL_DSE_HIGH (2 << 1) +PAD_CTL_DSE_MAX (3 << 1) +PAD_CTL_SRE_FAST (1 << 0) +PAD_CTL_SRE_SLOW (0 << 0) + +Refer to imx50-pinfunc.h in device tree source folder for all available +imx50 PIN_FUNC_ID. diff --git a/Documentation/devicetree/bindings/pinctrl/fsl,imx8mm-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/fsl,imx8mm-pinctrl.txt new file mode 100644 index 000000000000..524a16fca666 --- /dev/null +++ b/Documentation/devicetree/bindings/pinctrl/fsl,imx8mm-pinctrl.txt @@ -0,0 +1,36 @@ +* Freescale IMX8MM IOMUX Controller + +Please refer to fsl,imx-pinctrl.txt and pinctrl-bindings.txt in this directory +for common binding part and usage. + +Required properties: +- compatible: "fsl,imx8mm-iomuxc" +- reg: should contain the base physical address and size of the iomuxc + registers. + +Required properties in sub-nodes: +- fsl,pins: each entry consists of 6 integers and represents the mux and config + setting for one pin. The first 5 integers <mux_reg conf_reg input_reg mux_val + input_val> are specified using a PIN_FUNC_ID macro, which can be found in + <dt-bindings/pinctrl/imx8mm-pinfunc.h>. The last integer CONFIG is + the pad setting value like pull-up on this pin. Please refer to i.MX8M Mini + Reference Manual for detailed CONFIG settings. + +Examples: + +&uart1 { + pinctrl-names = "default"; + pinctrl-0 = <&pinctrl_uart1>; +}; + +iomuxc: pinctrl@30330000 { + compatible = "fsl,imx8mm-iomuxc"; + reg = <0x0 0x30330000 0x0 0x10000>; + + pinctrl_uart1: uart1grp { + fsl,pins = < + MX8MM_IOMUXC_UART1_RXD_UART1_DCE_RX 0x140 + MX8MM_IOMUXC_UART1_TXD_UART1_DCE_TX 0x140 + >; + }; +}; diff --git a/Documentation/devicetree/bindings/pinctrl/marvell,armada-37xx-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/marvell,armada-37xx-pinctrl.txt index c7c088d2dd50..38dc56a57760 100644 --- a/Documentation/devicetree/bindings/pinctrl/marvell,armada-37xx-pinctrl.txt +++ b/Documentation/devicetree/bindings/pinctrl/marvell,armada-37xx-pinctrl.txt @@ -58,11 +58,11 @@ group pwm3 - functions pwm, gpio group pmic1 - - pin 17 + - pin 7 - functions pmic, gpio group pmic0 - - pin 16 + - pin 6 - functions pmic, gpio group i2c2 @@ -112,19 +112,31 @@ group usb2_drvvbus1 - functions drvbus, gpio group sdio_sb - - pins 60-64 + - pins 60-65 - functions sdio, gpio group rgmii - - pins 42-55 + - pins 42-53 - functions mii, gpio group pcie1 - - pins 39-40 + - pins 39 + - functions pcie, gpio + +group pcie1_clkreq + - pins 40 - functions pcie, gpio +group pcie1_wakeup + - pins 41 + - functions pcie, gpio + +group smi + - pins 54-55 + - functions smi, gpio + group ptp - - pins 56-58 + - pins 56 - functions ptp, gpio group ptp_clk diff --git a/Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt index 82ead40311f6..a47dd990a8d3 100644 --- a/Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt +++ b/Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt @@ -23,11 +23,11 @@ The GPIO bank for the controller is represented as a sub-node and it acts as a GPIO controller. Required properties for sub-nodes are: - - reg: should contain address and size for mux, pull-enable, pull and - gpio register sets - - reg-names: an array of strings describing the "reg" entries. Must - contain "mux", "pull" and "gpio". "pull-enable" is optional and - when it is missing the "pull" registers are used instead + - reg: should contain a list of address and size, one tuple for each entry + in reg-names. + - reg-names: an array of strings describing the "reg" entries. + Must contain "mux" and "gpio". + May contain "pull", "pull-enable" and "ds" when appropriate. - gpio-controller: identifies the node as a gpio controller - #gpio-cells: must be 2 diff --git a/Documentation/devicetree/bindings/pwm/atmel-pwm.txt b/Documentation/devicetree/bindings/pwm/atmel-pwm.txt index c8c831d7b0d1..591ecdd39c7b 100644 --- a/Documentation/devicetree/bindings/pwm/atmel-pwm.txt +++ b/Documentation/devicetree/bindings/pwm/atmel-pwm.txt @@ -5,6 +5,7 @@ Required properties: - "atmel,at91sam9rl-pwm" - "atmel,sama5d3-pwm" - "atmel,sama5d2-pwm" + - "microchip,sam9x60-pwm" - reg: physical base address and length of the controller's registers - #pwm-cells: Should be 3. See pwm.txt in this directory for a description of the cells format. diff --git a/Documentation/devicetree/bindings/pwm/pwm-hibvt.txt b/Documentation/devicetree/bindings/pwm/pwm-hibvt.txt index fa7849d67836..daedfef09bb6 100644 --- a/Documentation/devicetree/bindings/pwm/pwm-hibvt.txt +++ b/Documentation/devicetree/bindings/pwm/pwm-hibvt.txt @@ -5,6 +5,8 @@ Required properties: The SoC specific strings supported including: "hisilicon,hi3516cv300-pwm" "hisilicon,hi3519v100-pwm" + "hisilicon,hi3559v100-shub-pwm" + "hisilicon,hi3559v100-pwm - reg: physical base address and length of the controller's registers. - clocks: phandle and clock specifier of the PWM reference clock. - resets: phandle and reset specifier for the PWM controller reset. diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,adsp-pil.txt b/Documentation/devicetree/bindings/remoteproc/qcom,adsp-pil.txt index a842a782b557..66af2c30944f 100644 --- a/Documentation/devicetree/bindings/remoteproc/qcom,adsp-pil.txt +++ b/Documentation/devicetree/bindings/remoteproc/qcom,adsp-pil.txt @@ -35,7 +35,7 @@ on the Qualcomm Technology Inc. ADSP Hexagon core. Value type: <stringlist> Definition: List of clock input name strings sorted in the same order as the clocks property. Definition must have - "xo", "sway_cbcr", "lpass_aon", "lpass_ahbs_aon_cbcr", + "xo", "sway_cbcr", "lpass_ahbs_aon_cbcr", "lpass_ahbm_aon_cbcr", "qdsp6ss_xo", "qdsp6ss_sleep" and "qdsp6ss_core". @@ -100,13 +100,12 @@ ADSP, as it is found on SDM845 boards. clocks = <&rpmhcc RPMH_CXO_CLK>, <&gcc GCC_LPASS_SWAY_CLK>, - <&lpasscc LPASS_AUDIO_WRAPPER_AON_CLK>, <&lpasscc LPASS_Q6SS_AHBS_AON_CLK>, <&lpasscc LPASS_Q6SS_AHBM_AON_CLK>, <&lpasscc LPASS_QDSP6SS_XO_CLK>, <&lpasscc LPASS_QDSP6SS_SLEEP_CLK>, <&lpasscc LPASS_QDSP6SS_CORE_CLK>; - clock-names = "xo", "sway_cbcr", "lpass_aon", + clock-names = "xo", "sway_cbcr", "lpass_ahbs_aon_cbcr", "lpass_ahbm_aon_cbcr", "qdsp6ss_xo", "qdsp6ss_sleep", "qdsp6ss_core"; diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,adsp.txt b/Documentation/devicetree/bindings/remoteproc/qcom,adsp.txt index 9c0cff3a5ed8..292dfda9770d 100644 --- a/Documentation/devicetree/bindings/remoteproc/qcom,adsp.txt +++ b/Documentation/devicetree/bindings/remoteproc/qcom,adsp.txt @@ -19,13 +19,30 @@ on the Qualcomm ADSP Hexagon core. - interrupts-extended: Usage: required Value type: <prop-encoded-array> - Definition: must list the watchdog, fatal IRQs ready, handover and - stop-ack IRQs + Definition: reference to the interrupts that match interrupt-names - interrupt-names: Usage: required Value type: <stringlist> - Definition: must be "wdog", "fatal", "ready", "handover", "stop-ack" + Definition: The interrupts needed depends on the compatible + string: + qcom,msm8974-adsp-pil: + qcom,msm8996-adsp-pil: + qcom,msm8996-slpi-pil: + qcom,qcs404-adsp-pas: + qcom,qcs404-cdsp-pas: + qcom,sdm845-adsp-pas: + qcom,sdm845-cdsp-pas: + must be "wdog", "fatal", "ready", "handover", "stop-ack" + qcom,qcs404-wcss-pas: + must be "wdog", "fatal", "ready", "handover", "stop-ack", + "shutdown-ack" + +- firmware-name: + Usage: optional + Value type: <string> + Definition: must list the relative firmware image path for the + Hexagon Core. - clocks: Usage: required diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt index 9ff5b0309417..41ca5df5be5a 100644 --- a/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt +++ b/Documentation/devicetree/bindings/remoteproc/qcom,q6v5.txt @@ -28,24 +28,51 @@ on the Qualcomm Hexagon core. - interrupts-extended: Usage: required Value type: <prop-encoded-array> - Definition: must list the watchdog, fatal IRQs ready, handover and - stop-ack IRQs + Definition: reference to the interrupts that match interrupt-names - interrupt-names: Usage: required Value type: <stringlist> - Definition: must be "wdog", "fatal", "ready", "handover", "stop-ack" + Definition: The interrupts needed depends on the the compatible + string: + qcom,q6v5-pil: + qcom,ipq8074-wcss-pil: + qcom,msm8916-mss-pil: + qcom,msm8974-mss-pil: + must be "wdog", "fatal", "ready", "handover", "stop-ack" + qcom,msm8996-mss-pil: + qcom,sdm845-mss-pil: + must be "wdog", "fatal", "ready", "handover", "stop-ack", + "shutdown-ack" + +- firmware-name: + Usage: optional + Value type: <stringlist> + Definition: must list the relative firmware image paths for mba and + modem. They are used for booting and authenticating the + Hexagon core. - clocks: Usage: required Value type: <phandle> - Definition: reference to the iface, bus and mem clocks to be held on - behalf of the booting of the Hexagon core + Definition: reference to the clocks that match clock-names - clock-names: Usage: required Value type: <stringlist> - Definition: must be "iface", "bus", "mem" + Definition: The clocks needed depend on the compatible string: + qcom,ipq8074-wcss-pil: + no clock names required + qcom,q6v5-pil: + qcom,msm8916-mss-pil: + qcom,msm8974-mss-pil: + must be "iface", "bus", "mem", "xo" + qcom,msm8996-mss-pil: + must be "iface", "bus", "mem", "xo", "gpll0_mss", + "snoc_axi", "mnoc_axi", "pnoc", "qdss" + qcom,sdm845-mss-pil: + must be "iface", "bus", "mem", "xo", "gpll0_mss", + "snoc_axi", "mnoc_axi", "prng" - resets: Usage: required @@ -65,6 +92,19 @@ on the Qualcomm Hexagon core. must be "mss_restart", "pdc_reset" for the modem sub-system on SDM845 SoCs +For the compatible strings below the following supplies are required: + "qcom,q6v5-pil" + "qcom,msm8916-mss-pil", +- cx-supply: +- mx-supply: +- pll-supply: + Usage: required + Value type: <phandle> + Definition: reference to the regulators to be held on behalf of the + booting of the Hexagon core + +For the compatible string below the following supplies are required: + "qcom,msm8974-mss-pil" - cx-supply: - mss-supply: - mx-supply: @@ -74,6 +114,33 @@ on the Qualcomm Hexagon core. Definition: reference to the regulators to be held on behalf of the booting of the Hexagon core +For the compatible string below the following supplies are required: + "qcom,msm8996-mss-pil" +- pll-supply: + Usage: required + Value type: <phandle> + Definition: reference to the regulators to be held on behalf of the + booting of the Hexagon core + +- power-domains: + Usage: required + Value type: <phandle> + Definition: reference to power-domains that match power-domain-names + +- power-domain-names: + Usage: required + Value type: <stringlist> + Definition: The power-domains needed depend on the compatible string: + qcom,q6v5-pil: + qcom,ipq8074-wcss-pil: + qcom,msm8916-mss-pil: + qcom,msm8974-mss-pil: + no power-domain names required + qcom,msm8996-mss-pil: + must be "cx", "mx" + qcom,sdm845-mss-pil: + must be "cx", "mx", "mss", "load_state" + - qcom,smem-states: Usage: required Value type: <phandle> diff --git a/Documentation/devicetree/bindings/watchdog/renesas-wdt.txt b/Documentation/devicetree/bindings/watchdog/renesas-wdt.txt index ef2b97b72e08..9f365c1a3399 100644 --- a/Documentation/devicetree/bindings/watchdog/renesas-wdt.txt +++ b/Documentation/devicetree/bindings/watchdog/renesas-wdt.txt @@ -8,6 +8,7 @@ Required properties: - "renesas,r8a7743-wdt" (RZ/G1M) - "renesas,r8a7744-wdt" (RZ/G1N) - "renesas,r8a7745-wdt" (RZ/G1E) + - "renesas,r8a77470-wdt" (RZ/G1C) - "renesas,r8a774a1-wdt" (RZ/G2M) - "renesas,r8a774c0-wdt" (RZ/G2E) - "renesas,r8a7790-wdt" (R-Car H2) diff --git a/Documentation/driver-api/dmaengine/client.rst b/Documentation/driver-api/dmaengine/client.rst index d728e50105eb..45953f171500 100644 --- a/Documentation/driver-api/dmaengine/client.rst +++ b/Documentation/driver-api/dmaengine/client.rst @@ -172,7 +172,7 @@ The details of these operations are: After calling ``dmaengine_submit()`` the submitted transfer descriptor (``struct dma_async_tx_descriptor``) belongs to the DMA engine. - Consequentially, the client must consider invalid the pointer to that + Consequently, the client must consider invalid the pointer to that descriptor. 5. Issue pending DMA requests and wait for callback notification diff --git a/Documentation/driver-api/dmaengine/dmatest.rst b/Documentation/driver-api/dmaengine/dmatest.rst index 8d81f1a7169b..e78d070bb468 100644 --- a/Documentation/driver-api/dmaengine/dmatest.rst +++ b/Documentation/driver-api/dmaengine/dmatest.rst @@ -59,6 +59,7 @@ parameter, that specific channel is requested using the dmaengine and a thread is created with the existing parameters. This thread is set as pending and will be executed once run is set to 1. Any parameters set after the thread is created are not applied. + .. hint:: available channel list could be extracted by running the following command:: diff --git a/Documentation/driver-api/pinctl.rst b/Documentation/driver-api/pinctl.rst index 6cb68d67fa75..2bb1bc484278 100644 --- a/Documentation/driver-api/pinctl.rst +++ b/Documentation/driver-api/pinctl.rst @@ -274,15 +274,6 @@ configuration in the pin controller ops like this:: .confops = &foo_pconf_ops, }; -Since some controllers have special logic for handling entire groups of pins -they can exploit the special whole-group pin control function. The -pin_config_group_set() callback is allowed to return the error code -EAGAIN, -for groups it does not want to handle, or if it just wants to do some -group-level handling and then fall through to iterate over all pins, in which -case each individual pin will be treated by separate pin_config_set() calls as -well. - - Interaction with the GPIO subsystem =================================== diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt index b277cafce71e..d7d6f01e81ff 100644 --- a/Documentation/driver-model/devres.txt +++ b/Documentation/driver-model/devres.txt @@ -242,9 +242,11 @@ certainly invest a bit more effort into libata core layer). CLOCK devm_clk_get() + devm_clk_get_optional() devm_clk_put() devm_clk_hw_register() devm_of_clk_add_hw_provider() + devm_clk_hw_register_clkdev() DMA dmaenginem_async_device_register() diff --git a/Documentation/filesystems/ceph.txt b/Documentation/filesystems/ceph.txt index 1177052701e1..d2c6a5ccf0f5 100644 --- a/Documentation/filesystems/ceph.txt +++ b/Documentation/filesystems/ceph.txt @@ -22,9 +22,7 @@ In contrast to cluster filesystems like GFS, OCFS2, and GPFS that rely on symmetric access by all clients to shared block devices, Ceph separates data and metadata management into independent server clusters, similar to Lustre. Unlike Lustre, however, metadata and -storage nodes run entirely as user space daemons. Storage nodes -utilize btrfs to store data objects, leveraging its advanced features -(checksumming, metadata replication, etc.). File data is striped +storage nodes run entirely as user space daemons. File data is striped across storage nodes in large chunks to distribute workload and facilitate high throughputs. When storage nodes fail, data is re-replicated in a distributed fashion by the storage nodes themselves @@ -118,6 +116,10 @@ Mount Options of a non-responsive Ceph file system. The default is 30 seconds. + caps_max=X + Specify the maximum number of caps to hold. Unused caps are released + when number of caps exceeds the limit. The default is 0 (no limit) + rbytes When stat() is called on a directory, set st_size to 'rbytes', the summation of file sizes over all files nested beneath that @@ -160,11 +162,11 @@ More Information ================ For more information on Ceph, see the home page at - http://ceph.newdream.net/ + https://ceph.com/ The Linux kernel client source tree is available at - git://ceph.newdream.net/git/ceph-client.git + https://github.com/ceph/ceph-client.git git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git and the source for the full system is at - git://ceph.newdream.net/git/ceph.git + https://github.com/ceph/ceph.git diff --git a/Documentation/filesystems/cifs/TODO b/Documentation/filesystems/cifs/TODO index 66b3f54aa6dc..9267f3fb131f 100644 --- a/Documentation/filesystems/cifs/TODO +++ b/Documentation/filesystems/cifs/TODO @@ -111,7 +111,8 @@ negotiated size) and send larger write sizes to modern servers. 5) Continue to extend the smb3 "buildbot" which does automated xfstesting against Windows, Samba and Azure currently - to add additional tests and -to allow the buildbot to execute the tests faster. +to allow the buildbot to execute the tests faster. The URL for the +buildbot is: http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com 6) Address various coverity warnings (most are not bugs per-se, but the more warnings are addressed, the easier it is to spot real diff --git a/Documentation/filesystems/cifs/cifs.txt b/Documentation/filesystems/cifs/cifs.txt index 67756607246e..1be3d21c286e 100644 --- a/Documentation/filesystems/cifs/cifs.txt +++ b/Documentation/filesystems/cifs/cifs.txt @@ -1,16 +1,21 @@ This is the client VFS module for the SMB3 NAS protocol as well - older dialects such as the Common Internet File System (CIFS) + as for older dialects such as the Common Internet File System (CIFS) protocol which was the successor to the Server Message Block (SMB) protocol, the native file sharing mechanism for most early PC operating systems. New and improved versions of CIFS are now - called SMB2 and SMB3. These dialects are also supported by the - CIFS VFS module. CIFS is fully supported by network - file servers such as Windows 2000, 2003, 2008, 2012 and 2016 - as well by Samba (which provides excellent CIFS - server support for Linux and many other operating systems), Apple - systems, as well as most Network Attached Storage vendors, so - this network filesystem client can mount to a wide variety of - servers. + called SMB2 and SMB3. Use of SMB3 (and later, including SMB3.1.1) + is strongly preferred over using older dialects like CIFS due to + security reaasons. All modern dialects, including the most recent, + SMB3.1.1 are supported by the CIFS VFS module. The SMB3 protocol + is implemented and supported by all major file servers + such as all modern versions of Windows (including Windows 2016 + Server), as well as by Samba (which provides excellent + CIFS/SMB2/SMB3 server support and tools for Linux and many other + operating systems). Apple systems also support SMB3 well, as + do most Network Attached Storage vendors, so this network + filesystem client can mount to a wide variety of systems. + It also supports mounting to the cloud (for example + Microsoft Azure), including the necessary security features. The intent of this module is to provide the most advanced network file system function for SMB3 compliant servers, including advanced @@ -24,12 +29,17 @@ cluster file systems for fileserving in some Linux to Linux environments, not just in Linux to Windows (or Linux to Mac) environments. - This filesystem has an mount utility (mount.cifs) that can be obtained from + This filesystem has a mount utility (mount.cifs) and various user space + tools (including smbinfo and setcifsacl) that can be obtained from - https://ftp.samba.org/pub/linux-cifs/cifs-utils/ + https://git.samba.org/?p=cifs-utils.git + or + git://git.samba.org/cifs-utils.git - It must be installed in the directory with the other mount helpers. + mount.cifs should be installed in the directory with the other mount helpers. For more information on the module see the project wiki page at + https://wiki.samba.org/index.php/LinuxCIFS + and https://wiki.samba.org/index.php/LinuxCIFS_utils diff --git a/Documentation/filesystems/f2fs.txt b/Documentation/filesystems/f2fs.txt index e46c2147ddf8..f7b5e4ff0de3 100644 --- a/Documentation/filesystems/f2fs.txt +++ b/Documentation/filesystems/f2fs.txt @@ -126,6 +126,8 @@ disable_ext_identify Disable the extension list configured by mkfs, so f2fs does not aware of cold files such as media files. inline_xattr Enable the inline xattrs feature. noinline_xattr Disable the inline xattrs feature. +inline_xattr_size=%u Support configuring inline xattr size, it depends on + flexible inline xattr feature. inline_data Enable the inline data feature: New created small(<~3.4k) files can be written into inode block. inline_dentry Enable the inline dir feature: data in new created diff --git a/Documentation/filesystems/mount_api.txt b/Documentation/filesystems/mount_api.txt new file mode 100644 index 000000000000..944d1965e917 --- /dev/null +++ b/Documentation/filesystems/mount_api.txt @@ -0,0 +1,709 @@ + ==================== + FILESYSTEM MOUNT API + ==================== + +CONTENTS + + (1) Overview. + + (2) The filesystem context. + + (3) The filesystem context operations. + + (4) Filesystem context security. + + (5) VFS filesystem context operations. + + (6) Parameter description. + + (7) Parameter helper functions. + + +======== +OVERVIEW +======== + +The creation of new mounts is now to be done in a multistep process: + + (1) Create a filesystem context. + + (2) Parse the parameters and attach them to the context. Parameters are + expected to be passed individually from userspace, though legacy binary + parameters can also be handled. + + (3) Validate and pre-process the context. + + (4) Get or create a superblock and mountable root. + + (5) Perform the mount. + + (6) Return an error message attached to the context. + + (7) Destroy the context. + +To support this, the file_system_type struct gains a new field: + + int (*init_fs_context)(struct fs_context *fc); + +which is invoked to set up the filesystem-specific parts of a filesystem +context, including the additional space. + +Note that security initialisation is done *after* the filesystem is called so +that the namespaces may be adjusted first. + + +====================== +THE FILESYSTEM CONTEXT +====================== + +The creation and reconfiguration of a superblock is governed by a filesystem +context. This is represented by the fs_context structure: + + struct fs_context { + const struct fs_context_operations *ops; + struct file_system_type *fs_type; + void *fs_private; + struct dentry *root; + struct user_namespace *user_ns; + struct net *net_ns; + const struct cred *cred; + char *source; + char *subtype; + void *security; + void *s_fs_info; + unsigned int sb_flags; + unsigned int sb_flags_mask; + enum fs_context_purpose purpose:8; + bool sloppy:1; + bool silent:1; + ... + }; + +The fs_context fields are as follows: + + (*) const struct fs_context_operations *ops + + These are operations that can be done on a filesystem context (see + below). This must be set by the ->init_fs_context() file_system_type + operation. + + (*) struct file_system_type *fs_type + + A pointer to the file_system_type of the filesystem that is being + constructed or reconfigured. This retains a reference on the type owner. + + (*) void *fs_private + + A pointer to the file system's private data. This is where the filesystem + will need to store any options it parses. + + (*) struct dentry *root + + A pointer to the root of the mountable tree (and indirectly, the + superblock thereof). This is filled in by the ->get_tree() op. If this + is set, an active reference on root->d_sb must also be held. + + (*) struct user_namespace *user_ns + (*) struct net *net_ns + + There are a subset of the namespaces in use by the invoking process. They + retain references on each namespace. The subscribed namespaces may be + replaced by the filesystem to reflect other sources, such as the parent + mount superblock on an automount. + + (*) const struct cred *cred + + The mounter's credentials. This retains a reference on the credentials. + + (*) char *source + + This specifies the source. It may be a block device (e.g. /dev/sda1) or + something more exotic, such as the "host:/path" that NFS desires. + + (*) char *subtype + + This is a string to be added to the type displayed in /proc/mounts to + qualify it (used by FUSE). This is available for the filesystem to set if + desired. + + (*) void *security + + A place for the LSMs to hang their security data for the superblock. The + relevant security operations are described below. + + (*) void *s_fs_info + + The proposed s_fs_info for a new superblock, set in the superblock by + sget_fc(). This can be used to distinguish superblocks. + + (*) unsigned int sb_flags + (*) unsigned int sb_flags_mask + + Which bits SB_* flags are to be set/cleared in super_block::s_flags. + + (*) enum fs_context_purpose + + This indicates the purpose for which the context is intended. The + available values are: + + FS_CONTEXT_FOR_MOUNT, -- New superblock for explicit mount + FS_CONTEXT_FOR_SUBMOUNT -- New automatic submount of extant mount + FS_CONTEXT_FOR_RECONFIGURE -- Change an existing mount + + (*) bool sloppy + (*) bool silent + + These are set if the sloppy or silent mount options are given. + + [NOTE] sloppy is probably unnecessary when userspace passes over one + option at a time since the error can just be ignored if userspace deems it + to be unimportant. + + [NOTE] silent is probably redundant with sb_flags & SB_SILENT. + +The mount context is created by calling vfs_new_fs_context() or +vfs_dup_fs_context() and is destroyed with put_fs_context(). Note that the +structure is not refcounted. + +VFS, security and filesystem mount options are set individually with +vfs_parse_mount_option(). Options provided by the old mount(2) system call as +a page of data can be parsed with generic_parse_monolithic(). + +When mounting, the filesystem is allowed to take data from any of the pointers +and attach it to the superblock (or whatever), provided it clears the pointer +in the mount context. + +The filesystem is also allowed to allocate resources and pin them with the +mount context. For instance, NFS might pin the appropriate protocol version +module. + + +================================= +THE FILESYSTEM CONTEXT OPERATIONS +================================= + +The filesystem context points to a table of operations: + + struct fs_context_operations { + void (*free)(struct fs_context *fc); + int (*dup)(struct fs_context *fc, struct fs_context *src_fc); + int (*parse_param)(struct fs_context *fc, + struct struct fs_parameter *param); + int (*parse_monolithic)(struct fs_context *fc, void *data); + int (*get_tree)(struct fs_context *fc); + int (*reconfigure)(struct fs_context *fc); + }; + +These operations are invoked by the various stages of the mount procedure to +manage the filesystem context. They are as follows: + + (*) void (*free)(struct fs_context *fc); + + Called to clean up the filesystem-specific part of the filesystem context + when the context is destroyed. It should be aware that parts of the + context may have been removed and NULL'd out by ->get_tree(). + + (*) int (*dup)(struct fs_context *fc, struct fs_context *src_fc); + + Called when a filesystem context has been duplicated to duplicate the + filesystem-private data. An error may be returned to indicate failure to + do this. + + [!] Note that even if this fails, put_fs_context() will be called + immediately thereafter, so ->dup() *must* make the + filesystem-private data safe for ->free(). + + (*) int (*parse_param)(struct fs_context *fc, + struct struct fs_parameter *param); + + Called when a parameter is being added to the filesystem context. param + points to the key name and maybe a value object. VFS-specific options + will have been weeded out and fc->sb_flags updated in the context. + Security options will also have been weeded out and fc->security updated. + + The parameter can be parsed with fs_parse() and fs_lookup_param(). Note + that the source(s) are presented as parameters named "source". + + If successful, 0 should be returned or a negative error code otherwise. + + (*) int (*parse_monolithic)(struct fs_context *fc, void *data); + + Called when the mount(2) system call is invoked to pass the entire data + page in one go. If this is expected to be just a list of "key[=val]" + items separated by commas, then this may be set to NULL. + + The return value is as for ->parse_param(). + + If the filesystem (e.g. NFS) needs to examine the data first and then + finds it's the standard key-val list then it may pass it off to + generic_parse_monolithic(). + + (*) int (*get_tree)(struct fs_context *fc); + + Called to get or create the mountable root and superblock, using the + information stored in the filesystem context (reconfiguration goes via a + different vector). It may detach any resources it desires from the + filesystem context and transfer them to the superblock it creates. + + On success it should set fc->root to the mountable root and return 0. In + the case of an error, it should return a negative error code. + + The phase on a userspace-driven context will be set to only allow this to + be called once on any particular context. + + (*) int (*reconfigure)(struct fs_context *fc); + + Called to effect reconfiguration of a superblock using information stored + in the filesystem context. It may detach any resources it desires from + the filesystem context and transfer them to the superblock. The + superblock can be found from fc->root->d_sb. + + On success it should return 0. In the case of an error, it should return + a negative error code. + + [NOTE] reconfigure is intended as a replacement for remount_fs. + + +=========================== +FILESYSTEM CONTEXT SECURITY +=========================== + +The filesystem context contains a security pointer that the LSMs can use for +building up a security context for the superblock to be mounted. There are a +number of operations used by the new mount code for this purpose: + + (*) int security_fs_context_alloc(struct fs_context *fc, + struct dentry *reference); + + Called to initialise fc->security (which is preset to NULL) and allocate + any resources needed. It should return 0 on success or a negative error + code on failure. + + reference will be non-NULL if the context is being created for superblock + reconfiguration (FS_CONTEXT_FOR_RECONFIGURE) in which case it indicates + the root dentry of the superblock to be reconfigured. It will also be + non-NULL in the case of a submount (FS_CONTEXT_FOR_SUBMOUNT) in which case + it indicates the automount point. + + (*) int security_fs_context_dup(struct fs_context *fc, + struct fs_context *src_fc); + + Called to initialise fc->security (which is preset to NULL) and allocate + any resources needed. The original filesystem context is pointed to by + src_fc and may be used for reference. It should return 0 on success or a + negative error code on failure. + + (*) void security_fs_context_free(struct fs_context *fc); + + Called to clean up anything attached to fc->security. Note that the + contents may have been transferred to a superblock and the pointer cleared + during get_tree. + + (*) int security_fs_context_parse_param(struct fs_context *fc, + struct fs_parameter *param); + + Called for each mount parameter, including the source. The arguments are + as for the ->parse_param() method. It should return 0 to indicate that + the parameter should be passed on to the filesystem, 1 to indicate that + the parameter should be discarded or an error to indicate that the + parameter should be rejected. + + The value pointed to by param may be modified (if a string) or stolen + (provided the value pointer is NULL'd out). If it is stolen, 1 must be + returned to prevent it being passed to the filesystem. + + (*) int security_fs_context_validate(struct fs_context *fc); + + Called after all the options have been parsed to validate the collection + as a whole and to do any necessary allocation so that + security_sb_get_tree() and security_sb_reconfigure() are less likely to + fail. It should return 0 or a negative error code. + + In the case of reconfiguration, the target superblock will be accessible + via fc->root. + + (*) int security_sb_get_tree(struct fs_context *fc); + + Called during the mount procedure to verify that the specified superblock + is allowed to be mounted and to transfer the security data there. It + should return 0 or a negative error code. + + (*) void security_sb_reconfigure(struct fs_context *fc); + + Called to apply any reconfiguration to an LSM's context. It must not + fail. Error checking and resource allocation must be done in advance by + the parameter parsing and validation hooks. + + (*) int security_sb_mountpoint(struct fs_context *fc, struct path *mountpoint, + unsigned int mnt_flags); + + Called during the mount procedure to verify that the root dentry attached + to the context is permitted to be attached to the specified mountpoint. + It should return 0 on success or a negative error code on failure. + + +================================= +VFS FILESYSTEM CONTEXT OPERATIONS +================================= + +There are four operations for creating a filesystem context and +one for destroying a context: + + (*) struct fs_context *vfs_new_fs_context(struct file_system_type *fs_type, + struct dentry *reference, + unsigned int sb_flags, + unsigned int sb_flags_mask, + enum fs_context_purpose purpose); + + Create a filesystem context for a given filesystem type and purpose. This + allocates the filesystem context, sets the superblock flags, initialises + the security and calls fs_type->init_fs_context() to initialise the + filesystem private data. + + reference can be NULL or it may indicate the root dentry of a superblock + that is going to be reconfigured (FS_CONTEXT_FOR_RECONFIGURE) or + the automount point that triggered a submount (FS_CONTEXT_FOR_SUBMOUNT). + This is provided as a source of namespace information. + + (*) struct fs_context *vfs_dup_fs_context(struct fs_context *src_fc); + + Duplicate a filesystem context, copying any options noted and duplicating + or additionally referencing any resources held therein. This is available + for use where a filesystem has to get a mount within a mount, such as NFS4 + does by internally mounting the root of the target server and then doing a + private pathwalk to the target directory. + + The purpose in the new context is inherited from the old one. + + (*) void put_fs_context(struct fs_context *fc); + + Destroy a filesystem context, releasing any resources it holds. This + calls the ->free() operation. This is intended to be called by anyone who + created a filesystem context. + + [!] filesystem contexts are not refcounted, so this causes unconditional + destruction. + +In all the above operations, apart from the put op, the return is a mount +context pointer or a negative error code. + +For the remaining operations, if an error occurs, a negative error code will be +returned. + + (*) int vfs_get_tree(struct fs_context *fc); + + Get or create the mountable root and superblock, using the parameters in + the filesystem context to select/configure the superblock. This invokes + the ->validate() op and then the ->get_tree() op. + + [NOTE] ->validate() could perhaps be rolled into ->get_tree() and + ->reconfigure(). + + (*) struct vfsmount *vfs_create_mount(struct fs_context *fc); + + Create a mount given the parameters in the specified filesystem context. + Note that this does not attach the mount to anything. + + (*) int vfs_parse_fs_param(struct fs_context *fc, + struct fs_parameter *param); + + Supply a single mount parameter to the filesystem context. This include + the specification of the source/device which is specified as the "source" + parameter (which may be specified multiple times if the filesystem + supports that). + + param specifies the parameter key name and the value. The parameter is + first checked to see if it corresponds to a standard mount flag (in which + case it is used to set an SB_xxx flag and consumed) or a security option + (in which case the LSM consumes it) before it is passed on to the + filesystem. + + The parameter value is typed and can be one of: + + fs_value_is_flag, Parameter not given a value. + fs_value_is_string, Value is a string + fs_value_is_blob, Value is a binary blob + fs_value_is_filename, Value is a filename* + dirfd + fs_value_is_filename_empty, Value is a filename* + dirfd + AT_EMPTY_PATH + fs_value_is_file, Value is an open file (file*) + + If there is a value, that value is stored in a union in the struct in one + of param->{string,blob,name,file}. Note that the function may steal and + clear the pointer, but then becomes responsible for disposing of the + object. + + (*) int vfs_parse_fs_string(struct fs_context *fc, char *key, + const char *value, size_t v_size); + + A wrapper around vfs_parse_fs_param() that just passes a constant string. + + (*) int generic_parse_monolithic(struct fs_context *fc, void *data); + + Parse a sys_mount() data page, assuming the form to be a text list + consisting of key[=val] options separated by commas. Each item in the + list is passed to vfs_mount_option(). This is the default when the + ->parse_monolithic() operation is NULL. + + +===================== +PARAMETER DESCRIPTION +===================== + +Parameters are described using structures defined in linux/fs_parser.h. +There's a core description struct that links everything together: + + struct fs_parameter_description { + const char name[16]; + u8 nr_params; + u8 nr_alt_keys; + u8 nr_enums; + bool ignore_unknown; + bool no_source; + const char *const *keys; + const struct constant_table *alt_keys; + const struct fs_parameter_spec *specs; + const struct fs_parameter_enum *enums; + }; + +For example: + + enum afs_param { + Opt_autocell, + Opt_bar, + Opt_dyn, + Opt_foo, + Opt_source, + nr__afs_params + }; + + static const struct fs_parameter_description afs_fs_parameters = { + .name = "kAFS", + .nr_params = nr__afs_params, + .nr_alt_keys = ARRAY_SIZE(afs_param_alt_keys), + .nr_enums = ARRAY_SIZE(afs_param_enums), + .keys = afs_param_keys, + .alt_keys = afs_param_alt_keys, + .specs = afs_param_specs, + .enums = afs_param_enums, + }; + +The members are as follows: + + (1) const char name[16]; + + The name to be used in error messages generated by the parse helper + functions. + + (2) u8 nr_params; + + The number of discrete parameter identifiers. This indicates the number + of elements in the ->types[] array and also limits the values that may be + used in the values that the ->keys[] array maps to. + + It is expected that, for example, two parameters that are related, say + "acl" and "noacl" with have the same ID, but will be flagged to indicate + that one is the inverse of the other. The value can then be picked out + from the parse result. + + (3) const struct fs_parameter_specification *specs; + + Table of parameter specifications, where the entries are of type: + + struct fs_parameter_type { + enum fs_parameter_spec type:8; + u8 flags; + }; + + and the parameter identifier is the index to the array. 'type' indicates + the desired value type and must be one of: + + TYPE NAME EXPECTED VALUE RESULT IN + ======================= ======================= ===================== + fs_param_is_flag No value n/a + fs_param_is_bool Boolean value result->boolean + fs_param_is_u32 32-bit unsigned int result->uint_32 + fs_param_is_u32_octal 32-bit octal int result->uint_32 + fs_param_is_u32_hex 32-bit hex int result->uint_32 + fs_param_is_s32 32-bit signed int result->int_32 + fs_param_is_enum Enum value name result->uint_32 + fs_param_is_string Arbitrary string param->string + fs_param_is_blob Binary blob param->blob + fs_param_is_blockdev Blockdev path * Needs lookup + fs_param_is_path Path * Needs lookup + fs_param_is_fd File descriptor param->file + + And each parameter can be qualified with 'flags': + + fs_param_v_optional The value is optional + fs_param_neg_with_no If key name is prefixed with "no", it is false + fs_param_neg_with_empty If value is "", it is false + fs_param_deprecated The parameter is deprecated. + + For example: + + static const struct fs_parameter_spec afs_param_specs[nr__afs_params] = { + [Opt_autocell] = { fs_param_is flag }, + [Opt_bar] = { fs_param_is_enum }, + [Opt_dyn] = { fs_param_is flag }, + [Opt_foo] = { fs_param_is_bool, fs_param_neg_with_no }, + [Opt_source] = { fs_param_is_string }, + }; + + Note that if the value is of fs_param_is_bool type, fs_parse() will try + to match any string value against "0", "1", "no", "yes", "false", "true". + + [!] NOTE that the table must be sorted according to primary key name so + that ->keys[] is also sorted. + + (4) const char *const *keys; + + Table of primary key names for the parameters. There must be one entry + per defined parameter. The table is optional if ->nr_params is 0. The + table is just an array of names e.g.: + + static const char *const afs_param_keys[nr__afs_params] = { + [Opt_autocell] = "autocell", + [Opt_bar] = "bar", + [Opt_dyn] = "dyn", + [Opt_foo] = "foo", + [Opt_source] = "source", + }; + + [!] NOTE that the table must be sorted such that the table can be searched + with bsearch() using strcmp(). This means that the Opt_* values must + correspond to the entries in this table. + + (5) const struct constant_table *alt_keys; + u8 nr_alt_keys; + + Table of additional key names and their mappings to parameter ID plus the + number of elements in the table. This is optional. The table is just an + array of { name, integer } pairs, e.g.: + + static const struct constant_table afs_param_keys[] = { + { "baz", Opt_bar }, + { "dynamic", Opt_dyn }, + }; + + [!] NOTE that the table must be sorted such that strcmp() can be used with + bsearch() to search the entries. + + The parameter ID can also be fs_param_key_removed to indicate that a + deprecated parameter has been removed and that an error will be given. + This differs from fs_param_deprecated where the parameter may still have + an effect. + + Further, the behaviour of the parameter may differ when an alternate name + is used (for instance with NFS, "v3", "v4.2", etc. are alternate names). + + (6) const struct fs_parameter_enum *enums; + u8 nr_enums; + + Table of enum value names to integer mappings and the number of elements + stored therein. This is of type: + + struct fs_parameter_enum { + u8 param_id; + char name[14]; + u8 value; + }; + + Where the array is an unsorted list of { parameter ID, name }-keyed + elements that indicate the value to map to, e.g.: + + static const struct fs_parameter_enum afs_param_enums[] = { + { Opt_bar, "x", 1}, + { Opt_bar, "y", 23}, + { Opt_bar, "z", 42}, + }; + + If a parameter of type fs_param_is_enum is encountered, fs_parse() will + try to look the value up in the enum table and the result will be stored + in the parse result. + + (7) bool no_source; + + If this is set, fs_parse() will ignore any "source" parameter and not + pass it to the filesystem. + +The parser should be pointed to by the parser pointer in the file_system_type +struct as this will provide validation on registration (if +CONFIG_VALIDATE_FS_PARSER=y) and will allow the description to be queried from +userspace using the fsinfo() syscall. + + +========================== +PARAMETER HELPER FUNCTIONS +========================== + +A number of helper functions are provided to help a filesystem or an LSM +process the parameters it is given. + + (*) int lookup_constant(const struct constant_table tbl[], + const char *name, int not_found); + + Look up a constant by name in a table of name -> integer mappings. The + table is an array of elements of the following type: + + struct constant_table { + const char *name; + int value; + }; + + and it must be sorted such that it can be searched using bsearch() using + strcmp(). If a match is found, the corresponding value is returned. If a + match isn't found, the not_found value is returned instead. + + (*) bool validate_constant_table(const struct constant_table *tbl, + size_t tbl_size, + int low, int high, int special); + + Validate a constant table. Checks that all the elements are appropriately + ordered, that there are no duplicates and that the values are between low + and high inclusive, though provision is made for one allowable special + value outside of that range. If no special value is required, special + should just be set to lie inside the low-to-high range. + + If all is good, true is returned. If the table is invalid, errors are + logged to dmesg, the stack is dumped and false is returned. + + (*) int fs_parse(struct fs_context *fc, + const struct fs_param_parser *parser, + struct fs_parameter *param, + struct fs_param_parse_result *result); + + This is the main interpreter of parameters. It uses the parameter + description (parser) to look up the name of the parameter to use and to + convert that to a parameter ID (stored in result->key). + + If successful, and if the parameter type indicates the result is a + boolean, integer or enum type, the value is converted by this function and + the result stored in result->{boolean,int_32,uint_32}. + + If a match isn't initially made, the key is prefixed with "no" and no + value is present then an attempt will be made to look up the key with the + prefix removed. If this matches a parameter for which the type has flag + fs_param_neg_with_no set, then a match will be made and the value will be + set to false/0/NULL. + + If the parameter is successfully matched and, optionally, parsed + correctly, 1 is returned. If the parameter isn't matched and + parser->ignore_unknown is set, then 0 is returned. Otherwise -EINVAL is + returned. + + (*) bool fs_validate_description(const struct fs_parameter_description *desc); + + This is validates the parameter description. It returns true if the + description is good and false if it is not. + + (*) int fs_lookup_param(struct fs_context *fc, + struct fs_parameter *value, + bool want_bdev, + struct path *_path); + + This takes a parameter that carries a string or filename type and attempts + to do a path lookup on it. If the parameter expects a blockdev, a check + is made that the inode actually represents one. + + Returns 0 if successful and *_path will be set; returns a negative error + code if not. diff --git a/Documentation/flexible-arrays.txt b/Documentation/flexible-arrays.txt deleted file mode 100644 index a0f2989dd804..000000000000 --- a/Documentation/flexible-arrays.txt +++ /dev/null @@ -1,123 +0,0 @@ -=================================== -Using flexible arrays in the kernel -=================================== - -:Updated: Last updated for 2.6.32 -:Author: Jonathan Corbet <corbet@lwn.net> - -Large contiguous memory allocations can be unreliable in the Linux kernel. -Kernel programmers will sometimes respond to this problem by allocating -pages with vmalloc(). This solution not ideal, though. On 32-bit systems, -memory from vmalloc() must be mapped into a relatively small address space; -it's easy to run out. On SMP systems, the page table changes required by -vmalloc() allocations can require expensive cross-processor interrupts on -all CPUs. And, on all systems, use of space in the vmalloc() range -increases pressure on the translation lookaside buffer (TLB), reducing the -performance of the system. - -In many cases, the need for memory from vmalloc() can be eliminated by -piecing together an array from smaller parts; the flexible array library -exists to make this task easier. - -A flexible array holds an arbitrary (within limits) number of fixed-sized -objects, accessed via an integer index. Sparse arrays are handled -reasonably well. Only single-page allocations are made, so memory -allocation failures should be relatively rare. The down sides are that the -arrays cannot be indexed directly, individual object size cannot exceed the -system page size, and putting data into a flexible array requires a copy -operation. It's also worth noting that flexible arrays do no internal -locking at all; if concurrent access to an array is possible, then the -caller must arrange for appropriate mutual exclusion. - -The creation of a flexible array is done with:: - - #include <linux/flex_array.h> - - struct flex_array *flex_array_alloc(int element_size, - unsigned int total, - gfp_t flags); - -The individual object size is provided by element_size, while total is the -maximum number of objects which can be stored in the array. The flags -argument is passed directly to the internal memory allocation calls. With -the current code, using flags to ask for high memory is likely to lead to -notably unpleasant side effects. - -It is also possible to define flexible arrays at compile time with:: - - DEFINE_FLEX_ARRAY(name, element_size, total); - -This macro will result in a definition of an array with the given name; the -element size and total will be checked for validity at compile time. - -Storing data into a flexible array is accomplished with a call to:: - - int flex_array_put(struct flex_array *array, unsigned int element_nr, - void *src, gfp_t flags); - -This call will copy the data from src into the array, in the position -indicated by element_nr (which must be less than the maximum specified when -the array was created). If any memory allocations must be performed, flags -will be used. The return value is zero on success, a negative error code -otherwise. - -There might possibly be a need to store data into a flexible array while -running in some sort of atomic context; in this situation, sleeping in the -memory allocator would be a bad thing. That can be avoided by using -GFP_ATOMIC for the flags value, but, often, there is a better way. The -trick is to ensure that any needed memory allocations are done before -entering atomic context, using:: - - int flex_array_prealloc(struct flex_array *array, unsigned int start, - unsigned int nr_elements, gfp_t flags); - -This function will ensure that memory for the elements indexed in the range -defined by start and nr_elements has been allocated. Thereafter, a -flex_array_put() call on an element in that range is guaranteed not to -block. - -Getting data back out of the array is done with:: - - void *flex_array_get(struct flex_array *fa, unsigned int element_nr); - -The return value is a pointer to the data element, or NULL if that -particular element has never been allocated. - -Note that it is possible to get back a valid pointer for an element which -has never been stored in the array. Memory for array elements is allocated -one page at a time; a single allocation could provide memory for several -adjacent elements. Flexible array elements are normally initialized to the -value FLEX_ARRAY_FREE (defined as 0x6c in <linux/poison.h>), so errors -involving that number probably result from use of unstored array entries. -Note that, if array elements are allocated with __GFP_ZERO, they will be -initialized to zero and this poisoning will not happen. - -Individual elements in the array can be cleared with:: - - int flex_array_clear(struct flex_array *array, unsigned int element_nr); - -This function will set the given element to FLEX_ARRAY_FREE and return -zero. If storage for the indicated element is not allocated for the array, -flex_array_clear() will return -EINVAL instead. Note that clearing an -element does not release the storage associated with it; to reduce the -allocated size of an array, call:: - - int flex_array_shrink(struct flex_array *array); - -The return value will be the number of pages of memory actually freed. -This function works by scanning the array for pages containing nothing but -FLEX_ARRAY_FREE bytes, so (1) it can be expensive, and (2) it will not work -if the array's pages are allocated with __GFP_ZERO. - -It is possible to remove all elements of an array with a call to:: - - void flex_array_free_parts(struct flex_array *array); - -This call frees all elements, but leaves the array itself in place. -Freeing the entire array is done with:: - - void flex_array_free(struct flex_array *array); - -As of this writing, there are no users of flexible arrays in the mainline -kernel. The functions described here are also not exported to modules; -that will probably be fixed when somebody comes up with a need for it. diff --git a/Documentation/kbuild/kbuild.txt b/Documentation/kbuild/kbuild.txt index c9e3d93e7a89..8a3830b39c7d 100644 --- a/Documentation/kbuild/kbuild.txt +++ b/Documentation/kbuild/kbuild.txt @@ -232,17 +232,12 @@ KBUILD_LDS -------------------------------------------------- The linker script with full path. Assigned by the top-level Makefile. -KBUILD_VMLINUX_INIT +KBUILD_VMLINUX_OBJS -------------------------------------------------- -All object files for the init (first) part of vmlinux. -Files specified with KBUILD_VMLINUX_INIT are linked first. - -KBUILD_VMLINUX_MAIN --------------------------------------------------- -All object files for the main part of vmlinux. +All object files for vmlinux. They are linked to vmlinux in the same +order as listed in KBUILD_VMLINUX_OBJS. KBUILD_VMLINUX_LIBS -------------------------------------------------- -All .a "lib" files for vmlinux. -KBUILD_VMLINUX_INIT, KBUILD_VMLINUX_MAIN, and KBUILD_VMLINUX_LIBS together -specify all the object files used to link vmlinux. +All .a "lib" files for vmlinux. KBUILD_VMLINUX_OBJS and KBUILD_VMLINUX_LIBS +together specify all the object files used to link vmlinux. diff --git a/Documentation/kbuild/makefiles.txt b/Documentation/kbuild/makefiles.txt index bf28c47bfd72..03c065855eaf 100644 --- a/Documentation/kbuild/makefiles.txt +++ b/Documentation/kbuild/makefiles.txt @@ -154,13 +154,8 @@ more details, with real examples. Kbuild compiles all the $(obj-y) files. It then calls "$(AR) rcSTP" to merge these files into one built-in.a file. - This is a thin archive without a symbol table, which makes it - unsuitable as a linker input. - - The scripts/link-vmlinux.sh script later makes an aggregate - built-in.a with "${AR} rcsTP", which creates the thin archive - with a symbol table and an index, making it a valid input for - the final vmlinux link passes. + This is a thin archive without a symbol table. It will be later + linked into vmlinux by scripts/link-vmlinux.sh The order of files in $(obj-y) is significant. Duplicates in the lists are allowed: the first instance will be linked into @@ -504,23 +499,6 @@ more details, with real examples. In the above example, -Wno-unused-but-set-variable will be added to KBUILD_CFLAGS only if gcc really accepts it. - cc-version - cc-version returns a numerical version of the $(CC) compiler version. - The format is <major><minor> where both are two digits. So for example - gcc 3.41 would return 0341. - cc-version is useful when a specific $(CC) version is faulty in one - area, for example -mregparm=3 was broken in some gcc versions - even though the option was accepted by gcc. - - Example: - #arch/x86/Makefile - cflags-y += $(shell \ - if [ $(cc-version) -ge 0300 ] ; then \ - echo "-mregparm=3"; fi ;) - - In the above example, -mregparm=3 is only used for gcc version greater - than or equal to gcc 3.0. - cc-ifversion cc-ifversion tests the version of $(CC) and equals the fourth parameter if version expression is true, or the fifth (if given) if the version @@ -1296,7 +1274,7 @@ See subsequent chapter for the syntax of the Kbuild file. --- 7.4 mandatory-y - mandatory-y is essentially used by include/(uapi/)asm-generic/Kbuild.asm + mandatory-y is essentially used by include/(uapi/)asm-generic/Kbuild to define the minimum set of ASM headers that all architectures must have. This works like optional generic-y. If a mandatory header is missing diff --git a/Documentation/kbuild/modules.txt b/Documentation/kbuild/modules.txt index 3fb39e0116b4..80295c613e37 100644 --- a/Documentation/kbuild/modules.txt +++ b/Documentation/kbuild/modules.txt @@ -140,7 +140,7 @@ executed to make module versioning work. make -C $KDIR M=$PWD bar.lst make -C $KDIR M=$PWD baz.o make -C $KDIR M=$PWD foo.ko - make -C $KDIR M=$PWD / + make -C $KDIR M=$PWD ./ === 3. Creating a Kbuild File for an External Module diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst index 0131df7f5968..7c5e6d6ab5d1 100644 --- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -233,6 +233,12 @@ of ftrace. Here is a list of some of the key files: This interface also allows for commands to be used. See the "Filter commands" section for more details. + As a speed up, since processing strings can't be quite expensive + and requires a check of all functions registered to tracing, instead + an index can be written into this file. A number (starting with "1") + written will instead select the same corresponding at the line position + of the "available_filter_functions" file. + set_ftrace_notrace: This has an effect opposite to that of @@ -1396,6 +1402,57 @@ enabling function tracing, we incur an added overhead. This overhead may extend the latency times. But nevertheless, this trace has provided some very helpful debugging information. +If we prefer function graph output instead of function, we can set +display-graph option:: + with echo 1 > options/display-graph + + # tracer: irqsoff + # + # irqsoff latency trace v1.1.5 on 4.20.0-rc6+ + # -------------------------------------------------------------------- + # latency: 3751 us, #274/274, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4) + # ----------------- + # | task: bash-1507 (uid:0 nice:0 policy:0 rt_prio:0) + # ----------------- + # => started at: free_debug_processing + # => ended at: return_to_handler + # + # + # _-----=> irqs-off + # / _----=> need-resched + # | / _---=> hardirq/softirq + # || / _--=> preempt-depth + # ||| / + # REL TIME CPU TASK/PID |||| DURATION FUNCTION CALLS + # | | | | |||| | | | | | | + 0 us | 0) bash-1507 | d... | 0.000 us | _raw_spin_lock_irqsave(); + 0 us | 0) bash-1507 | d..1 | 0.378 us | do_raw_spin_trylock(); + 1 us | 0) bash-1507 | d..2 | | set_track() { + 2 us | 0) bash-1507 | d..2 | | save_stack_trace() { + 2 us | 0) bash-1507 | d..2 | | __save_stack_trace() { + 3 us | 0) bash-1507 | d..2 | | __unwind_start() { + 3 us | 0) bash-1507 | d..2 | | get_stack_info() { + 3 us | 0) bash-1507 | d..2 | 0.351 us | in_task_stack(); + 4 us | 0) bash-1507 | d..2 | 1.107 us | } + [...] + 3750 us | 0) bash-1507 | d..1 | 0.516 us | do_raw_spin_unlock(); + 3750 us | 0) bash-1507 | d..1 | 0.000 us | _raw_spin_unlock_irqrestore(); + 3764 us | 0) bash-1507 | d..1 | 0.000 us | tracer_hardirqs_on(); + bash-1507 0d..1 3792us : <stack trace> + => free_debug_processing + => __slab_free + => kmem_cache_free + => vm_area_free + => remove_vma + => exit_mmap + => mmput + => flush_old_exec + => load_elf_binary + => search_binary_handler + => __do_execve_file.isra.32 + => __x64_sys_execve + => do_syscall_64 + => entry_SYSCALL_64_after_hwframe preemptoff ---------- @@ -2784,6 +2841,38 @@ Produces:: We can see that there's no more lock or preempt tracing. +Selecting function filters via index +------------------------------------ + +Because processing of strings is expensive (the address of the function +needs to be looked up before comparing to the string being passed in), +an index can be used as well to enable functions. This is useful in the +case of setting thousands of specific functions at a time. By passing +in a list of numbers, no string processing will occur. Instead, the function +at the specific location in the internal array (which corresponds to the +functions in the "available_filter_functions" file), is selected. + +:: + + # echo 1 > set_ftrace_filter + +Will select the first function listed in "available_filter_functions" + +:: + + # head -1 available_filter_functions + trace_initcall_finish_cb + + # cat set_ftrace_filter + trace_initcall_finish_cb + + # head -50 available_filter_functions | tail -1 + x86_pmu_commit_txn + + # echo 1 50 > set_ftrace_filter + # cat set_ftrace_filter + trace_initcall_finish_cb + x86_pmu_commit_txn Dynamic ftrace with the function graph tracer --------------------------------------------- diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst index 7dda76503127..0ea59d45aef1 100644 --- a/Documentation/trace/histogram.rst +++ b/Documentation/trace/histogram.rst @@ -25,7 +25,7 @@ Documentation written by Tom Zanussi hist:keys=<field1[,field2,...]>[:values=<field1[,field2,...]>] [:sort=<field1[,field2,...]>][:size=#entries][:pause][:continue] - [:clear][:name=histname1] [if <filter>] + [:clear][:name=histname1][:<handler>.<action>] [if <filter>] When a matching event is hit, an entry is added to a hash table using the key(s) and value(s) named. Keys and values correspond to @@ -1831,41 +1831,87 @@ and looks and behaves just like any other event:: Like any other event, once a histogram is enabled for the event, the output can be displayed by reading the event's 'hist' file. -2.2.3 Hist trigger 'actions' ----------------------------- +2.2.3 Hist trigger 'handlers' and 'actions' +------------------------------------------- -A hist trigger 'action' is a function that's executed whenever a -histogram entry is added or updated. +A hist trigger 'action' is a function that's executed (in most cases +conditionally) whenever a histogram entry is added or updated. -The default 'action' if no special function is explicitly specified is -as it always has been, to simply update the set of values associated -with an entry. Some applications, however, may want to perform -additional actions at that point, such as generate another event, or -compare and save a maximum. +When a histogram entry is added or updated, a hist trigger 'handler' +is what decides whether the corresponding action is actually invoked +or not. -The following additional actions are available. To specify an action -for a given event, simply specify the action between colons in the -hist trigger specification. +Hist trigger handlers and actions are paired together in the general +form: - - onmatch(matching.event).<synthetic_event_name>(param list) + <handler>.<action> - The 'onmatch(matching.event).<synthetic_event_name>(params)' hist - trigger action is invoked whenever an event matches and the - histogram entry would be added or updated. It causes the named - synthetic event to be generated with the values given in the +To specify a handler.action pair for a given event, simply specify +that handler.action pair between colons in the hist trigger +specification. + +In theory, any handler can be combined with any action, but in +practice, not every handler.action combination is currently supported; +if a given handler.action combination isn't supported, the hist +trigger will fail with -EINVAL; + +The default 'handler.action' if none is explicity specified is as it +always has been, to simply update the set of values associated with an +entry. Some applications, however, may want to perform additional +actions at that point, such as generate another event, or compare and +save a maximum. + +The supported handlers and actions are listed below, and each is +described in more detail in the following paragraphs, in the context +of descriptions of some common and useful handler.action combinations. + +The available handlers are: + + - onmatch(matching.event) - invoke action on any addition or update + - onmax(var) - invoke action if var exceeds current max + - onchange(var) - invoke action if var changes + +The available actions are: + + - trace(<synthetic_event_name>,param list) - generate synthetic event + - save(field,...) - save current event fields + - snapshot() - snapshot the trace buffer + +The following commonly-used handler.action pairs are available: + + - onmatch(matching.event).trace(<synthetic_event_name>,param list) + + The 'onmatch(matching.event).trace(<synthetic_event_name>,param + list)' hist trigger action is invoked whenever an event matches + and the histogram entry would be added or updated. It causes the + named synthetic event to be generated with the values given in the 'param list'. The result is the generation of a synthetic event that consists of the values contained in those variables at the - time the invoking event was hit. - - The 'param list' consists of one or more parameters which may be - either variables or fields defined on either the 'matching.event' - or the target event. The variables or fields specified in the - param list may be either fully-qualified or unqualified. If a - variable is specified as unqualified, it must be unique between - the two events. A field name used as a param can be unqualified - if it refers to the target event, but must be fully qualified if - it refers to the matching event. A fully-qualified name is of the - form 'system.event_name.$var_name' or 'system.event_name.field'. + time the invoking event was hit. For example, if the synthetic + event name is 'wakeup_latency', a wakeup_latency event is + generated using onmatch(event).trace(wakeup_latency,arg1,arg2). + + There is also an equivalent alternative form available for + generating synthetic events. In this form, the synthetic event + name is used as if it were a function name. For example, using + the 'wakeup_latency' synthetic event name again, the + wakeup_latency event would be generated by invoking it as if it + were a function call, with the event field values passed in as + arguments: onmatch(event).wakeup_latency(arg1,arg2). The syntax + for this form is: + + onmatch(matching.event).<synthetic_event_name>(param list) + + In either case, the 'param list' consists of one or more + parameters which may be either variables or fields defined on + either the 'matching.event' or the target event. The variables or + fields specified in the param list may be either fully-qualified + or unqualified. If a variable is specified as unqualified, it + must be unique between the two events. A field name used as a + param can be unqualified if it refers to the target event, but + must be fully qualified if it refers to the matching event. A + fully-qualified name is of the form 'system.event_name.$var_name' + or 'system.event_name.field'. The 'matching.event' specification is simply the fully qualified event name of the event that matches the target event for the @@ -1896,6 +1942,12 @@ hist trigger specification. wakeup_new_test($testpid) if comm=="cyclictest"' >> \ /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger + Or, equivalently, using the 'trace' keyword syntax: + + # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\ + trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \ + /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger + Creating and displaying a histogram based on those events is now just a matter of using the fields and new synthetic event in the tracing/events/synthetic directory, as usual:: @@ -2000,6 +2052,212 @@ hist trigger specification. Entries: 2 Dropped: 0 + - onmax(var).snapshot() + + The 'onmax(var).snapshot()' hist trigger action is invoked + whenever the value of 'var' associated with a histogram entry + exceeds the current maximum contained in that variable. + + The end result is that a global snapshot of the trace buffer will + be saved in the tracing/snapshot file if 'var' exceeds the current + maximum for any hist trigger entry. + + Note that in this case the maximum is a global maximum for the + current trace instance, which is the maximum across all buckets of + the histogram. The key of the specific trace event that caused + the global maximum and the global maximum itself are displayed, + along with a message stating that a snapshot has been taken and + where to find it. The user can use the key information displayed + to locate the corresponding bucket in the histogram for even more + detail. + + As an example the below defines a couple of hist triggers, one for + sched_waking and another for sched_switch, keyed on pid. Whenever + a sched_waking event occurs, the timestamp is saved in the entry + corresponding to the current pid, and when the scheduler switches + back to that pid, the timestamp difference is calculated. If the + resulting latency, stored in wakeup_lat, exceeds the current + maximum latency, a snapshot is taken. As part of the setup, all + the scheduler events are also enabled, which are the events that + will show up in the snapshot when it is taken at some point: + + # echo 1 > /sys/kernel/debug/tracing/events/sched/enable + + # echo 'hist:keys=pid:ts0=common_timestamp.usecs \ + if comm=="cyclictest"' >> \ + /sys/kernel/debug/tracing/events/sched/sched_waking/trigger + + # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ + onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \ + prev_comm):onmax($wakeup_lat).snapshot() \ + if next_comm=="cyclictest"' >> \ + /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + + When the histogram is displayed, for each bucket the max value + and the saved values corresponding to the max are displayed + following the rest of the fields. + + If a snaphot was taken, there is also a message indicating that, + along with the value and event that triggered the global maximum: + + # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist + { next_pid: 2101 } hitcount: 200 + max: 52 next_prio: 120 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 + + { next_pid: 2103 } hitcount: 1326 + max: 572 next_prio: 19 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/1 + + { next_pid: 2102 } hitcount: 1982 \ + max: 74 next_prio: 19 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/5 + + Snapshot taken (see tracing/snapshot). Details: + triggering value { onmax($wakeup_lat) }: 572 \ + triggered by event with key: { next_pid: 2103 } + + Totals: + Hits: 3508 + Entries: 3 + Dropped: 0 + + In the above case, the event that triggered the global maximum has + the key with next_pid == 2103. If you look at the bucket that has + 2103 as the key, you'll find the additional values save()'d along + with the local maximum for that bucket, which should be the same + as the global maximum (since that was the same value that + triggered the global snapshot). + + And finally, looking at the snapshot data should show at or near + the end the event that triggered the snapshot (in this case you + can verify the timestamps between the sched_waking and + sched_switch events, which should match the time displayed in the + global maximum): + + # cat /sys/kernel/debug/tracing/snapshot + + <...>-2103 [005] d..3 309.873125: sched_switch: prev_comm=cyclictest prev_pid=2103 prev_prio=19 prev_state=D ==> next_comm=swapper/5 next_pid=0 next_prio=120 + <idle>-0 [005] d.h3 309.873611: sched_waking: comm=cyclictest pid=2102 prio=19 target_cpu=005 + <idle>-0 [005] dNh4 309.873613: sched_wakeup: comm=cyclictest pid=2102 prio=19 target_cpu=005 + <idle>-0 [005] d..3 309.873616: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=cyclictest next_pid=2102 next_prio=19 + <...>-2102 [005] d..3 309.873625: sched_switch: prev_comm=cyclictest prev_pid=2102 prev_prio=19 prev_state=D ==> next_comm=swapper/5 next_pid=0 next_prio=120 + <idle>-0 [005] d.h3 309.874624: sched_waking: comm=cyclictest pid=2102 prio=19 target_cpu=005 + <idle>-0 [005] dNh4 309.874626: sched_wakeup: comm=cyclictest pid=2102 prio=19 target_cpu=005 + <idle>-0 [005] dNh3 309.874628: sched_waking: comm=cyclictest pid=2103 prio=19 target_cpu=005 + <idle>-0 [005] dNh4 309.874630: sched_wakeup: comm=cyclictest pid=2103 prio=19 target_cpu=005 + <idle>-0 [005] d..3 309.874633: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=cyclictest next_pid=2102 next_prio=19 + <idle>-0 [004] d.h3 309.874757: sched_waking: comm=gnome-terminal- pid=1699 prio=120 target_cpu=004 + <idle>-0 [004] dNh4 309.874762: sched_wakeup: comm=gnome-terminal- pid=1699 prio=120 target_cpu=004 + <idle>-0 [004] d..3 309.874766: sched_switch: prev_comm=swapper/4 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=gnome-terminal- next_pid=1699 next_prio=120 + gnome-terminal--1699 [004] d.h2 309.874941: sched_stat_runtime: comm=gnome-terminal- pid=1699 runtime=180706 [ns] vruntime=1126870572 [ns] + <idle>-0 [003] d.s4 309.874956: sched_waking: comm=rcu_sched pid=9 prio=120 target_cpu=007 + <idle>-0 [003] d.s5 309.874960: sched_wake_idle_without_ipi: cpu=7 + <idle>-0 [003] d.s5 309.874961: sched_wakeup: comm=rcu_sched pid=9 prio=120 target_cpu=007 + <idle>-0 [007] d..3 309.874963: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=rcu_sched next_pid=9 next_prio=120 + rcu_sched-9 [007] d..3 309.874973: sched_stat_runtime: comm=rcu_sched pid=9 runtime=13646 [ns] vruntime=22531430286 [ns] + rcu_sched-9 [007] d..3 309.874978: sched_switch: prev_comm=rcu_sched prev_pid=9 prev_prio=120 prev_state=R+ ==> next_comm=swapper/7 next_pid=0 next_prio=120 + <...>-2102 [005] d..4 309.874994: sched_migrate_task: comm=cyclictest pid=2103 prio=19 orig_cpu=5 dest_cpu=1 + <...>-2102 [005] d..4 309.875185: sched_wake_idle_without_ipi: cpu=1 + <idle>-0 [001] d..3 309.875200: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=S ==> next_comm=cyclictest next_pid=2103 next_prio=19 + + - onchange(var).save(field,.. .) + + The 'onchange(var).save(field,...)' hist trigger action is invoked + whenever the value of 'var' associated with a histogram entry + changes. + + The end result is that the trace event fields specified as the + onchange.save() params will be saved if 'var' changes for that + hist trigger entry. This allows context from the event that + changed the value to be saved for later reference. When the + histogram is displayed, additional fields displaying the saved + values will be printed. + + - onchange(var).snapshot() + + The 'onchange(var).snapshot()' hist trigger action is invoked + whenever the value of 'var' associated with a histogram entry + changes. + + The end result is that a global snapshot of the trace buffer will + be saved in the tracing/snapshot file if 'var' changes for any + hist trigger entry. + + Note that in this case the changed value is a global variable + associated withe current trace instance. The key of the specific + trace event that caused the value to change and the global value + itself are displayed, along with a message stating that a snapshot + has been taken and where to find it. The user can use the key + information displayed to locate the corresponding bucket in the + histogram for even more detail. + + As an example the below defines a hist trigger on the tcp_probe + event, keyed on dport. Whenever a tcp_probe event occurs, the + cwnd field is checked against the current value stored in the + $cwnd variable. If the value has changed, a snapshot is taken. + As part of the setup, all the scheduler and tcp events are also + enabled, which are the events that will show up in the snapshot + when it is taken at some point: + + # echo 1 > /sys/kernel/debug/tracing/events/sched/enable + # echo 1 > /sys/kernel/debug/tracing/events/tcp/enable + + # echo 'hist:keys=dport:cwnd=snd_cwnd: \ + onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \ + onchange($cwnd).snapshot()' >> \ + /sys/kernel/debug/tracing/events/tcp/tcp_probe/trigger + + When the histogram is displayed, for each bucket the tracked value + and the saved values corresponding to that value are displayed + following the rest of the fields. + + If a snaphot was taken, there is also a message indicating that, + along with the value and event that triggered the snapshot: + + # cat /sys/kernel/debug/tracing/events/tcp/tcp_probe/hist + { dport: 1521 } hitcount: 8 + changed: 10 snd_wnd: 35456 srtt: 154262 rcv_wnd: 42112 + + { dport: 80 } hitcount: 23 + changed: 10 snd_wnd: 28960 srtt: 19604 rcv_wnd: 29312 + + { dport: 9001 } hitcount: 172 + changed: 10 snd_wnd: 48384 srtt: 260444 rcv_wnd: 55168 + + { dport: 443 } hitcount: 211 + changed: 10 snd_wnd: 26960 srtt: 17379 rcv_wnd: 28800 + + Snapshot taken (see tracing/snapshot). Details: + triggering value { onchange($cwnd) }: 10 + triggered by event with key: { dport: 80 } + + Totals: + Hits: 414 + Entries: 4 + Dropped: 0 + + In the above case, the event that triggered the snapshot has the + key with dport == 80. If you look at the bucket that has 80 as + the key, you'll find the additional values save()'d along with the + changed value for that bucket, which should be the same as the + global changed value (since that was the same value that triggered + the global snapshot). + + And finally, looking at the snapshot data should show at or near + the end the event that triggered the snapshot: + + # cat /sys/kernel/debug/tracing/snapshot + + gnome-shell-1261 [006] dN.3 49.823113: sched_stat_runtime: comm=gnome-shell pid=1261 runtime=49347 [ns] vruntime=1835730389 [ns] + kworker/u16:4-773 [003] d..3 49.823114: sched_switch: prev_comm=kworker/u16:4 prev_pid=773 prev_prio=120 prev_state=R+ ==> next_comm=kworker/3:2 next_pid=135 next_prio=120 + gnome-shell-1261 [006] d..3 49.823114: sched_switch: prev_comm=gnome-shell prev_pid=1261 prev_prio=120 prev_state=R+ ==> next_comm=kworker/6:2 next_pid=387 next_prio=120 + kworker/3:2-135 [003] d..3 49.823118: sched_stat_runtime: comm=kworker/3:2 pid=135 runtime=5339 [ns] vruntime=17815800388 [ns] + kworker/6:2-387 [006] d..3 49.823120: sched_stat_runtime: comm=kworker/6:2 pid=387 runtime=9594 [ns] vruntime=14589605367 [ns] + kworker/6:2-387 [006] d..3 49.823122: sched_switch: prev_comm=kworker/6:2 prev_pid=387 prev_prio=120 prev_state=R+ ==> next_comm=gnome-shell next_pid=1261 next_prio=120 + kworker/3:2-135 [003] d..3 49.823123: sched_switch: prev_comm=kworker/3:2 prev_pid=135 prev_prio=120 prev_state=T ==> next_comm=swapper/3 next_pid=0 next_prio=120 + <idle>-0 [004] ..s7 49.823798: tcp_probe: src=10.0.0.10:54326 dest=23.215.104.193:80 mark=0x0 length=32 snd_nxt=0xe3ae2ff5 snd_una=0xe3ae2ecd snd_cwnd=10 ssthresh=2147483647 snd_wnd=28960 srtt=19604 rcv_wnd=29312 + 3. User space creating a trigger -------------------------------- diff --git a/Documentation/trace/uprobetracer.rst b/Documentation/trace/uprobetracer.rst index 4c3bfde2ba47..4346e23e3ae7 100644 --- a/Documentation/trace/uprobetracer.rst +++ b/Documentation/trace/uprobetracer.rst @@ -73,10 +73,9 @@ For $comm, the default type is "string"; any other type is invalid. Event Profiling --------------- -You can check the total number of probe hits and probe miss-hits via -/sys/kernel/debug/tracing/uprobe_profile. -The first column is event name, the second is the number of probe hits, -the third is the number of probe miss-hits. +You can check the total number of probe hits per event via +/sys/kernel/debug/tracing/uprobe_profile. The first column is the filename, +the second is the event name, the third is the number of probe hits. Usage examples -------------- diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 356156f5c52d..7de9eee73fcd 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -45,6 +45,23 @@ the API. The only supported use is one virtual machine per process, and one vcpu per thread. +It is important to note that althought VM ioctls may only be issued from +the process that created the VM, a VM's lifecycle is associated with its +file descriptor, not its creator (process). In other words, the VM and +its resources, *including the associated address space*, are not freed +until the last reference to the VM's file descriptor has been released. +For example, if fork() is issued after ioctl(KVM_CREATE_VM), the VM will +not be freed until both the parent (original) process and its child have +put their references to the VM's file descriptor. + +Because a VM's resources are not freed until the last reference to its +file descriptor is released, creating additional references to a VM via +via fork(), dup(), etc... without careful consideration is strongly +discouraged and may have unwanted side effects, e.g. memory allocated +by and on behalf of the VM's process may not be freed/unaccounted when +the VM is shut down. + + 3. Extensions ------------- diff --git a/Documentation/virtual/kvm/halt-polling.txt b/Documentation/virtual/kvm/halt-polling.txt index 4a8418318769..4f791b128dd2 100644 --- a/Documentation/virtual/kvm/halt-polling.txt +++ b/Documentation/virtual/kvm/halt-polling.txt @@ -53,7 +53,8 @@ the global max polling interval then the polling interval can be increased in the hope that next time during the longer polling interval the wake up source will be received while the host is polling and the latency benefits will be received. The polling interval is grown in the function grow_halt_poll_ns() and -is multiplied by the module parameter halt_poll_ns_grow. +is multiplied by the module parameters halt_poll_ns_grow and +halt_poll_ns_grow_start. In the event that the total block time was greater than the global max polling interval then the host will never poll for long enough (limited by the global @@ -80,22 +81,30 @@ shrunk. These variables are defined in include/linux/kvm_host.h and as module parameters in virt/kvm/kvm_main.c, or arch/powerpc/kvm/book3s_hv.c in the powerpc kvm-hv case. -Module Parameter | Description | Default Value +Module Parameter | Description | Default Value -------------------------------------------------------------------------------- -halt_poll_ns | The global max polling interval | KVM_HALT_POLL_NS_DEFAULT - | which defines the ceiling value | - | of the polling interval for | (per arch value) - | each vcpu. | +halt_poll_ns | The global max polling | KVM_HALT_POLL_NS_DEFAULT + | interval which defines | + | the ceiling value of the | + | polling interval for | (per arch value) + | each vcpu. | -------------------------------------------------------------------------------- -halt_poll_ns_grow | The value by which the halt | 2 - | polling interval is multiplied | - | in the grow_halt_poll_ns() | - | function. | +halt_poll_ns_grow | The value by which the | 2 + | halt polling interval is | + | multiplied in the | + | grow_halt_poll_ns() | + | function. | -------------------------------------------------------------------------------- -halt_poll_ns_shrink | The value by which the halt | 0 - | polling interval is divided in | - | the shrink_halt_poll_ns() | - | function. | +halt_poll_ns_grow_start | The initial value to grow | 10000 + | to from zero in the | + | grow_halt_poll_ns() | + | function. | +-------------------------------------------------------------------------------- +halt_poll_ns_shrink | The value by which the | 0 + | halt polling interval is | + | divided in the | + | shrink_halt_poll_ns() | + | function. | -------------------------------------------------------------------------------- These module parameters can be set from the debugfs files in: diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt index e507a9e0421e..f365102c80f5 100644 --- a/Documentation/virtual/kvm/mmu.txt +++ b/Documentation/virtual/kvm/mmu.txt @@ -224,10 +224,6 @@ Shadow pages contain the following information: A bitmap indicating which sptes in spt point (directly or indirectly) at pages that may be unsynchronized. Used to quickly locate all unsychronized pages reachable from a given page. - mmu_valid_gen: - Generation number of the page. It is compared with kvm->arch.mmu_valid_gen - during hash table lookup, and used to skip invalidated shadow pages (see - "Zapping all pages" below.) clear_spte_count: Only present on 32-bit hosts, where a 64-bit spte cannot be written atomically. The reader uses this while running out of the MMU lock @@ -402,27 +398,6 @@ causes its disallow_lpage to be incremented, thus preventing instantiation of a large spte. The frames at the end of an unaligned memory slot have artificially inflated ->disallow_lpages so they can never be instantiated. -Zapping all pages (page generation count) -========================================= - -For the large memory guests, walking and zapping all pages is really slow -(because there are a lot of pages), and also blocks memory accesses of -all VCPUs because it needs to hold the MMU lock. - -To make it be more scalable, kvm maintains a global generation number -which is stored in kvm->arch.mmu_valid_gen. Every shadow page stores -the current global generation-number into sp->mmu_valid_gen when it -is created. Pages with a mismatching generation number are "obsolete". - -When KVM need zap all shadow pages sptes, it just simply increases the global -generation-number then reload root shadow pages on all vcpus. As the VCPUs -create new shadow page tables, the old pages are not used because of the -mismatching generation number. - -KVM then walks through all pages and zaps obsolete pages. While the zap -operation needs to take the MMU lock, the lock can be released periodically -so that the VCPUs can make progress. - Fast invalidation of MMIO sptes =============================== @@ -435,8 +410,7 @@ shadow pages, and is made more scalable with a similar technique. MMIO sptes have a few spare bits, which are used to store a generation number. The global generation number is stored in kvm_memslots(kvm)->generation, and increased whenever guest memory info -changes. This generation number is distinct from the one described in -the previous section. +changes. When KVM finds an MMIO spte, it checks the generation number of the spte. If the generation number of the spte does not equal the global generation @@ -452,13 +426,16 @@ stored into the MMIO spte. Thus, the MMIO spte might be created based on out-of-date information, but with an up-to-date generation number. To avoid this, the generation number is incremented again after synchronize_srcu -returns; thus, the low bit of kvm_memslots(kvm)->generation is only 1 during a +returns; thus, bit 63 of kvm_memslots(kvm)->generation set to 1 only during a memslot update, while some SRCU readers might be using the old copy. We do not want to use an MMIO sptes created with an odd generation number, and we can do -this without losing a bit in the MMIO spte. The low bit of the generation -is not stored in MMIO spte, and presumed zero when it is extracted out of the -spte. If KVM is unlucky and creates an MMIO spte while the low bit is 1, -the next access to the spte will always be a cache miss. +this without losing a bit in the MMIO spte. The "update in-progress" bit of the +generation is not stored in MMIO spte, and is so is implicitly zero when the +generation is extracted out of the spte. If KVM is unlucky and creates an MMIO +spte while an update is in-progress, the next access to the spte will always be +a cache miss. For example, a subsequent access during the update window will +miss due to the in-progress flag diverging, while an access after the update +window closes will have a higher generation number (as compared to the spte). Further reading diff --git a/Documentation/watchdog/mlx-wdt.txt b/Documentation/watchdog/mlx-wdt.txt new file mode 100644 index 000000000000..66eeb78505c3 --- /dev/null +++ b/Documentation/watchdog/mlx-wdt.txt @@ -0,0 +1,52 @@ + Mellanox watchdog drivers + for x86 based system switches + +This driver provides watchdog functionality for various Mellanox +Ethernet and Infiniband switch systems. + +Mellanox watchdog device is implemented in a programmable logic device. + +There are 2 types of HW watchdog implementations. + +Type 1: +Actual HW timeout can be defined as a power of 2 msec. +e.g. timeout 20 sec will be rounded up to 32768 msec. +The maximum timeout period is 32 sec (32768 msec.), +Get time-left isn't supported + +Type 2: +Actual HW timeout is defined in sec. and it's the same as +a user-defined timeout. +Maximum timeout is 255 sec. +Get time-left is supported. + +Type 1 HW watchdog implementation exist in old systems and +all new systems have type 2 HW watchdog. +Two types of HW implementation have also different register map. + +Mellanox system can have 2 watchdogs: main and auxiliary. +Main and auxiliary watchdog devices can be enabled together +on the same system. +There are several actions that can be defined in the watchdog: +system reset, start fans on full speed and increase register counter. +The last 2 actions are performed without a system reset. +Actions without reset are provided for auxiliary watchdog device, +which is optional. +Watchdog can be started during a probe, in this case it will be +pinged by watchdog core before watchdog device will be opened by +user space application. +Watchdog can be initialised in nowayout way, i.e. oncse started +it can't be stopped. + +This mlx-wdt driver supports both HW watchdog implementations. + +Watchdog driver is probed from the common mlx_platform driver. +Mlx_platform driver provides an appropriate set of registers for +Mellanox watchdog device, identity name (mlx-wdt-main or mlx-wdt-aux), +initial timeout, performed action in expiration and configuration flags. +watchdog configuration flags: nowayout and start_at_boot, hw watchdog +version - type1 or type2. +The driver checks during initialization if the previous system reset +was done by the watchdog. If yes, it makes a notification about this event. + +Access to HW registers is performed through a generic regmap interface. |