diff options
Diffstat (limited to 'Documentation')
55 files changed, 1165 insertions, 301 deletions
diff --git a/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff b/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff index b3f6a2ac5007..db197a879580 100644 --- a/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff +++ b/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff @@ -1,7 +1,7 @@ -What: /sys/module/hid_logitech/drivers/hid:logitech/<dev>/range. +What: /sys/bus/hid/drivers/logitech/<dev>/range Date: July 2011 KernelVersion: 3.2 -Contact: Michal Malý <madcatxster@gmail.com> +Contact: Michal Malý <madcatxster@devoid-pointer.net> Description: Display minimum, maximum and current range of the steering wheel. Writing a value within min and max boundaries sets the range of the wheel. @@ -9,7 +9,7 @@ Description: Display minimum, maximum and current range of the steering What: /sys/bus/hid/drivers/logitech/<dev>/alternate_modes Date: Feb 2015 KernelVersion: 4.1 -Contact: Michal Malý <madcatxster@gmail.com> +Contact: Michal Malý <madcatxster@devoid-pointer.net> Description: Displays a set of alternate modes supported by a wheel. Each mode is listed as follows: Tag: Mode Name @@ -45,7 +45,7 @@ Description: Displays a set of alternate modes supported by a wheel. Each What: /sys/bus/hid/drivers/logitech/<dev>/real_id Date: Feb 2015 KernelVersion: 4.1 -Contact: Michal Malý <madcatxster@gmail.com> +Contact: Michal Malý <madcatxster@devoid-pointer.net> Description: Displays the real model of the wheel regardless of any alternate mode the wheel might be switched to. It is a read-only value. diff --git a/Documentation/ABI/testing/sysfs-firmware-efi b/Documentation/ABI/testing/sysfs-firmware-efi index 05874da7ce80..e794eac32a90 100644 --- a/Documentation/ABI/testing/sysfs-firmware-efi +++ b/Documentation/ABI/testing/sysfs-firmware-efi @@ -18,3 +18,13 @@ Contact: Dave Young <dyoung@redhat.com> Description: It shows the physical address of config table entry in the EFI system table. Users: Kexec + +What: /sys/firmware/efi/systab +Date: April 2005 +Contact: linux-efi@vger.kernel.org +Description: Displays the physical addresses of all EFI Configuration + Tables found via the EFI System Table. The order in + which the tables are printed forms an ABI and newer + versions are always printed first, i.e. ACPI20 comes + before ACPI. +Users: dmidecode diff --git a/Documentation/ABI/testing/sysfs-firmware-efi-esrt b/Documentation/ABI/testing/sysfs-firmware-efi-esrt new file mode 100644 index 000000000000..6e431d1a4e79 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-firmware-efi-esrt @@ -0,0 +1,81 @@ +What: /sys/firmware/efi/esrt/ +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: Provides userland access to read the EFI System Resource Table + (ESRT), a catalog of firmware for which can be updated with + the UEFI UpdateCapsule mechanism described in section 7.5 of + the UEFI Standard. +Users: fwupdate - https://github.com/rhinstaller/fwupdate + +What: /sys/firmware/efi/esrt/fw_resource_count +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: The number of entries in the ESRT + +What: /sys/firmware/efi/esrt/fw_resource_count_max +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: The maximum number of entries that /could/ be registered + in the allocation the table is currently in. This is + really only useful to the system firmware itself. + +What: /sys/firmware/efi/esrt/fw_resource_version +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: The version of the ESRT structure provided by the firmware. + +What: /sys/firmware/efi/esrt/entries/entry$N/ +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: Each ESRT entry is identified by a GUID, and each gets a + subdirectory under entries/ . + example: /sys/firmware/efi/esrt/entries/entry0/ + +What: /sys/firmware/efi/esrt/entries/entry$N/fw_type +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: What kind of firmware entry this is: + 0 - Unknown + 1 - System Firmware + 2 - Device Firmware + 3 - UEFI Driver + +What: /sys/firmware/efi/esrt/entries/entry$N/fw_class +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: This is the entry's guid, and will match the directory name. + +What: /sys/firmware/efi/esrt/entries/entry$N/fw_version +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: The version of the firmware currently installed. This is a + 32-bit unsigned integer. + +What: /sys/firmware/efi/esrt/entries/entry$N/lowest_supported_fw_version +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: The lowest version of the firmware that can be installed. + +What: /sys/firmware/efi/esrt/entries/entry$N/capsule_flags +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: Flags that must be passed to UpdateCapsule() + +What: /sys/firmware/efi/esrt/entries/entry$N/last_attempt_version +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: The last firmware version for which an update was attempted. + +What: /sys/firmware/efi/esrt/entries/entry$N/last_attempt_status +Date: February 2015 +Contact: Peter Jones <pjones@redhat.com> +Description: The result of the last firmware update attempt for the + firmware resource entry. + 0 - Success + 1 - Insufficient resources + 2 - Incorrect version + 3 - Invalid format + 4 - Authentication error + 5 - AC power event + 6 - Battery power event + diff --git a/Documentation/DMA-API-HOWTO.txt b/Documentation/DMA-API-HOWTO.txt index 0f7afb2bb442..aef8cc5a677b 100644 --- a/Documentation/DMA-API-HOWTO.txt +++ b/Documentation/DMA-API-HOWTO.txt @@ -25,13 +25,18 @@ physical addresses. These are the addresses in /proc/iomem. The physical address is not directly useful to a driver; it must use ioremap() to map the space and produce a virtual address. -I/O devices use a third kind of address: a "bus address" or "DMA address". -If a device has registers at an MMIO address, or if it performs DMA to read -or write system memory, the addresses used by the device are bus addresses. -In some systems, bus addresses are identical to CPU physical addresses, but -in general they are not. IOMMUs and host bridges can produce arbitrary +I/O devices use a third kind of address: a "bus address". If a device has +registers at an MMIO address, or if it performs DMA to read or write system +memory, the addresses used by the device are bus addresses. In some +systems, bus addresses are identical to CPU physical addresses, but in +general they are not. IOMMUs and host bridges can produce arbitrary mappings between physical and bus addresses. +From a device's point of view, DMA uses the bus address space, but it may +be restricted to a subset of that space. For example, even if a system +supports 64-bit addresses for main memory and PCI BARs, it may use an IOMMU +so devices only need to use 32-bit DMA addresses. + Here's a picture and some examples: CPU CPU Bus @@ -72,11 +77,11 @@ can use virtual address X to access the buffer, but the device itself cannot because DMA doesn't go through the CPU virtual memory system. In some simple systems, the device can do DMA directly to physical address -Y. But in many others, there is IOMMU hardware that translates bus +Y. But in many others, there is IOMMU hardware that translates DMA addresses to physical addresses, e.g., it translates Z to Y. This is part of the reason for the DMA API: the driver can give a virtual address X to an interface like dma_map_single(), which sets up any required IOMMU -mapping and returns the bus address Z. The driver then tells the device to +mapping and returns the DMA address Z. The driver then tells the device to do DMA to Z, and the IOMMU maps it to the buffer at address Y in system RAM. @@ -98,7 +103,7 @@ First of all, you should make sure #include <linux/dma-mapping.h> is in your driver, which provides the definition of dma_addr_t. This type -can hold any valid DMA or bus address for the platform and should be used +can hold any valid DMA address for the platform and should be used everywhere you hold a DMA address returned from the DMA mapping functions. What memory is DMA'able? @@ -316,7 +321,7 @@ There are two types of DMA mappings: Think of "consistent" as "synchronous" or "coherent". The current default is to return consistent memory in the low 32 - bits of the bus space. However, for future compatibility you should + bits of the DMA space. However, for future compatibility you should set the consistent mask even if this default is fine for your driver. @@ -403,7 +408,7 @@ dma_alloc_coherent() returns two values: the virtual address which you can use to access it from the CPU and dma_handle which you pass to the card. -The CPU virtual address and the DMA bus address are both +The CPU virtual address and the DMA address are both guaranteed to be aligned to the smallest PAGE_SIZE order which is greater than or equal to the requested size. This invariant exists (for example) to guarantee that if you allocate a chunk @@ -645,8 +650,8 @@ PLEASE NOTE: The 'nents' argument to the dma_unmap_sg call must be dma_map_sg call. Every dma_map_{single,sg}() call should have its dma_unmap_{single,sg}() -counterpart, because the bus address space is a shared resource and -you could render the machine unusable by consuming all bus addresses. +counterpart, because the DMA address space is a shared resource and +you could render the machine unusable by consuming all DMA addresses. If you need to use the same streaming DMA region multiple times and touch the data in between the DMA transfers, the buffer needs to be synced diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index 52088408668a..7eba542eff7c 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt @@ -18,10 +18,10 @@ Part I - dma_ API To get the dma_ API, you must #include <linux/dma-mapping.h>. This provides dma_addr_t and the interfaces described below. -A dma_addr_t can hold any valid DMA or bus address for the platform. It -can be given to a device to use as a DMA source or target. A CPU cannot -reference a dma_addr_t directly because there may be translation between -its physical address space and the bus address space. +A dma_addr_t can hold any valid DMA address for the platform. It can be +given to a device to use as a DMA source or target. A CPU cannot reference +a dma_addr_t directly because there may be translation between its physical +address space and the DMA address space. Part Ia - Using large DMA-coherent buffers ------------------------------------------ @@ -42,7 +42,7 @@ It returns a pointer to the allocated region (in the processor's virtual address space) or NULL if the allocation failed. It also returns a <dma_handle> which may be cast to an unsigned integer the -same width as the bus and given to the device as the bus address base of +same width as the bus and given to the device as the DMA address base of the region. Note: consistent memory can be expensive on some platforms, and the @@ -193,7 +193,7 @@ dma_map_single(struct device *dev, void *cpu_addr, size_t size, enum dma_data_direction direction) Maps a piece of processor virtual memory so it can be accessed by the -device and returns the bus address of the memory. +device and returns the DMA address of the memory. The direction for both APIs may be converted freely by casting. However the dma_ API uses a strongly typed enumerator for its @@ -212,20 +212,20 @@ contiguous piece of memory. For this reason, memory to be mapped by this API should be obtained from sources which guarantee it to be physically contiguous (like kmalloc). -Further, the bus address of the memory must be within the +Further, the DMA address of the memory must be within the dma_mask of the device (the dma_mask is a bit mask of the -addressable region for the device, i.e., if the bus address of -the memory ANDed with the dma_mask is still equal to the bus +addressable region for the device, i.e., if the DMA address of +the memory ANDed with the dma_mask is still equal to the DMA address, then the device can perform DMA to the memory). To ensure that the memory allocated by kmalloc is within the dma_mask, the driver may specify various platform-dependent flags to restrict -the bus address range of the allocation (e.g., on x86, GFP_DMA -guarantees to be within the first 16MB of available bus addresses, +the DMA address range of the allocation (e.g., on x86, GFP_DMA +guarantees to be within the first 16MB of available DMA addresses, as required by ISA devices). Note also that the above constraints on physical contiguity and dma_mask may not apply if the platform has an IOMMU (a device which -maps an I/O bus address to a physical memory address). However, to be +maps an I/O DMA address to a physical memory address). However, to be portable, device driver writers may *not* assume that such an IOMMU exists. @@ -296,7 +296,7 @@ reduce current DMA mapping usage or delay and try again later). dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction direction) -Returns: the number of bus address segments mapped (this may be shorter +Returns: the number of DMA address segments mapped (this may be shorter than <nents> passed in if some elements of the scatter/gather list are physically or virtually adjacent and an IOMMU maps them with a single entry). @@ -340,7 +340,7 @@ must be the same as those and passed in to the scatter/gather mapping API. Note: <nents> must be the number you passed in, *not* the number of -bus address entries returned. +DMA address entries returned. void dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle, size_t size, @@ -507,7 +507,7 @@ it's asked for coherent memory for this device. phys_addr is the CPU physical address to which the memory is currently assigned (this will be ioremapped so the CPU can access the region). -device_addr is the bus address the device needs to be programmed +device_addr is the DMA address the device needs to be programmed with to actually address this memory (this will be handed out as the dma_addr_t in dma_alloc_coherent()). diff --git a/Documentation/DocBook/crypto-API.tmpl b/Documentation/DocBook/crypto-API.tmpl index efc8d90a9a3f..0992531ffefb 100644 --- a/Documentation/DocBook/crypto-API.tmpl +++ b/Documentation/DocBook/crypto-API.tmpl @@ -119,7 +119,7 @@ <para> Note: The terms "transformation" and cipher algorithm are used - interchangably. + interchangeably. </para> </sect1> @@ -536,8 +536,8 @@ <para> For other use cases of AEAD ciphers, the ASCII art applies as - well, but the caller may not use the GIVCIPHER interface. In - this case, the caller must generate the IV. + well, but the caller may not use the AEAD cipher with a separate + IV generator. In this case, the caller must generate the IV. </para> <para> @@ -584,8 +584,8 @@ kernel crypto API | IPSEC Layer | +-----------+ | | | (1) -| givcipher | <----------------------------------- esp_output -| (seqiv) | ---+ +| aead | <----------------------------------- esp_output +| (seqniv) | ---+ +-----------+ | | (2) +-----------+ | @@ -620,8 +620,8 @@ kernel crypto API | IPSEC Layer <orderedlist> <listitem> <para> - esp_output() invokes crypto_aead_givencrypt() to trigger an encryption - operation of the GIVCIPHER implementation. + esp_output() invokes crypto_aead_encrypt() to trigger an encryption + operation of the AEAD cipher with IV generator. </para> <para> @@ -1563,7 +1563,7 @@ struct sockaddr_alg sa = { <sect1><title>Zero-Copy Interface</title> <para> - In addition to the send/write/read/recv system call familty, the AF_ALG + In addition to the send/write/read/recv system call family, the AF_ALG interface can be accessed with the zero-copy interface of splice/vmsplice. As the name indicates, the kernel tries to avoid a copy operation into kernel space. @@ -1669,9 +1669,19 @@ read(opfd, out, outlen); </chapter> <chapter id="API"><title>Programming Interface</title> + <para> + Please note that the kernel crypto API contains the AEAD givcrypt + API (crypto_aead_giv* and aead_givcrypt_* function calls in + include/crypto/aead.h). This API is obsolete and will be removed + in the future. To obtain the functionality of an AEAD cipher with + internal IV generation, use the IV generator as a regular cipher. + For example, rfc4106(gcm(aes)) is the AEAD cipher with external + IV generation and seqniv(rfc4106(gcm(aes))) implies that the kernel + crypto API generates the IV. Different IV generators are available. + </para> <sect1><title>Block Cipher Context Data Structures</title> !Pinclude/linux/crypto.h Block Cipher Context Data Structures -!Finclude/linux/crypto.h aead_request +!Finclude/crypto/aead.h aead_request </sect1> <sect1><title>Block Cipher Algorithm Definitions</title> !Pinclude/linux/crypto.h Block Cipher Algorithm Definitions @@ -1680,7 +1690,7 @@ read(opfd, out, outlen); !Finclude/linux/crypto.h aead_alg !Finclude/linux/crypto.h blkcipher_alg !Finclude/linux/crypto.h cipher_alg -!Finclude/linux/crypto.h rng_alg +!Finclude/crypto/rng.h rng_alg </sect1> <sect1><title>Asynchronous Block Cipher API</title> !Pinclude/linux/crypto.h Asynchronous Block Cipher API @@ -1704,26 +1714,27 @@ read(opfd, out, outlen); !Finclude/linux/crypto.h ablkcipher_request_set_crypt </sect1> <sect1><title>Authenticated Encryption With Associated Data (AEAD) Cipher API</title> -!Pinclude/linux/crypto.h Authenticated Encryption With Associated Data (AEAD) Cipher API -!Finclude/linux/crypto.h crypto_alloc_aead -!Finclude/linux/crypto.h crypto_free_aead -!Finclude/linux/crypto.h crypto_aead_ivsize -!Finclude/linux/crypto.h crypto_aead_authsize -!Finclude/linux/crypto.h crypto_aead_blocksize -!Finclude/linux/crypto.h crypto_aead_setkey -!Finclude/linux/crypto.h crypto_aead_setauthsize -!Finclude/linux/crypto.h crypto_aead_encrypt -!Finclude/linux/crypto.h crypto_aead_decrypt +!Pinclude/crypto/aead.h Authenticated Encryption With Associated Data (AEAD) Cipher API +!Finclude/crypto/aead.h crypto_alloc_aead +!Finclude/crypto/aead.h crypto_free_aead +!Finclude/crypto/aead.h crypto_aead_ivsize +!Finclude/crypto/aead.h crypto_aead_authsize +!Finclude/crypto/aead.h crypto_aead_blocksize +!Finclude/crypto/aead.h crypto_aead_setkey +!Finclude/crypto/aead.h crypto_aead_setauthsize +!Finclude/crypto/aead.h crypto_aead_encrypt +!Finclude/crypto/aead.h crypto_aead_decrypt </sect1> <sect1><title>Asynchronous AEAD Request Handle</title> -!Pinclude/linux/crypto.h Asynchronous AEAD Request Handle -!Finclude/linux/crypto.h crypto_aead_reqsize -!Finclude/linux/crypto.h aead_request_set_tfm -!Finclude/linux/crypto.h aead_request_alloc -!Finclude/linux/crypto.h aead_request_free -!Finclude/linux/crypto.h aead_request_set_callback -!Finclude/linux/crypto.h aead_request_set_crypt -!Finclude/linux/crypto.h aead_request_set_assoc +!Pinclude/crypto/aead.h Asynchronous AEAD Request Handle +!Finclude/crypto/aead.h crypto_aead_reqsize +!Finclude/crypto/aead.h aead_request_set_tfm +!Finclude/crypto/aead.h aead_request_alloc +!Finclude/crypto/aead.h aead_request_free +!Finclude/crypto/aead.h aead_request_set_callback +!Finclude/crypto/aead.h aead_request_set_crypt +!Finclude/crypto/aead.h aead_request_set_assoc +!Finclude/crypto/aead.h aead_request_set_ad </sect1> <sect1><title>Synchronous Block Cipher API</title> !Pinclude/linux/crypto.h Synchronous Block Cipher API diff --git a/Documentation/RCU/arrayRCU.txt b/Documentation/RCU/arrayRCU.txt index 453ebe6953ee..f05a9afb2c39 100644 --- a/Documentation/RCU/arrayRCU.txt +++ b/Documentation/RCU/arrayRCU.txt @@ -10,7 +10,19 @@ also be used to protect arrays. Three situations are as follows: 3. Resizeable Arrays -Each of these situations are discussed below. +Each of these three situations involves an RCU-protected pointer to an +array that is separately indexed. It might be tempting to consider use +of RCU to instead protect the index into an array, however, this use +case is -not- supported. The problem with RCU-protected indexes into +arrays is that compilers can play way too many optimization games with +integers, which means that the rules governing handling of these indexes +are far more trouble than they are worth. If RCU-protected indexes into +arrays prove to be particularly valuable (which they have not thus far), +explicit cooperation from the compiler will be required to permit them +to be safely used. + +That aside, each of the three RCU-protected pointer situations are +described in the following sections. Situation 1: Hash Tables @@ -36,9 +48,9 @@ Quick Quiz: Why is it so important that updates be rare when Situation 3: Resizeable Arrays Use of RCU for resizeable arrays is demonstrated by the grow_ary() -function used by the System V IPC code. The array is used to map from -semaphore, message-queue, and shared-memory IDs to the data structure -that represents the corresponding IPC construct. The grow_ary() +function formerly used by the System V IPC code. The array is used +to map from semaphore, message-queue, and shared-memory IDs to the data +structure that represents the corresponding IPC construct. The grow_ary() function does not acquire any locks; instead its caller must hold the ids->sem semaphore. diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt index cd83d2348fef..da51d3068850 100644 --- a/Documentation/RCU/lockdep.txt +++ b/Documentation/RCU/lockdep.txt @@ -47,11 +47,6 @@ checking of rcu_dereference() primitives: Use explicit check expression "c" along with srcu_read_lock_held()(). This is useful in code that is invoked by both SRCU readers and updaters. - rcu_dereference_index_check(p, c): - Use explicit check expression "c", but the caller - must supply one of the rcu_read_lock_held() functions. - This is useful in code that uses RCU-protected arrays - that is invoked by both RCU readers and updaters. rcu_dereference_raw(p): Don't check. (Use sparingly, if at all.) rcu_dereference_protected(p, c): @@ -64,11 +59,6 @@ checking of rcu_dereference() primitives: but retain the compiler constraints that prevent duplicating or coalescsing. This is useful when when testing the value of the pointer itself, for example, against NULL. - rcu_access_index(idx): - Return the value of the index and omit all barriers, but - retain the compiler constraints that prevent duplicating - or coalescsing. This is useful when when testing the - value of the index itself, for example, against -1. The rcu_dereference_check() check expression can be any boolean expression, but would normally include a lockdep expression. However, diff --git a/Documentation/RCU/rcu_dereference.txt b/Documentation/RCU/rcu_dereference.txt index ceb05da5a5ac..1e6c0da994f5 100644 --- a/Documentation/RCU/rcu_dereference.txt +++ b/Documentation/RCU/rcu_dereference.txt @@ -25,17 +25,6 @@ o You must use one of the rcu_dereference() family of primitives for an example where the compiler can in fact deduce the exact value of the pointer, and thus cause misordering. -o Do not use single-element RCU-protected arrays. The compiler - is within its right to assume that the value of an index into - such an array must necessarily evaluate to zero. The compiler - could then substitute the constant zero for the computation, so - that the array index no longer depended on the value returned - by rcu_dereference(). If the array index no longer depends - on rcu_dereference(), then both the compiler and the CPU - are within their rights to order the array access before the - rcu_dereference(), which can cause the array access to return - garbage. - o Avoid cancellation when using the "+" and "-" infix arithmetic operators. For example, for a given variable "x", avoid "(x-x)". There are similar arithmetic pitfalls from other @@ -76,14 +65,15 @@ o Do not use the results from the boolean "&&" and "||" when dereferencing. For example, the following (rather improbable) code is buggy: - int a[2]; - int index; - int force_zero_index = 1; + int *p; + int *q; ... - r1 = rcu_dereference(i1) - r2 = a[r1 && force_zero_index]; /* BUGGY!!! */ + p = rcu_dereference(gp) + q = &global_q; + q += p != &oom_p1 && p != &oom_p2; + r1 = *q; /* BUGGY!!! */ The reason this is buggy is that "&&" and "||" are often compiled using branches. While weak-memory machines such as ARM or PowerPC @@ -94,14 +84,15 @@ o Do not use the results from relational operators ("==", "!=", ">", ">=", "<", or "<=") when dereferencing. For example, the following (quite strange) code is buggy: - int a[2]; - int index; - int flip_index = 0; + int *p; + int *q; ... - r1 = rcu_dereference(i1) - r2 = a[r1 != flip_index]; /* BUGGY!!! */ + p = rcu_dereference(gp) + q = &global_q; + q += p > &oom_p; + r1 = *q; /* BUGGY!!! */ As before, the reason this is buggy is that relational operators are often compiled using branches. And as before, although @@ -193,6 +184,11 @@ o Be very careful about comparing pointers obtained from pointer. Note that the volatile cast in rcu_dereference() will normally prevent the compiler from knowing too much. + However, please note that if the compiler knows that the + pointer takes on only one of two values, a not-equal + comparison will provide exactly the information that the + compiler needs to deduce the value of the pointer. + o Disable any value-speculation optimizations that your compiler might provide, especially if you are making use of feedback-based optimizations that take data collected from prior runs. Such diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index 88dfce182f66..5746b0c77f3e 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -256,7 +256,9 @@ rcu_dereference() If you are going to be fetching multiple fields from the RCU-protected structure, using the local variable is of course preferred. Repeated rcu_dereference() calls look - ugly and incur unnecessary overhead on Alpha CPUs. + ugly, do not guarantee that the same pointer will be returned + if an update happened while in the critical section, and incur + unnecessary overhead on Alpha CPUs. Note that the value returned by rcu_dereference() is valid only within the enclosing RCU read-side critical section. @@ -879,9 +881,7 @@ SRCU: Initialization/cleanup All: lockdep-checked RCU-protected pointer access - rcu_access_index rcu_access_pointer - rcu_dereference_index_check rcu_dereference_raw rcu_lockdep_assert rcu_sleep_check diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt index 0aad6deb2d96..12b1b25b4da9 100644 --- a/Documentation/cputopology.txt +++ b/Documentation/cputopology.txt @@ -1,6 +1,6 @@ Export CPU topology info via sysfs. Items (attributes) are similar -to /proc/cpuinfo. +to /proc/cpuinfo output of some architectures: 1) /sys/devices/system/cpu/cpuX/topology/physical_package_id: @@ -23,20 +23,35 @@ to /proc/cpuinfo. 4) /sys/devices/system/cpu/cpuX/topology/thread_siblings: internal kernel map of cpuX's hardware threads within the same - core as cpuX + core as cpuX. -5) /sys/devices/system/cpu/cpuX/topology/core_siblings: +5) /sys/devices/system/cpu/cpuX/topology/thread_siblings_list: + + human-readable list of cpuX's hardware threads within the same + core as cpuX. + +6) /sys/devices/system/cpu/cpuX/topology/core_siblings: internal kernel map of cpuX's hardware threads within the same physical_package_id. -6) /sys/devices/system/cpu/cpuX/topology/book_siblings: +7) /sys/devices/system/cpu/cpuX/topology/core_siblings_list: + + human-readable list of cpuX's hardware threads within the same + physical_package_id. + +8) /sys/devices/system/cpu/cpuX/topology/book_siblings: internal kernel map of cpuX's hardware threads within the same book_id. +9) /sys/devices/system/cpu/cpuX/topology/book_siblings_list: + + human-readable list of cpuX's hardware threads within the same + book_id. + To implement it in an architecture-neutral way, a new source file, -drivers/base/topology.c, is to export the 4 or 6 attributes. The two book +drivers/base/topology.c, is to export the 6 or 9 attributes. The three book related sysfs files will only be created if CONFIG_SCHED_BOOK is selected. For an architecture to support this feature, it must define some of @@ -44,20 +59,22 @@ these macros in include/asm-XXX/topology.h: #define topology_physical_package_id(cpu) #define topology_core_id(cpu) #define topology_book_id(cpu) -#define topology_thread_cpumask(cpu) +#define topology_sibling_cpumask(cpu) #define topology_core_cpumask(cpu) #define topology_book_cpumask(cpu) -The type of **_id is int. -The type of siblings is (const) struct cpumask *. +The type of **_id macros is int. +The type of **_cpumask macros is (const) struct cpumask *. The latter +correspond with appropriate **_siblings sysfs attributes (except for +topology_sibling_cpumask() which corresponds with thread_siblings). To be consistent on all architectures, include/linux/topology.h provides default definitions for any of the above macros that are not defined by include/asm-XXX/topology.h: 1) physical_package_id: -1 2) core_id: 0 -3) thread_siblings: just the given CPU -4) core_siblings: just the given CPU +3) sibling_cpumask: just the given CPU +4) core_cpumask: just the given CPU For architectures that don't support books (CONFIG_SCHED_BOOK) there are no default definitions for topology_book_id() and topology_book_cpumask(). diff --git a/Documentation/devicetree/bindings/arm/armv7m_systick.txt b/Documentation/devicetree/bindings/arm/armv7m_systick.txt new file mode 100644 index 000000000000..7cf4a24601eb --- /dev/null +++ b/Documentation/devicetree/bindings/arm/armv7m_systick.txt @@ -0,0 +1,26 @@ +* ARMv7M System Timer + +ARMv7-M includes a system timer, known as SysTick. Current driver only +implements the clocksource feature. + +Required properties: +- compatible : Should be "arm,armv7m-systick" +- reg : The address range of the timer + +Required clocking property, have to be one of: +- clocks : The input clock of the timer +- clock-frequency : The rate in HZ in input of the ARM SysTick + +Examples: + +systick: timer@e000e010 { + compatible = "arm,armv7m-systick"; + reg = <0xe000e010 0x10>; + clocks = <&clk_systick>; +}; + +systick: timer@e000e010 { + compatible = "arm,armv7m-systick"; + reg = <0xe000e010 0x10>; + clock-frequency = <90000000>; +}; diff --git a/Documentation/devicetree/bindings/clock/at91-clock.txt b/Documentation/devicetree/bindings/clock/at91-clock.txt index 7a4d4926f44e..5ba6450693b9 100644 --- a/Documentation/devicetree/bindings/clock/at91-clock.txt +++ b/Documentation/devicetree/bindings/clock/at91-clock.txt @@ -248,7 +248,7 @@ Required properties for peripheral clocks: - #address-cells : shall be 1 (reg is used to encode clk id). - clocks : shall be the master clock phandle. e.g. clocks = <&mck>; -- name: device tree node describing a specific system clock. +- name: device tree node describing a specific peripheral clock. * #clock-cells : from common clock binding; shall be set to 0. * reg: peripheral id. See Atmel's datasheets to get a full list of peripheral ids. diff --git a/Documentation/devicetree/bindings/crypto/fsl-sec2.txt b/Documentation/devicetree/bindings/crypto/fsl-sec2.txt index 38988ef1336b..f0d926bf9f36 100644 --- a/Documentation/devicetree/bindings/crypto/fsl-sec2.txt +++ b/Documentation/devicetree/bindings/crypto/fsl-sec2.txt @@ -1,9 +1,11 @@ -Freescale SoC SEC Security Engines versions 2.x-3.x +Freescale SoC SEC Security Engines versions 1.x-2.x-3.x Required properties: - compatible : Should contain entries for this and backward compatible - SEC versions, high to low, e.g., "fsl,sec2.1", "fsl,sec2.0" + SEC versions, high to low, e.g., "fsl,sec2.1", "fsl,sec2.0" (SEC2/3) + e.g., "fsl,sec1.2", "fsl,sec1.0" (SEC1) + warning: SEC1 and SEC2 are mutually exclusive - reg : Offset and length of the register set for the device - interrupts : the SEC's interrupt number - fsl,num-channels : An integer representing the number of channels diff --git a/Documentation/devicetree/bindings/crypto/marvell-cesa.txt b/Documentation/devicetree/bindings/crypto/marvell-cesa.txt new file mode 100644 index 000000000000..c6c6a4a045bd --- /dev/null +++ b/Documentation/devicetree/bindings/crypto/marvell-cesa.txt @@ -0,0 +1,45 @@ +Marvell Cryptographic Engines And Security Accelerator + +Required properties: +- compatible: should be one of the following string + "marvell,orion-crypto" + "marvell,kirkwood-crypto" + "marvell,dove-crypto" + "marvell,armada-370-crypto" + "marvell,armada-xp-crypto" + "marvell,armada-375-crypto" + "marvell,armada-38x-crypto" +- reg: base physical address of the engine and length of memory mapped + region. Can also contain an entry for the SRAM attached to the CESA, + but this representation is deprecated and marvell,crypto-srams should + be used instead +- reg-names: "regs". Can contain an "sram" entry, but this representation + is deprecated and marvell,crypto-srams should be used instead +- interrupts: interrupt number +- clocks: reference to the crypto engines clocks. This property is not + required for orion and kirkwood platforms +- clock-names: "cesaX" and "cesazX", X should be replaced by the crypto engine + id. + This property is not required for the orion and kirkwoord + platforms. + "cesazX" clocks are not required on armada-370 platforms +- marvell,crypto-srams: phandle to crypto SRAM definitions + +Optional properties: +- marvell,crypto-sram-size: SRAM size reserved for crypto operations, if not + specified the whole SRAM is used (2KB) + + +Examples: + + crypto@90000 { + compatible = "marvell,armada-xp-crypto"; + reg = <0x90000 0x10000>; + reg-names = "regs"; + interrupts = <48>, <49>; + clocks = <&gateclk 23>, <&gateclk 23>; + clock-names = "cesa0", "cesa1"; + marvell,crypto-srams = <&crypto_sram0>, <&crypto_sram1>; + marvell,crypto-sram-size = <0x600>; + status = "okay"; + }; diff --git a/Documentation/devicetree/bindings/crypto/mv_cesa.txt b/Documentation/devicetree/bindings/crypto/mv_cesa.txt index 47229b1a594b..c0c35f00335b 100644 --- a/Documentation/devicetree/bindings/crypto/mv_cesa.txt +++ b/Documentation/devicetree/bindings/crypto/mv_cesa.txt @@ -1,20 +1,33 @@ Marvell Cryptographic Engines And Security Accelerator Required properties: -- compatible : should be "marvell,orion-crypto" -- reg : base physical address of the engine and length of memory mapped - region, followed by base physical address of sram and its memory - length -- reg-names : "regs" , "sram"; -- interrupts : interrupt number +- compatible: should be one of the following string + "marvell,orion-crypto" + "marvell,kirkwood-crypto" + "marvell,dove-crypto" +- reg: base physical address of the engine and length of memory mapped + region. Can also contain an entry for the SRAM attached to the CESA, + but this representation is deprecated and marvell,crypto-srams should + be used instead +- reg-names: "regs". Can contain an "sram" entry, but this representation + is deprecated and marvell,crypto-srams should be used instead +- interrupts: interrupt number +- clocks: reference to the crypto engines clocks. This property is only + required for Dove platforms +- marvell,crypto-srams: phandle to crypto SRAM definitions + +Optional properties: +- marvell,crypto-sram-size: SRAM size reserved for crypto operations, if not + specified the whole SRAM is used (2KB) Examples: crypto@30000 { compatible = "marvell,orion-crypto"; - reg = <0x30000 0x10000>, - <0x4000000 0x800>; - reg-names = "regs" , "sram"; + reg = <0x30000 0x10000>; + reg-names = "regs"; interrupts = <22>; + marvell,crypto-srams = <&crypto_sram>; + marvell,crypto-sram-size = <0x600>; status = "okay"; }; diff --git a/Documentation/devicetree/bindings/gpio/brcm,brcmstb-gpio.txt b/Documentation/devicetree/bindings/gpio/brcm,brcmstb-gpio.txt new file mode 100644 index 000000000000..435f1bcca341 --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/brcm,brcmstb-gpio.txt @@ -0,0 +1,65 @@ +Broadcom STB "UPG GIO" GPIO controller + +The controller's registers are organized as sets of eight 32-bit +registers with each set controlling a bank of up to 32 pins. A single +interrupt is shared for all of the banks handled by the controller. + +Required properties: + +- compatible: + Must be "brcm,brcmstb-gpio" + +- reg: + Define the base and range of the I/O address space containing + the brcmstb GPIO controller registers + +- #gpio-cells: + Should be <2>. The first cell is the pin number (within the controller's + pin space), and the second is used for the following: + bit[0]: polarity (0 for active-high, 1 for active-low) + +- gpio-controller: + Specifies that the node is a GPIO controller. + +- brcm,gpio-bank-widths: + Number of GPIO lines for each bank. Number of elements must + correspond to number of banks suggested by the 'reg' property. + +Optional properties: + +- interrupts: + The interrupt shared by all GPIO lines for this controller. + +- interrupt-parent: + phandle of the parent interrupt controller + +- #interrupt-cells: + Should be <2>. The first cell is the GPIO number, the second should specify + flags. The following subset of flags is supported: + - bits[3:0] trigger type and level flags + 1 = low-to-high edge triggered + 2 = high-to-low edge triggered + 4 = active high level-sensitive + 8 = active low level-sensitive + Valid combinations are 1, 2, 3, 4, 8. + See also Documentation/devicetree/bindings/interrupt-controller/interrupts.txt + +- interrupt-controller: + Marks the device node as an interrupt controller + +- interrupt-names: + The name of the IRQ resource used by this controller + +Example: + upg_gio: gpio@f040a700 { + #gpio-cells = <0x2>; + #interrupt-cells = <0x2>; + compatible = "brcm,bcm7445-gpio", "brcm,brcmstb-gpio"; + gpio-controller; + interrupt-controller; + reg = <0xf040a700 0x80>; + interrupt-parent = <0xf>; + interrupts = <0x6>; + interrupt-names = "upg_gio"; + brcm,gpio-bank-widths = <0x20 0x20 0x20 0x18>; + }; diff --git a/Documentation/devicetree/bindings/gpio/gpio-etraxfs.txt b/Documentation/devicetree/bindings/gpio/gpio-etraxfs.txt new file mode 100644 index 000000000000..abf4db736c6e --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/gpio-etraxfs.txt @@ -0,0 +1,21 @@ +Axis ETRAX FS General I/O controller bindings + +Required properties: + +- compatible: + - "axis,etraxfs-gio" +- reg: Physical base address and length of the controller's registers. +- #gpio-cells: Should be 3 + - The first cell is the gpio offset number. + - The second cell is reserved and is currently unused. + - The third cell is the port number (hex). +- gpio-controller: Marks the device node as a GPIO controller. + +Example: + + gio: gpio@b001a000 { + compatible = "axis,etraxfs-gio"; + reg = <0xb001a000 0x1000>; + gpio-controller; + #gpio-cells = <3>; + }; diff --git a/Documentation/devicetree/bindings/gpio/gpio-xlp.txt b/Documentation/devicetree/bindings/gpio/gpio-xlp.txt new file mode 100644 index 000000000000..262ee4ddf2cb --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/gpio-xlp.txt @@ -0,0 +1,47 @@ +Netlogic XLP Family GPIO +======================== + +This GPIO driver is used for following Netlogic XLP SoCs: + XLP832, XLP316, XLP208, XLP980, XLP532 + +Required properties: +------------------- + +- compatible: Should be one of the following: + - "netlogic,xlp832-gpio": For Netlogic XLP832 + - "netlogic,xlp316-gpio": For Netlogic XLP316 + - "netlogic,xlp208-gpio": For Netlogic XLP208 + - "netlogic,xlp980-gpio": For Netlogic XLP980 + - "netlogic,xlp532-gpio": For Netlogic XLP532 +- reg: Physical base address and length of the controller's registers. +- #gpio-cells: Should be two. The first cell is the pin number and the second + cell is used to specify optional parameters (currently unused). +- gpio-controller: Marks the device node as a GPIO controller. +- nr-gpios: Number of GPIO pins supported by the controller. +- interrupt-cells: Should be two. The first cell is the GPIO Number. The + second cell is used to specify flags. The following subset of flags is + supported: + - trigger type: + 1 = low to high edge triggered. + 2 = high to low edge triggered. + 4 = active high level-sensitive. + 8 = active low level-sensitive. +- interrupts: Interrupt number for this device. +- interrupt-parent: phandle of the parent interrupt controller. +- interrupt-controller: Identifies the node as an interrupt controller. + +Example: + + gpio: xlp_gpio@34000 { + compatible = "netlogic,xlp316-gpio"; + reg = <0 0x34100 0x1000 + 0 0x35100 0x1000>; + #gpio-cells = <2>; + gpio-controller; + nr-gpios = <57>; + + #interrupt-cells = <2>; + interrupt-parent = <&pic>; + interrupts = <39>; + interrupt-controller; + }; diff --git a/Documentation/devicetree/bindings/gpio/gpio-zynq.txt b/Documentation/devicetree/bindings/gpio/gpio-zynq.txt index 986371a4be2c..db4c6a663c03 100644 --- a/Documentation/devicetree/bindings/gpio/gpio-zynq.txt +++ b/Documentation/devicetree/bindings/gpio/gpio-zynq.txt @@ -6,7 +6,7 @@ Required properties: - First cell is the GPIO line number - Second cell is used to specify optional parameters (unused) -- compatible : Should be "xlnx,zynq-gpio-1.0" +- compatible : Should be "xlnx,zynq-gpio-1.0" or "xlnx,zynqmp-gpio-1.0" - clocks : Clock specifier (see clock bindings for details) - gpio-controller : Marks the device node as a GPIO controller. - interrupts : Interrupt specifier (see interrupt bindings for diff --git a/Documentation/devicetree/bindings/gpio/nxp,lpc1850-gpio.txt b/Documentation/devicetree/bindings/gpio/nxp,lpc1850-gpio.txt new file mode 100644 index 000000000000..eb7cdd69e10b --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/nxp,lpc1850-gpio.txt @@ -0,0 +1,39 @@ +NXP LPC18xx/43xx GPIO controller Device Tree Bindings +----------------------------------------------------- + +Required properties: +- compatible : Should be "nxp,lpc1850-gpio" +- reg : Address and length of the register set for the device +- clocks : Clock specifier (see clock bindings for details) +- gpio-controller : Marks the device node as a GPIO controller. +- #gpio-cells : Should be two + - First cell is the GPIO line number + - Second cell is used to specify polarity + +Optional properties: +- gpio-ranges : Mapping between GPIO and pinctrl + +Example: +#define LPC_GPIO(port, pin) (port * 32 + pin) +#define LPC_PIN(port, pin) (0x##port * 32 + pin) + +gpio: gpio@400f4000 { + compatible = "nxp,lpc1850-gpio"; + reg = <0x400f4000 0x4000>; + clocks = <&ccu1 CLK_CPU_GPIO>; + gpio-controller; + #gpio-cells = <2>; + gpio-ranges = <&pinctrl LPC_GPIO(0,0) LPC_PIN(0,0) 2>, + ... + <&pinctrl LPC_GPIO(7,19) LPC_PIN(f,5) 7>; +}; + +gpio_joystick { + compatible = "gpio-keys-polled"; + ... + + button@0 { + ... + gpios = <&gpio LPC_GPIO(4,8) GPIO_ACTIVE_LOW>; + }; +}; diff --git a/Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt b/Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt index fcca8e744f41..a04a80f9cc70 100644 --- a/Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt +++ b/Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt @@ -9,6 +9,7 @@ Requires node properties: "murata,ncp21wb473" "murata,ncp03wb473" "murata,ncp15wl333" + "murata,ncp03wf104" /* Usage of vendor name "ntc" is deprecated */ <DEPRECATED> "ntc,ncp15wb473" diff --git a/Documentation/devicetree/bindings/interrupt-controller/atmel,aic.txt b/Documentation/devicetree/bindings/interrupt-controller/atmel,aic.txt index f292917fa00d..0e9f09a6a2fe 100644 --- a/Documentation/devicetree/bindings/interrupt-controller/atmel,aic.txt +++ b/Documentation/devicetree/bindings/interrupt-controller/atmel,aic.txt @@ -2,7 +2,7 @@ Required properties: - compatible: Should be "atmel,<chip>-aic" - <chip> can be "at91rm9200", "sama5d3" or "sama5d4" + <chip> can be "at91rm9200", "sama5d2", "sama5d3" or "sama5d4" - interrupt-controller: Identifies the node as an interrupt controller. - interrupt-parent: For single AIC system, it is an empty property. - #interrupt-cells: The number of cells to define the interrupts. It should be 3. diff --git a/Documentation/devicetree/bindings/interrupt-controller/renesas,intc-irqpin.txt b/Documentation/devicetree/bindings/interrupt-controller/renesas,intc-irqpin.txt index 4f7946ae8adc..772c550d3b4b 100644 --- a/Documentation/devicetree/bindings/interrupt-controller/renesas,intc-irqpin.txt +++ b/Documentation/devicetree/bindings/interrupt-controller/renesas,intc-irqpin.txt @@ -13,9 +13,12 @@ Required properties: - reg: Base address and length of each register bank used by the external IRQ pins driven by the interrupt controller hardware module. The base addresses, length and number of required register banks varies with soctype. - +- interrupt-controller: Identifies the node as an interrupt controller. - #interrupt-cells: has to be <2>: an interrupt index and flags, as defined in - interrupts.txt in this directory + interrupts.txt in this directory. +- interrupts: Must contain a list of interrupt specifiers. For each interrupt + provided by this irqpin controller instance, there must be one entry, + referring to the corresponding parent interrupt. Optional properties: @@ -25,3 +28,35 @@ Optional properties: if different from the default 4 bits - control-parent: disable and enable interrupts on the parent interrupt controller, needed for some broken implementations +- clocks: Must contain a reference to the functional clock. This property is + mandatory if the hardware implements a controllable functional clock for + the irqpin controller instance. +- power-domains: Must contain a reference to the power domain. This property is + mandatory if the irqpin controller instance is part of a controllable power + domain. + + +Example +------- + + irqpin1: interrupt-controller@e6900004 { + compatible = "renesas,intc-irqpin-r8a7740", + "renesas,intc-irqpin"; + #interrupt-cells = <2>; + interrupt-controller; + reg = <0xe6900004 4>, + <0xe6900014 4>, + <0xe6900024 1>, + <0xe6900044 1>, + <0xe6900064 1>; + interrupts = <0 149 IRQ_TYPE_LEVEL_HIGH + 0 149 IRQ_TYPE_LEVEL_HIGH + 0 149 IRQ_TYPE_LEVEL_HIGH + 0 149 IRQ_TYPE_LEVEL_HIGH + 0 149 IRQ_TYPE_LEVEL_HIGH + 0 149 IRQ_TYPE_LEVEL_HIGH + 0 149 IRQ_TYPE_LEVEL_HIGH + 0 149 IRQ_TYPE_LEVEL_HIGH>; + clocks = <&mstp2_clks R8A7740_CLK_INTCA>; + power-domains = <&pd_a4s>; + }; diff --git a/Documentation/devicetree/bindings/mmc/arasan,sdhci.txt b/Documentation/devicetree/bindings/mmc/arasan,sdhci.txt index 98ee2abbe138..7e9490313d5a 100644 --- a/Documentation/devicetree/bindings/mmc/arasan,sdhci.txt +++ b/Documentation/devicetree/bindings/mmc/arasan,sdhci.txt @@ -8,7 +8,8 @@ Device Tree Bindings for the Arasan SDHCI Controller [3] Documentation/devicetree/bindings/interrupt-controller/interrupts.txt Required Properties: - - compatible: Compatibility string. Must be 'arasan,sdhci-8.9a' + - compatible: Compatibility string. Must be 'arasan,sdhci-8.9a' or + 'arasan,sdhci-4.9a' - reg: From mmc bindings: Register location and length. - clocks: From clock bindings: Handles to clock inputs. - clock-names: From clock bindings: Tuple including "clk_xin" and "clk_ahb" diff --git a/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt b/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt index 415c5575cbf7..5d0376b8f202 100644 --- a/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt +++ b/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt @@ -7,7 +7,14 @@ This file documents differences between the core properties described by mmc.txt and the properties used by the sdhci-esdhc-imx driver. Required properties: -- compatible : Should be "fsl,<chip>-esdhc" +- compatible : Should be "fsl,<chip>-esdhc", the supported chips include + "fsl,imx25-esdhc" + "fsl,imx35-esdhc" + "fsl,imx51-esdhc" + "fsl,imx53-esdhc" + "fsl,imx6q-usdhc" + "fsl,imx6sl-usdhc" + "fsl,imx6sx-usdhc" Optional properties: - fsl,cd-controller : Indicate to use controller internal card detection diff --git a/Documentation/devicetree/bindings/mmc/k3-dw-mshc.txt b/Documentation/devicetree/bindings/mmc/k3-dw-mshc.txt index 3b3544931437..df370585cbcc 100644 --- a/Documentation/devicetree/bindings/mmc/k3-dw-mshc.txt +++ b/Documentation/devicetree/bindings/mmc/k3-dw-mshc.txt @@ -13,6 +13,10 @@ Required Properties: * compatible: should be one of the following. - "hisilicon,hi4511-dw-mshc": for controllers with hi4511 specific extensions. + - "hisilicon,hi6220-dw-mshc": for controllers with hi6220 specific extensions. + +Optional Properties: +- hisilicon,peripheral-syscon: phandle of syscon used to control peripheral. Example: @@ -42,3 +46,27 @@ Example: cap-mmc-highspeed; cap-sd-highspeed; }; + + /* for Hi6220 */ + + dwmmc_1: dwmmc1@f723e000 { + compatible = "hisilicon,hi6220-dw-mshc"; + num-slots = <0x1>; + bus-width = <0x4>; + disable-wp; + cap-sd-highspeed; + sd-uhs-sdr12; + sd-uhs-sdr25; + card-detect-delay = <200>; + hisilicon,peripheral-syscon = <&ao_ctrl>; + reg = <0x0 0xf723e000 0x0 0x1000>; + interrupts = <0x0 0x49 0x4>; + clocks = <&clock_sys HI6220_MMC1_CIUCLK>, <&clock_sys HI6220_MMC1_CLK>; + clock-names = "ciu", "biu"; + cd-gpios = <&gpio1 0 1>; + pinctrl-names = "default", "idle"; + pinctrl-0 = <&sd_pmx_func &sd_clk_cfg_func &sd_cfg_func>; + pinctrl-1 = <&sd_pmx_idle &sd_clk_cfg_idle &sd_cfg_idle>; + vqmmc-supply = <&ldo7>; + vmmc-supply = <&ldo10>; + }; diff --git a/Documentation/devicetree/bindings/mmc/mmc-pwrseq-simple.txt b/Documentation/devicetree/bindings/mmc/mmc-pwrseq-simple.txt index a462c50f19a8..ce0e76749671 100644 --- a/Documentation/devicetree/bindings/mmc/mmc-pwrseq-simple.txt +++ b/Documentation/devicetree/bindings/mmc/mmc-pwrseq-simple.txt @@ -21,5 +21,7 @@ Example: sdhci0_pwrseq { compatible = "mmc-pwrseq-simple"; - reset-gpios = <&gpio1 12 0>; + reset-gpios = <&gpio1 12 GPIO_ACTIVE_LOW>; + clocks = <&clk_32768_ck>; + clock-names = "ext_clock"; } diff --git a/Documentation/devicetree/bindings/mmc/mmc.txt b/Documentation/devicetree/bindings/mmc/mmc.txt index 438899e8829b..0384fc3f64e8 100644 --- a/Documentation/devicetree/bindings/mmc/mmc.txt +++ b/Documentation/devicetree/bindings/mmc/mmc.txt @@ -21,6 +21,11 @@ Optional properties: below for the case, when a GPIO is used for the CD line - wp-inverted: when present, polarity on the WP line is inverted. See the note below for the case, when a GPIO is used for the WP line +- disable-wp: When set no physical WP line is present. This property should + only be specified when the controller has a dedicated write-protect + detection logic. If a GPIO is always used for the write-protect detection + logic it is sufficient to not specify wp-gpios property in the absence of a WP + line. - max-frequency: maximum operating clock frequency - no-1-8-v: when present, denotes that 1.8v card voltage is not supported on this system, even if the controller claims it is. diff --git a/Documentation/devicetree/bindings/mmc/mtk-sd.txt b/Documentation/devicetree/bindings/mmc/mtk-sd.txt new file mode 100644 index 000000000000..a1adfa495ad3 --- /dev/null +++ b/Documentation/devicetree/bindings/mmc/mtk-sd.txt @@ -0,0 +1,32 @@ +* MTK MMC controller + +The MTK MSDC can act as a MMC controller +to support MMC, SD, and SDIO types of memory cards. + +This file documents differences between the core properties in mmc.txt +and the properties used by the msdc driver. + +Required properties: +- compatible: Should be "mediatek,mt8173-mmc","mediatek,mt8135-mmc" +- interrupts: Should contain MSDC interrupt number +- clocks: MSDC source clock, HCLK +- clock-names: "source", "hclk" +- pinctrl-names: should be "default", "state_uhs" +- pinctrl-0: should contain default/high speed pin ctrl +- pinctrl-1: should contain uhs mode pin ctrl +- vmmc-supply: power to the Core +- vqmmc-supply: power to the IO + +Examples: +mmc0: mmc@11230000 { + compatible = "mediatek,mt8173-mmc", "mediatek,mt8135-mmc"; + reg = <0 0x11230000 0 0x108>; + interrupts = <GIC_SPI 39 IRQ_TYPE_LEVEL_LOW>; + vmmc-supply = <&mt6397_vemc_3v3_reg>; + vqmmc-supply = <&mt6397_vio18_reg>; + clocks = <&pericfg CLK_PERI_MSDC30_0>, <&topckgen CLK_TOP_MSDC50_0_H_SEL>; + clock-names = "source", "hclk"; + pinctrl-names = "default", "state_uhs"; + pinctrl-0 = <&mmc0_pins_default>; + pinctrl-1 = <&mmc0_pins_uhs>; +}; diff --git a/Documentation/devicetree/bindings/mmc/renesas,mmcif.txt b/Documentation/devicetree/bindings/mmc/renesas,mmcif.txt index 299081f94abd..d38942f6c5ae 100644 --- a/Documentation/devicetree/bindings/mmc/renesas,mmcif.txt +++ b/Documentation/devicetree/bindings/mmc/renesas,mmcif.txt @@ -18,6 +18,8 @@ Required properties: dma-names property. - dma-names: must contain "tx" for the transmit DMA channel and "rx" for the receive DMA channel. +- max-frequency: Maximum operating clock frequency, driver uses default clock + frequency if it is not set. Example: R8A7790 (R-Car H2) MMCIF0 @@ -29,4 +31,5 @@ Example: R8A7790 (R-Car H2) MMCIF0 clocks = <&mstp3_clks R8A7790_CLK_MMCIF0>; dmas = <&dmac0 0xd1>, <&dmac0 0xd2>; dma-names = "tx", "rx"; + max-frequency = <97500000>; }; diff --git a/Documentation/devicetree/bindings/pci/xgene-pci-msi.txt b/Documentation/devicetree/bindings/pci/xgene-pci-msi.txt new file mode 100644 index 000000000000..36d881c8e6d4 --- /dev/null +++ b/Documentation/devicetree/bindings/pci/xgene-pci-msi.txt @@ -0,0 +1,68 @@ +* AppliedMicro X-Gene v1 PCIe MSI controller + +Required properties: + +- compatible: should be "apm,xgene1-msi" to identify + X-Gene v1 PCIe MSI controller block. +- msi-controller: indicates that this is X-Gene v1 PCIe MSI controller node +- reg: physical base address (0x79000000) and length (0x900000) for controller + registers. These registers include the MSI termination address and data + registers as well as the MSI interrupt status registers. +- reg-names: not required +- interrupts: A list of 16 interrupt outputs of the controller, starting from + interrupt number 0x10 to 0x1f. +- interrupt-names: not required + +Each PCIe node needs to have property msi-parent that points to msi controller node + +Examples: + +SoC DTSI: + + + MSI node: + msi@79000000 { + compatible = "apm,xgene1-msi"; + msi-controller; + reg = <0x00 0x79000000 0x0 0x900000>; + interrupts = <0x0 0x10 0x4> + <0x0 0x11 0x4> + <0x0 0x12 0x4> + <0x0 0x13 0x4> + <0x0 0x14 0x4> + <0x0 0x15 0x4> + <0x0 0x16 0x4> + <0x0 0x17 0x4> + <0x0 0x18 0x4> + <0x0 0x19 0x4> + <0x0 0x1a 0x4> + <0x0 0x1b 0x4> + <0x0 0x1c 0x4> + <0x0 0x1d 0x4> + <0x0 0x1e 0x4> + <0x0 0x1f 0x4>; + }; + + + PCIe controller node with msi-parent property pointing to MSI node: + pcie0: pcie@1f2b0000 { + status = "disabled"; + device_type = "pci"; + compatible = "apm,xgene-storm-pcie", "apm,xgene-pcie"; + #interrupt-cells = <1>; + #size-cells = <2>; + #address-cells = <3>; + reg = < 0x00 0x1f2b0000 0x0 0x00010000 /* Controller registers */ + 0xe0 0xd0000000 0x0 0x00040000>; /* PCI config space */ + reg-names = "csr", "cfg"; + ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000 /* io */ + 0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>; /* mem */ + dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000 + 0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>; + interrupt-map-mask = <0x0 0x0 0x0 0x7>; + interrupt-map = <0x0 0x0 0x0 0x1 &gic 0x0 0xc2 0x1 + 0x0 0x0 0x0 0x2 &gic 0x0 0xc3 0x1 + 0x0 0x0 0x0 0x3 &gic 0x0 0xc4 0x1 + 0x0 0x0 0x0 0x4 &gic 0x0 0xc5 0x1>; + dma-coherent; + clocks = <&pcie0clk 0>; + msi-parent= <&msi>; + }; diff --git a/Documentation/devicetree/bindings/timer/nxp,lpc3220-timer.txt b/Documentation/devicetree/bindings/timer/nxp,lpc3220-timer.txt new file mode 100644 index 000000000000..51b05a0e70d1 --- /dev/null +++ b/Documentation/devicetree/bindings/timer/nxp,lpc3220-timer.txt @@ -0,0 +1,26 @@ +* NXP LPC3220 timer + +The NXP LPC3220 timer is used on a wide range of NXP SoCs. This +includes LPC32xx, LPC178x, LPC18xx and LPC43xx parts. + +Required properties: +- compatible: + Should be "nxp,lpc3220-timer". +- reg: + Address and length of the register set. +- interrupts: + Reference to the timer interrupt +- clocks: + Should contain a reference to timer clock. +- clock-names: + Should contain "timerclk". + +Example: + +timer1: timer@40085000 { + compatible = "nxp,lpc3220-timer"; + reg = <0x40085000 0x1000>; + interrupts = <13>; + clocks = <&ccu1 CLK_CPU_TIMER1>; + clock-names = "timerclk"; +}; diff --git a/Documentation/devicetree/bindings/timer/st,stm32-timer.txt b/Documentation/devicetree/bindings/timer/st,stm32-timer.txt new file mode 100644 index 000000000000..8ef28e70d6e8 --- /dev/null +++ b/Documentation/devicetree/bindings/timer/st,stm32-timer.txt @@ -0,0 +1,22 @@ +. STMicroelectronics STM32 timer + +The STM32 MCUs family has several general-purpose 16 and 32 bits timers. + +Required properties: +- compatible : Should be "st,stm32-timer" +- reg : Address and length of the register set +- clocks : Reference on the timer input clock +- interrupts : Reference to the timer interrupt + +Optional properties: +- resets: Reference to a reset controller asserting the timer + +Example: + +timer5: timer@40000c00 { + compatible = "st,stm32-timer"; + reg = <0x40000c00 0x400>; + interrupts = <50>; + resets = <&rrc 259>; + clocks = <&clk_pmtr1>; +}; diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 0a926e2ba3ab..6a34a0f4d37c 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -50,8 +50,8 @@ prototypes: int (*rename2) (struct inode *, struct dentry *, struct inode *, struct dentry *, unsigned int); int (*readlink) (struct dentry *, char __user *,int); - void * (*follow_link) (struct dentry *, struct nameidata *); - void (*put_link) (struct dentry *, struct nameidata *, void *); + const char *(*follow_link) (struct dentry *, void **); + void (*put_link) (struct inode *, void *); void (*truncate) (struct inode *); int (*permission) (struct inode *, int, unsigned int); int (*get_acl)(struct inode *, int); diff --git a/Documentation/filesystems/automount-support.txt b/Documentation/filesystems/automount-support.txt index 7cac200e2a85..7eb762eb3136 100644 --- a/Documentation/filesystems/automount-support.txt +++ b/Documentation/filesystems/automount-support.txt @@ -1,41 +1,15 @@ -Support is available for filesystems that wish to do automounting support (such -as kAFS which can be found in fs/afs/). This facility includes allowing -in-kernel mounts to be performed and mountpoint degradation to be -requested. The latter can also be requested by userspace. +Support is available for filesystems that wish to do automounting +support (such as kAFS which can be found in fs/afs/ and NFS in +fs/nfs/). This facility includes allowing in-kernel mounts to be +performed and mountpoint degradation to be requested. The latter can +also be requested by userspace. ====================== IN-KERNEL AUTOMOUNTING ====================== -A filesystem can now mount another filesystem on one of its directories by the -following procedure: - - (1) Give the directory a follow_link() operation. - - When the directory is accessed, the follow_link op will be called, and - it will be provided with the location of the mountpoint in the nameidata - structure (vfsmount and dentry). - - (2) Have the follow_link() op do the following steps: - - (a) Call vfs_kern_mount() to call the appropriate filesystem to set up a - superblock and gain a vfsmount structure representing it. - - (b) Copy the nameidata provided as an argument and substitute the dentry - argument into it the copy. - - (c) Call do_add_mount() to install the new vfsmount into the namespace's - mountpoint tree, thus making it accessible to userspace. Use the - nameidata set up in (b) as the destination. - - If the mountpoint will be automatically expired, then do_add_mount() - should also be given the location of an expiration list (see further - down). - - (d) Release the path in the nameidata argument and substitute in the new - vfsmount and its root dentry. The ref counts on these will need - incrementing. +See section "Mount Traps" of Documentation/filesystems/autofs4.txt Then from userspace, you can just do something like: @@ -61,17 +35,18 @@ AUTOMATIC MOUNTPOINT EXPIRY =========================== Automatic expiration of mountpoints is easy, provided you've mounted the -mountpoint to be expired in the automounting procedure outlined above. +mountpoint to be expired in the automounting procedure outlined separately. To do expiration, you need to follow these steps: - (3) Create at least one list off which the vfsmounts to be expired can be - hung. Access to this list will be governed by the vfsmount_lock. + (1) Create at least one list off which the vfsmounts to be expired can be + hung. - (4) In step (2c) above, the call to do_add_mount() should be provided with a - pointer to this list. It will hang the vfsmount off of it if it succeeds. + (2) When a new mountpoint is created in the ->d_automount method, add + the mnt to the list using mnt_set_expiry() + mnt_set_expiry(newmnt, &afs_vfsmounts); - (5) When you want mountpoints to be expired, call mark_mounts_for_expiry() + (3) When you want mountpoints to be expired, call mark_mounts_for_expiry() with a pointer to this list. This will process the list, marking every vfsmount thereon for potential expiry on the next call. diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting index e69274de8d0c..3eae250254d5 100644 --- a/Documentation/filesystems/porting +++ b/Documentation/filesystems/porting @@ -483,3 +483,20 @@ in your dentry operations instead. -- [mandatory] ->aio_read/->aio_write are gone. Use ->read_iter/->write_iter. +--- +[recommended] + for embedded ("fast") symlinks just set inode->i_link to wherever the + symlink body is and use simple_follow_link() as ->follow_link(). +-- +[mandatory] + calling conventions for ->follow_link() have changed. Instead of returning + cookie and using nd_set_link() to store the body to traverse, we return + the body to traverse and store the cookie using explicit void ** argument. + nameidata isn't passed at all - nd_jump_link() doesn't need it and + nd_[gs]et_link() is gone. +-- +[mandatory] + calling conventions for ->put_link() have changed. It gets inode instead of + dentry, it does not get nameidata at all and it gets called only when cookie + is non-NULL. Note that link body isn't available anymore, so if you need it, + store it as cookie. diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index 5d833b32bbcd..b403b29ef710 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -350,8 +350,8 @@ struct inode_operations { int (*rename2) (struct inode *, struct dentry *, struct inode *, struct dentry *, unsigned int); int (*readlink) (struct dentry *, char __user *,int); - void * (*follow_link) (struct dentry *, struct nameidata *); - void (*put_link) (struct dentry *, struct nameidata *, void *); + const char *(*follow_link) (struct dentry *, void **); + void (*put_link) (struct inode *, void *); int (*permission) (struct inode *, int); int (*get_acl)(struct inode *, int); int (*setattr) (struct dentry *, struct iattr *); @@ -436,16 +436,18 @@ otherwise noted. follow_link: called by the VFS to follow a symbolic link to the inode it points to. Only required if you want to support - symbolic links. This method returns a void pointer cookie - that is passed to put_link(). + symbolic links. This method returns the symlink body + to traverse (and possibly resets the current position with + nd_jump_link()). If the body won't go away until the inode + is gone, nothing else is needed; if it needs to be otherwise + pinned, the data needed to release whatever we'd grabbed + is to be stored in void * variable passed by address to + follow_link() instance. put_link: called by the VFS to release resources allocated by - follow_link(). The cookie returned by follow_link() is passed - to this method as the last parameter. It is used by - filesystems such as NFS where page cache is not stable - (i.e. page that was installed when the symbolic link walk - started might not be in the page cache at the end of the - walk). + follow_link(). The cookie stored by follow_link() is passed + to this method as the last parameter; only called when + cookie isn't NULL. permission: called by the VFS to check for access rights on a POSIX-like filesystem. diff --git a/Documentation/gpio/consumer.txt b/Documentation/gpio/consumer.txt index c21c1313f09e..bbc8b5888b80 100644 --- a/Documentation/gpio/consumer.txt +++ b/Documentation/gpio/consumer.txt @@ -241,18 +241,18 @@ Set multiple GPIO outputs with a single function call ----------------------------------------------------- The following functions set the output values of an array of GPIOs: - void gpiod_set_array(unsigned int array_size, - struct gpio_desc **desc_array, - int *value_array) - void gpiod_set_raw_array(unsigned int array_size, - struct gpio_desc **desc_array, - int *value_array) - void gpiod_set_array_cansleep(unsigned int array_size, - struct gpio_desc **desc_array, - int *value_array) - void gpiod_set_raw_array_cansleep(unsigned int array_size, - struct gpio_desc **desc_array, - int *value_array) + void gpiod_set_array_value(unsigned int array_size, + struct gpio_desc **desc_array, + int *value_array) + void gpiod_set_raw_array_value(unsigned int array_size, + struct gpio_desc **desc_array, + int *value_array) + void gpiod_set_array_value_cansleep(unsigned int array_size, + struct gpio_desc **desc_array, + int *value_array) + void gpiod_set_raw_array_value_cansleep(unsigned int array_size, + struct gpio_desc **desc_array, + int *value_array) The array can be an arbitrary set of GPIOs. The functions will try to set GPIOs belonging to the same bank or chip simultaneously if supported by the @@ -271,8 +271,8 @@ matches the desired group of GPIOs, those GPIOs can be set by simply using the struct gpio_descs returned by gpiod_get_array(): struct gpio_descs *my_gpio_descs = gpiod_get_array(...); - gpiod_set_array(my_gpio_descs->ndescs, my_gpio_descs->desc, - my_gpio_values); + gpiod_set_array_value(my_gpio_descs->ndescs, my_gpio_descs->desc, + my_gpio_values); It is also possible to set a completely arbitrary array of descriptors. The descriptors may be obtained using any combination of gpiod_get() and diff --git a/Documentation/gpio/gpio-legacy.txt b/Documentation/gpio/gpio-legacy.txt index 6f83fa965b4b..79ab5648d69b 100644 --- a/Documentation/gpio/gpio-legacy.txt +++ b/Documentation/gpio/gpio-legacy.txt @@ -751,9 +751,6 @@ requested using gpio_request(): int gpio_export_link(struct device *dev, const char *name, unsigned gpio) - /* change the polarity of a GPIO node in sysfs */ - int gpio_sysfs_set_active_low(unsigned gpio, int value); - After a kernel driver requests a GPIO, it may only be made available in the sysfs interface by gpio_export(). The driver can control whether the signal direction may change. This helps drivers prevent userspace code @@ -767,9 +764,3 @@ After the GPIO has been exported, gpio_export_link() allows creating symlinks from elsewhere in sysfs to the GPIO sysfs node. Drivers can use this to provide the interface under their own device in sysfs with a descriptive name. - -Drivers can use gpio_sysfs_set_active_low() to hide GPIO line polarity -differences between boards from user space. This only affects the -sysfs interface. Polarity change can be done both before and after -gpio_export(), and previously enabled poll(2) support for either -rising or falling edge will be reconfigured to follow this setting. diff --git a/Documentation/gpio/sysfs.txt b/Documentation/gpio/sysfs.txt index c2c3a97f8ff7..535b6a8a7a7c 100644 --- a/Documentation/gpio/sysfs.txt +++ b/Documentation/gpio/sysfs.txt @@ -132,9 +132,6 @@ requested using gpio_request(): int gpiod_export_link(struct device *dev, const char *name, struct gpio_desc *desc); - /* change the polarity of a GPIO node in sysfs */ - int gpiod_sysfs_set_active_low(struct gpio_desc *desc, int value); - After a kernel driver requests a GPIO, it may only be made available in the sysfs interface by gpiod_export(). The driver can control whether the signal direction may change. This helps drivers prevent userspace code @@ -148,8 +145,3 @@ After the GPIO has been exported, gpiod_export_link() allows creating symlinks from elsewhere in sysfs to the GPIO sysfs node. Drivers can use this to provide the interface under their own device in sysfs with a descriptive name. - -Drivers can use gpiod_sysfs_set_active_low() to hide GPIO line polarity -differences between boards from user space. Polarity change can be done both -before and after gpiod_export(), and previously enabled poll(2) support for -either rising or falling edge will be reconfigured to follow this setting. diff --git a/Documentation/hwmon/ntc_thermistor b/Documentation/hwmon/ntc_thermistor index c5e05e2900a3..1d4cc847c6fe 100644 --- a/Documentation/hwmon/ntc_thermistor +++ b/Documentation/hwmon/ntc_thermistor @@ -2,8 +2,10 @@ Kernel driver ntc_thermistor ================= Supported thermistors from Murata: -* Murata NTC Thermistors NCP15WB473, NCP18WB473, NCP21WB473, NCP03WB473, NCP15WL333 - Prefixes: 'ncp15wb473', 'ncp18wb473', 'ncp21wb473', 'ncp03wb473', 'ncp15wl333' +* Murata NTC Thermistors NCP15WB473, NCP18WB473, NCP21WB473, NCP03WB473, + NCP15WL333, NCP03WF104 + Prefixes: 'ncp15wb473', 'ncp18wb473', 'ncp21wb473', 'ncp03wb473', + 'ncp15wl333', 'ncp03wf104' Datasheet: Publicly available at Murata Supported thermistors from EPCOS: diff --git a/Documentation/hwmon/tc74 b/Documentation/hwmon/tc74 new file mode 100644 index 000000000000..43027aad5f8e --- /dev/null +++ b/Documentation/hwmon/tc74 @@ -0,0 +1,20 @@ +Kernel driver tc74 +==================== + +Supported chips: + * Microchip TC74 + Prefix: 'tc74' + Datasheet: Publicly available at Microchip website. + +Description +----------- + +Driver supports the above part. + +The tc74 has an 8-bit sensor, with 1 degree centigrade resolution +and +- 2 degrees centigrade accuracy. + +Notes +----- + +Currently entering low power standby mode is not supported. diff --git a/Documentation/i2c/slave-interface b/Documentation/i2c/slave-interface index 389bb5d61854..b228ca54bcf4 100644 --- a/Documentation/i2c/slave-interface +++ b/Documentation/i2c/slave-interface @@ -31,10 +31,10 @@ User manual =========== I2C slave backends behave like standard I2C clients. So, you can instantiate -them like described in the document 'instantiating-devices'. A quick example -for instantiating the slave-eeprom driver from userspace: +them as described in the document 'instantiating-devices'. A quick example for +instantiating the slave-eeprom driver from userspace at address 0x64 on bus 1: - # echo 0-0064 > /sys/bus/i2c/drivers/i2c-slave-eeprom/bind + # echo slave-24c02 0x64 > /sys/bus/i2c/devices/i2c-1/new_device Each backend should come with separate documentation to describe its specific behaviour and setup. diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 10f879221821..ae4474940fe2 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -746,6 +746,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. cpuidle.off=1 [CPU_IDLE] disable the cpuidle sub-system + cpu_init_udelay=N + [X86] Delay for N microsec between assert and de-assert + of APIC INIT to start processors. This delay occurs + on every CPU online, such as boot, and resume from suspend. + Default: 10000 + cpcihp_generic= [HW,PCI] Generic port I/O CompactPCI driver Format: <first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>] @@ -937,6 +943,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Enable debug messages at boot time. See Documentation/dynamic-debug-howto.txt for details. + nompx [X86] Disables Intel Memory Protection Extensions. + See Documentation/x86/intel_mpx.txt for more + information about the feature. + eagerfpu= [X86] on enable eager fpu restore off disable eager fpu restore @@ -2998,11 +3008,34 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Set maximum number of finished RCU callbacks to process in one batch. + rcutree.dump_tree= [KNL] + Dump the structure of the rcu_node combining tree + out at early boot. This is used for diagnostic + purposes, to verify correct tree setup. + + rcutree.gp_cleanup_delay= [KNL] + Set the number of jiffies to delay each step of + RCU grace-period cleanup. This only has effect + when CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP is set. + rcutree.gp_init_delay= [KNL] Set the number of jiffies to delay each step of RCU grace-period initialization. This only has - effect when CONFIG_RCU_TORTURE_TEST_SLOW_INIT is - set. + effect when CONFIG_RCU_TORTURE_TEST_SLOW_INIT + is set. + + rcutree.gp_preinit_delay= [KNL] + Set the number of jiffies to delay each step of + RCU grace-period pre-initialization, that is, + the propagation of recent CPU-hotplug changes up + the rcu_node combining tree. This only has effect + when CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT is set. + + rcutree.rcu_fanout_exact= [KNL] + Disable autobalancing of the rcu_node combining + tree. This is used by rcutorture, and might + possibly be useful for architectures having high + cache-to-cache transfer latencies. rcutree.rcu_fanout_leaf= [KNL] Increase the number of CPUs assigned to each @@ -3107,7 +3140,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted. test, hence the "fake". rcutorture.nreaders= [KNL] - Set number of RCU readers. + Set number of RCU readers. The value -1 selects + N-1, where N is the number of CPUs. A value + "n" less than -1 selects N-n-2, where N is again + the number of CPUs. For example, -2 selects N + (the number of CPUs), -3 selects N+1, and so on. rcutorture.object_debug= [KNL] Enable debug-object double-call_rcu() testing. diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index f95746189b5d..13feb697271f 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -617,16 +617,16 @@ case what's actually required is: However, stores are not speculated. This means that ordering -is- provided for load-store control dependencies, as in the following example: - q = ACCESS_ONCE(a); + q = READ_ONCE_CTRL(a); if (q) { ACCESS_ONCE(b) = p; } -Control dependencies pair normally with other types of barriers. -That said, please note that ACCESS_ONCE() is not optional! Without the -ACCESS_ONCE(), might combine the load from 'a' with other loads from -'a', and the store to 'b' with other stores to 'b', with possible highly -counterintuitive effects on ordering. +Control dependencies pair normally with other types of barriers. That +said, please note that READ_ONCE_CTRL() is not optional! Without the +READ_ONCE_CTRL(), the compiler might combine the load from 'a' with +other loads from 'a', and the store to 'b' with other stores to 'b', +with possible highly counterintuitive effects on ordering. Worse yet, if the compiler is able to prove (say) that the value of variable 'a' is always non-zero, it would be well within its rights @@ -636,12 +636,15 @@ as follows: q = a; b = p; /* BUG: Compiler and CPU can both reorder!!! */ -So don't leave out the ACCESS_ONCE(). +Finally, the READ_ONCE_CTRL() includes an smp_read_barrier_depends() +that DEC Alpha needs in order to respect control depedencies. + +So don't leave out the READ_ONCE_CTRL(). It is tempting to try to enforce ordering on identical stores on both branches of the "if" statement as follows: - q = ACCESS_ONCE(a); + q = READ_ONCE_CTRL(a); if (q) { barrier(); ACCESS_ONCE(b) = p; @@ -655,7 +658,7 @@ branches of the "if" statement as follows: Unfortunately, current compilers will transform this as follows at high optimization levels: - q = ACCESS_ONCE(a); + q = READ_ONCE_CTRL(a); barrier(); ACCESS_ONCE(b) = p; /* BUG: No ordering vs. load from a!!! */ if (q) { @@ -685,7 +688,7 @@ memory barriers, for example, smp_store_release(): In contrast, without explicit memory barriers, two-legged-if control ordering is guaranteed only when the stores differ, for example: - q = ACCESS_ONCE(a); + q = READ_ONCE_CTRL(a); if (q) { ACCESS_ONCE(b) = p; do_something(); @@ -694,14 +697,14 @@ ordering is guaranteed only when the stores differ, for example: do_something_else(); } -The initial ACCESS_ONCE() is still required to prevent the compiler from -proving the value of 'a'. +The initial READ_ONCE_CTRL() is still required to prevent the compiler +from proving the value of 'a'. In addition, you need to be careful what you do with the local variable 'q', otherwise the compiler might be able to guess the value and again remove the needed conditional. For example: - q = ACCESS_ONCE(a); + q = READ_ONCE_CTRL(a); if (q % MAX) { ACCESS_ONCE(b) = p; do_something(); @@ -714,7 +717,7 @@ If MAX is defined to be 1, then the compiler knows that (q % MAX) is equal to zero, in which case the compiler is within its rights to transform the above code into the following: - q = ACCESS_ONCE(a); + q = READ_ONCE_CTRL(a); ACCESS_ONCE(b) = p; do_something_else(); @@ -725,7 +728,7 @@ is gone, and the barrier won't bring it back. Therefore, if you are relying on this ordering, you should make sure that MAX is greater than one, perhaps as follows: - q = ACCESS_ONCE(a); + q = READ_ONCE_CTRL(a); BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */ if (q % MAX) { ACCESS_ONCE(b) = p; @@ -742,14 +745,15 @@ of the 'if' statement. You must also be careful not to rely too much on boolean short-circuit evaluation. Consider this example: - q = ACCESS_ONCE(a); + q = READ_ONCE_CTRL(a); if (a || 1 > 0) ACCESS_ONCE(b) = 1; -Because the second condition is always true, the compiler can transform -this example as following, defeating control dependency: +Because the first condition cannot fault and the second condition is +always true, the compiler can transform this example as following, +defeating control dependency: - q = ACCESS_ONCE(a); + q = READ_ONCE_CTRL(a); ACCESS_ONCE(b) = 1; This example underscores the need to ensure that the compiler cannot @@ -762,8 +766,8 @@ demonstrated by two related examples, with the initial values of x and y both being zero: CPU 0 CPU 1 - ===================== ===================== - r1 = ACCESS_ONCE(x); r2 = ACCESS_ONCE(y); + ======================= ======================= + r1 = READ_ONCE_CTRL(x); r2 = READ_ONCE_CTRL(y); if (r1 > 0) if (r2 > 0) ACCESS_ONCE(y) = 1; ACCESS_ONCE(x) = 1; @@ -783,7 +787,8 @@ But because control dependencies do -not- provide transitivity, the above assertion can fail after the combined three-CPU example completes. If you need the three-CPU example to provide ordering, you will need smp_mb() between the loads and stores in the CPU 0 and CPU 1 code fragments, -that is, just before or just after the "if" statements. +that is, just before or just after the "if" statements. Furthermore, +the original two-CPU example is very fragile and should be avoided. These two examples are the LB and WWC litmus tests from this paper: http://www.cl.cam.ac.uk/users/pes20/ppc-supplemental/test6.pdf and this @@ -791,6 +796,12 @@ site: https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html. In summary: + (*) Control dependencies must be headed by READ_ONCE_CTRL(). + Or, as a much less preferable alternative, interpose + be headed by READ_ONCE() or an ACCESS_ONCE() read and must + have smp_read_barrier_depends() between this read and the + control-dependent write. + (*) Control dependencies can order prior loads against later stores. However, they do -not- guarantee any other sort of ordering: Not prior loads against later loads, nor prior stores against @@ -1662,7 +1673,7 @@ CPU from reordering them. There are some more advanced barrier functions: - (*) set_mb(var, value) + (*) smp_store_mb(var, value) This assigns the value to the variable and then inserts a full memory barrier after it, depending on the function. It isn't guaranteed to @@ -1784,10 +1795,9 @@ for each construct. These operations all imply certain barriers: Memory operations issued before the ACQUIRE may be completed after the ACQUIRE operation has completed. An smp_mb__before_spinlock(), - combined with a following ACQUIRE, orders prior loads against - subsequent loads and stores and also orders prior stores against - subsequent stores. Note that this is weaker than smp_mb()! The - smp_mb__before_spinlock() primitive is free on many architectures. + combined with a following ACQUIRE, orders prior stores against + subsequent loads and stores. Note that this is weaker than smp_mb()! + The smp_mb__before_spinlock() primitive is free on many architectures. (2) RELEASE operation implication: @@ -1975,7 +1985,7 @@ after it has altered the task state: CPU 1 =============================== set_current_state(); - set_mb(); + smp_store_mb(); STORE current->state <general barrier> LOAD event_indicated @@ -2016,7 +2026,7 @@ between the STORE to indicate the event and the STORE to set TASK_RUNNING: CPU 1 CPU 2 =============================== =============================== set_current_state(); STORE event_indicated - set_mb(); wake_up(); + smp_store_mb(); wake_up(); STORE current->state <write barrier> <general barrier> STORE current->state LOAD event_indicated diff --git a/Documentation/preempt-locking.txt b/Documentation/preempt-locking.txt index 57883ca2498b..e89ce6624af2 100644 --- a/Documentation/preempt-locking.txt +++ b/Documentation/preempt-locking.txt @@ -48,7 +48,7 @@ preemption must be disabled around such regions. Note, some FPU functions are already explicitly preempt safe. For example, kernel_fpu_begin and kernel_fpu_end will disable and enable preemption. -However, math_state_restore must be called with preemption disabled. +However, fpu__restore() must be called with preemption disabled. RULE #3: Lock acquire and release must be performed by same task diff --git a/Documentation/scheduler/sched-deadline.txt b/Documentation/scheduler/sched-deadline.txt index 21461a0441c1..e114513a2731 100644 --- a/Documentation/scheduler/sched-deadline.txt +++ b/Documentation/scheduler/sched-deadline.txt @@ -8,6 +8,10 @@ CONTENTS 1. Overview 2. Scheduling algorithm 3. Scheduling Real-Time Tasks + 3.1 Definitions + 3.2 Schedulability Analysis for Uniprocessor Systems + 3.3 Schedulability Analysis for Multiprocessor Systems + 3.4 Relationship with SCHED_DEADLINE Parameters 4. Bandwidth management 4.1 System-wide settings 4.2 Task interface @@ -43,7 +47,7 @@ CONTENTS "deadline", to schedule tasks. A SCHED_DEADLINE task should receive "runtime" microseconds of execution time every "period" microseconds, and these "runtime" microseconds are available within "deadline" microseconds - from the beginning of the period. In order to implement this behaviour, + from the beginning of the period. In order to implement this behavior, every time the task wakes up, the scheduler computes a "scheduling deadline" consistent with the guarantee (using the CBS[2,3] algorithm). Tasks are then scheduled using EDF[1] on these scheduling deadlines (the task with the @@ -52,7 +56,7 @@ CONTENTS "admission control" strategy (see Section "4. Bandwidth management") is used (clearly, if the system is overloaded this guarantee cannot be respected). - Summing up, the CBS[2,3] algorithms assigns scheduling deadlines to tasks so + Summing up, the CBS[2,3] algorithm assigns scheduling deadlines to tasks so that each task runs for at most its runtime every period, avoiding any interference between different tasks (bandwidth isolation), while the EDF[1] algorithm selects the task with the earliest scheduling deadline as the one @@ -63,7 +67,7 @@ CONTENTS In more details, the CBS algorithm assigns scheduling deadlines to tasks in the following way: - - Each SCHED_DEADLINE task is characterised by the "runtime", + - Each SCHED_DEADLINE task is characterized by the "runtime", "deadline", and "period" parameters; - The state of the task is described by a "scheduling deadline", and @@ -78,7 +82,7 @@ CONTENTS then, if the scheduling deadline is smaller than the current time, or this condition is verified, the scheduling deadline and the - remaining runtime are re-initialised as + remaining runtime are re-initialized as scheduling deadline = current time + deadline remaining runtime = runtime @@ -126,31 +130,37 @@ CONTENTS suited for periodic or sporadic real-time tasks that need guarantees on their timing behavior, e.g., multimedia, streaming, control applications, etc. +3.1 Definitions +------------------------ + A typical real-time task is composed of a repetition of computation phases (task instances, or jobs) which are activated on a periodic or sporadic fashion. - Each job J_j (where J_j is the j^th job of the task) is characterised by an + Each job J_j (where J_j is the j^th job of the task) is characterized by an arrival time r_j (the time when the job starts), an amount of computation time c_j needed to finish the job, and a job absolute deadline d_j, which is the time within which the job should be finished. The maximum execution - time max_j{c_j} is called "Worst Case Execution Time" (WCET) for the task. + time max{c_j} is called "Worst Case Execution Time" (WCET) for the task. A real-time task can be periodic with period P if r_{j+1} = r_j + P, or sporadic with minimum inter-arrival time P is r_{j+1} >= r_j + P. Finally, d_j = r_j + D, where D is the task's relative deadline. - The utilisation of a real-time task is defined as the ratio between its + Summing up, a real-time task can be described as + Task = (WCET, D, P) + + The utilization of a real-time task is defined as the ratio between its WCET and its period (or minimum inter-arrival time), and represents the fraction of CPU time needed to execute the task. - If the total utilisation sum_i(WCET_i/P_i) is larger than M (with M equal + If the total utilization U=sum(WCET_i/P_i) is larger than M (with M equal to the number of CPUs), then the scheduler is unable to respect all the deadlines. - Note that total utilisation is defined as the sum of the utilisations + Note that total utilization is defined as the sum of the utilizations WCET_i/P_i over all the real-time tasks in the system. When considering multiple real-time tasks, the parameters of the i-th task are indicated with the "_i" suffix. - Moreover, if the total utilisation is larger than M, then we risk starving + Moreover, if the total utilization is larger than M, then we risk starving non- real-time tasks by real-time tasks. - If, instead, the total utilisation is smaller than M, then non real-time + If, instead, the total utilization is smaller than M, then non real-time tasks will not be starved and the system might be able to respect all the deadlines. As a matter of fact, in this case it is possible to provide an upper bound @@ -159,38 +169,119 @@ CONTENTS More precisely, it can be proven that using a global EDF scheduler the maximum tardiness of each task is smaller or equal than ((M − 1) · WCET_max − WCET_min)/(M − (M − 2) · U_max) + WCET_max - where WCET_max = max_i{WCET_i} is the maximum WCET, WCET_min=min_i{WCET_i} - is the minimum WCET, and U_max = max_i{WCET_i/P_i} is the maximum utilisation. + where WCET_max = max{WCET_i} is the maximum WCET, WCET_min=min{WCET_i} + is the minimum WCET, and U_max = max{WCET_i/P_i} is the maximum + utilization[12]. + +3.2 Schedulability Analysis for Uniprocessor Systems +------------------------ If M=1 (uniprocessor system), or in case of partitioned scheduling (each real-time task is statically assigned to one and only one CPU), it is possible to formally check if all the deadlines are respected. If D_i = P_i for all tasks, then EDF is able to respect all the deadlines - of all the tasks executing on a CPU if and only if the total utilisation + of all the tasks executing on a CPU if and only if the total utilization of the tasks running on such a CPU is smaller or equal than 1. If D_i != P_i for some task, then it is possible to define the density of - a task as C_i/min{D_i,T_i}, and EDF is able to respect all the deadlines - of all the tasks running on a CPU if the sum sum_i C_i/min{D_i,T_i} of the - densities of the tasks running on such a CPU is smaller or equal than 1 - (notice that this condition is only sufficient, and not necessary). + a task as WCET_i/min{D_i,P_i}, and EDF is able to respect all the deadlines + of all the tasks running on a CPU if the sum of the densities of the tasks + running on such a CPU is smaller or equal than 1: + sum(WCET_i / min{D_i, P_i}) <= 1 + It is important to notice that this condition is only sufficient, and not + necessary: there are task sets that are schedulable, but do not respect the + condition. For example, consider the task set {Task_1,Task_2} composed by + Task_1=(50ms,50ms,100ms) and Task_2=(10ms,100ms,100ms). + EDF is clearly able to schedule the two tasks without missing any deadline + (Task_1 is scheduled as soon as it is released, and finishes just in time + to respect its deadline; Task_2 is scheduled immediately after Task_1, hence + its response time cannot be larger than 50ms + 10ms = 60ms) even if + 50 / min{50,100} + 10 / min{100, 100} = 50 / 50 + 10 / 100 = 1.1 + Of course it is possible to test the exact schedulability of tasks with + D_i != P_i (checking a condition that is both sufficient and necessary), + but this cannot be done by comparing the total utilization or density with + a constant. Instead, the so called "processor demand" approach can be used, + computing the total amount of CPU time h(t) needed by all the tasks to + respect all of their deadlines in a time interval of size t, and comparing + such a time with the interval size t. If h(t) is smaller than t (that is, + the amount of time needed by the tasks in a time interval of size t is + smaller than the size of the interval) for all the possible values of t, then + EDF is able to schedule the tasks respecting all of their deadlines. Since + performing this check for all possible values of t is impossible, it has been + proven[4,5,6] that it is sufficient to perform the test for values of t + between 0 and a maximum value L. The cited papers contain all of the + mathematical details and explain how to compute h(t) and L. + In any case, this kind of analysis is too complex as well as too + time-consuming to be performed on-line. Hence, as explained in Section + 4 Linux uses an admission test based on the tasks' utilizations. + +3.3 Schedulability Analysis for Multiprocessor Systems +------------------------ On multiprocessor systems with global EDF scheduling (non partitioned systems), a sufficient test for schedulability can not be based on the - utilisations (it can be shown that task sets with utilisations slightly - larger than 1 can miss deadlines regardless of the number of CPUs M). - However, as previously stated, enforcing that the total utilisation is smaller - than M is enough to guarantee that non real-time tasks are not starved and - that the tardiness of real-time tasks has an upper bound. + utilizations or densities: it can be shown that even if D_i = P_i task + sets with utilizations slightly larger than 1 can miss deadlines regardless + of the number of CPUs. + + Consider a set {Task_1,...Task_{M+1}} of M+1 tasks on a system with M + CPUs, with the first task Task_1=(P,P,P) having period, relative deadline + and WCET equal to P. The remaining M tasks Task_i=(e,P-1,P-1) have an + arbitrarily small worst case execution time (indicated as "e" here) and a + period smaller than the one of the first task. Hence, if all the tasks + activate at the same time t, global EDF schedules these M tasks first + (because their absolute deadlines are equal to t + P - 1, hence they are + smaller than the absolute deadline of Task_1, which is t + P). As a + result, Task_1 can be scheduled only at time t + e, and will finish at + time t + e + P, after its absolute deadline. The total utilization of the + task set is U = M · e / (P - 1) + P / P = M · e / (P - 1) + 1, and for small + values of e this can become very close to 1. This is known as "Dhall's + effect"[7]. Note: the example in the original paper by Dhall has been + slightly simplified here (for example, Dhall more correctly computed + lim_{e->0}U). + + More complex schedulability tests for global EDF have been developed in + real-time literature[8,9], but they are not based on a simple comparison + between total utilization (or density) and a fixed constant. If all tasks + have D_i = P_i, a sufficient schedulability condition can be expressed in + a simple way: + sum(WCET_i / P_i) <= M - (M - 1) · U_max + where U_max = max{WCET_i / P_i}[10]. Notice that for U_max = 1, + M - (M - 1) · U_max becomes M - M + 1 = 1 and this schedulability condition + just confirms the Dhall's effect. A more complete survey of the literature + about schedulability tests for multi-processor real-time scheduling can be + found in [11]. + + As seen, enforcing that the total utilization is smaller than M does not + guarantee that global EDF schedules the tasks without missing any deadline + (in other words, global EDF is not an optimal scheduling algorithm). However, + a total utilization smaller than M is enough to guarantee that non real-time + tasks are not starved and that the tardiness of real-time tasks has an upper + bound[12] (as previously noted). Different bounds on the maximum tardiness + experienced by real-time tasks have been developed in various papers[13,14], + but the theoretical result that is important for SCHED_DEADLINE is that if + the total utilization is smaller or equal than M then the response times of + the tasks are limited. + +3.4 Relationship with SCHED_DEADLINE Parameters +------------------------ - SCHED_DEADLINE can be used to schedule real-time tasks guaranteeing that - the jobs' deadlines of a task are respected. In order to do this, a task - must be scheduled by setting: + Finally, it is important to understand the relationship between the + SCHED_DEADLINE scheduling parameters described in Section 2 (runtime, + deadline and period) and the real-time task parameters (WCET, D, P) + described in this section. Note that the tasks' temporal constraints are + represented by its absolute deadlines d_j = r_j + D described above, while + SCHED_DEADLINE schedules the tasks according to scheduling deadlines (see + Section 2). + If an admission test is used to guarantee that the scheduling deadlines + are respected, then SCHED_DEADLINE can be used to schedule real-time tasks + guaranteeing that all the jobs' deadlines of a task are respected. + In order to do this, a task must be scheduled by setting: - runtime >= WCET - deadline = D - period <= P - IOW, if runtime >= WCET and if period is >= P, then the scheduling deadlines + IOW, if runtime >= WCET and if period is <= P, then the scheduling deadlines and the absolute deadlines (d_j) coincide, so a proper admission control allows to respect the jobs' absolute deadlines for this task (this is what is called "hard schedulability property" and is an extension of Lemma 1 of [2]). @@ -206,6 +297,39 @@ CONTENTS Symposium, 1998. http://retis.sssup.it/~giorgio/paps/1998/rtss98-cbs.pdf 3 - L. Abeni. Server Mechanisms for Multimedia Applications. ReTiS Lab Technical Report. http://disi.unitn.it/~abeni/tr-98-01.pdf + 4 - J. Y. Leung and M.L. Merril. A Note on Preemptive Scheduling of + Periodic, Real-Time Tasks. Information Processing Letters, vol. 11, + no. 3, pp. 115-118, 1980. + 5 - S. K. Baruah, A. K. Mok and L. E. Rosier. Preemptively Scheduling + Hard-Real-Time Sporadic Tasks on One Processor. Proceedings of the + 11th IEEE Real-time Systems Symposium, 1990. + 6 - S. K. Baruah, L. E. Rosier and R. R. Howell. Algorithms and Complexity + Concerning the Preemptive Scheduling of Periodic Real-Time tasks on + One Processor. Real-Time Systems Journal, vol. 4, no. 2, pp 301-324, + 1990. + 7 - S. J. Dhall and C. L. Liu. On a real-time scheduling problem. Operations + research, vol. 26, no. 1, pp 127-140, 1978. + 8 - T. Baker. Multiprocessor EDF and Deadline Monotonic Schedulability + Analysis. Proceedings of the 24th IEEE Real-Time Systems Symposium, 2003. + 9 - T. Baker. An Analysis of EDF Schedulability on a Multiprocessor. + IEEE Transactions on Parallel and Distributed Systems, vol. 16, no. 8, + pp 760-768, 2005. + 10 - J. Goossens, S. Funk and S. Baruah, Priority-Driven Scheduling of + Periodic Task Systems on Multiprocessors. Real-Time Systems Journal, + vol. 25, no. 2–3, pp. 187–205, 2003. + 11 - R. Davis and A. Burns. A Survey of Hard Real-Time Scheduling for + Multiprocessor Systems. ACM Computing Surveys, vol. 43, no. 4, 2011. + http://www-users.cs.york.ac.uk/~robdavis/papers/MPSurveyv5.0.pdf + 12 - U. C. Devi and J. H. Anderson. Tardiness Bounds under Global EDF + Scheduling on a Multiprocessor. Real-Time Systems Journal, vol. 32, + no. 2, pp 133-189, 2008. + 13 - P. Valente and G. Lipari. An Upper Bound to the Lateness of Soft + Real-Time Tasks Scheduled by EDF on Multiprocessors. Proceedings of + the 26th IEEE Real-Time Systems Symposium, 2005. + 14 - J. Erickson, U. Devi and S. Baruah. Improved tardiness bounds for + Global EDF. Proceedings of the 22nd Euromicro Conference on + Real-Time Systems, 2010. + 4. Bandwidth management ======================= @@ -218,10 +342,10 @@ CONTENTS no guarantee can be given on the actual scheduling of the -deadline tasks. As already stated in Section 3, a necessary condition to be respected to - correctly schedule a set of real-time tasks is that the total utilisation + correctly schedule a set of real-time tasks is that the total utilization is smaller than M. When talking about -deadline tasks, this requires that the sum of the ratio between runtime and period for all tasks is smaller - than M. Notice that the ratio runtime/period is equivalent to the utilisation + than M. Notice that the ratio runtime/period is equivalent to the utilization of a "traditional" real-time task, and is also often referred to as "bandwidth". The interface used to control the CPU bandwidth that can be allocated @@ -251,7 +375,7 @@ CONTENTS The system wide settings are configured under the /proc virtual file system. For now the -rt knobs are used for -deadline admission control and the - -deadline runtime is accounted against the -rt runtime. We realise that this + -deadline runtime is accounted against the -rt runtime. We realize that this isn't entirely desirable; however, it is better to have a small interface for now, and be able to change it easily later. The ideal situation (see 5.) is to run -rt tasks from a -deadline server; in which case the -rt bandwidth is a diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt index 88b85899d309..7c1f9fad6674 100644 --- a/Documentation/x86/boot.txt +++ b/Documentation/x86/boot.txt @@ -1124,7 +1124,6 @@ The boot loader *must* fill out the following fields in bp, o hdr.code32_start o hdr.cmd_line_ptr - o hdr.cmdline_size o hdr.ramdisk_image (if applicable) o hdr.ramdisk_size (if applicable) diff --git a/Documentation/x86/entry_64.txt b/Documentation/x86/entry_64.txt index 9132b86176a3..33884d156125 100644 --- a/Documentation/x86/entry_64.txt +++ b/Documentation/x86/entry_64.txt @@ -18,10 +18,10 @@ Some of these entries are: - system_call: syscall instruction from 64-bit code. - - ia32_syscall: int 0x80 from 32-bit or 64-bit code; compat syscall + - entry_INT80_compat: int 0x80 from 32-bit or 64-bit code; compat syscall either way. - - ia32_syscall, ia32_sysenter: syscall and sysenter from 32-bit + - entry_INT80_compat, ia32_sysenter: syscall and sysenter from 32-bit code - interrupt: An array of entries. Every IDT vector that doesn't diff --git a/Documentation/x86/x86_64/kernel-stacks b/Documentation/x86/kernel-stacks index e3c8a49d1a2f..0f3a6c201943 100644 --- a/Documentation/x86/x86_64/kernel-stacks +++ b/Documentation/x86/kernel-stacks @@ -1,3 +1,6 @@ +Kernel stacks on x86-64 bit +--------------------------- + Most of the text from Keith Owens, hacked by AK x86_64 page size (PAGE_SIZE) is 4K. @@ -56,13 +59,6 @@ If that assumption is ever broken then the stacks will become corrupt. The currently assigned IST stacks are :- -* STACKFAULT_STACK. EXCEPTION_STKSZ (PAGE_SIZE). - - Used for interrupt 12 - Stack Fault Exception (#SS). - - This allows the CPU to recover from invalid stack segments. Rarely - happens. - * DOUBLEFAULT_STACK. EXCEPTION_STKSZ (PAGE_SIZE). Used for interrupt 8 - Double Fault Exception (#DF). @@ -99,3 +95,47 @@ The currently assigned IST stacks are :- assumptions about the previous state of the kernel stack. For more details see the Intel IA32 or AMD AMD64 architecture manuals. + + +Printing backtraces on x86 +-------------------------- + +The question about the '?' preceding function names in an x86 stacktrace +keeps popping up, here's an indepth explanation. It helps if the reader +stares at print_context_stack() and the whole machinery in and around +arch/x86/kernel/dumpstack.c. + +Adapted from Ingo's mail, Message-ID: <20150521101614.GA10889@gmail.com>: + +We always scan the full kernel stack for return addresses stored on +the kernel stack(s) [*], from stack top to stack bottom, and print out +anything that 'looks like' a kernel text address. + +If it fits into the frame pointer chain, we print it without a question +mark, knowing that it's part of the real backtrace. + +If the address does not fit into our expected frame pointer chain we +still print it, but we print a '?'. It can mean two things: + + - either the address is not part of the call chain: it's just stale + values on the kernel stack, from earlier function calls. This is + the common case. + + - or it is part of the call chain, but the frame pointer was not set + up properly within the function, so we don't recognize it. + +This way we will always print out the real call chain (plus a few more +entries), regardless of whether the frame pointer was set up correctly +or not - but in most cases we'll get the call chain right as well. The +entries printed are strictly in stack order, so you can deduce more +information from that as well. + +The most important property of this method is that we _never_ lose +information: we always strive to print _all_ addresses on the stack(s) +that look like kernel text addresses, so if debug information is wrong, +we still print out the real call chain as well - just with more question +marks than ideal. + +[*] For things like IRQ and IST stacks, we also scan those stacks, in + the right order, and try to cross from one stack into another + reconstructing the call chain. This works most of the time. diff --git a/Documentation/x86/mtrr.txt b/Documentation/x86/mtrr.txt index cc071dc333c2..860bc3adc223 100644 --- a/Documentation/x86/mtrr.txt +++ b/Documentation/x86/mtrr.txt @@ -1,7 +1,19 @@ MTRR (Memory Type Range Register) control -3 Jun 1999 -Richard Gooch -<rgooch@atnf.csiro.au> + +Richard Gooch <rgooch@atnf.csiro.au> - 3 Jun 1999 +Luis R. Rodriguez <mcgrof@do-not-panic.com> - April 9, 2015 + +=============================================================================== +Phasing out MTRR use + +MTRR use is replaced on modern x86 hardware with PAT. Over time the only type +of effective MTRR that is expected to be supported will be for write-combining. +As MTRR use is phased out device drivers should use arch_phys_wc_add() to make +MTRR effective on non-PAT systems while a no-op on PAT enabled systems. + +For details refer to Documentation/x86/pat.txt. + +=============================================================================== On Intel P6 family processors (Pentium Pro, Pentium II and later) the Memory Type Range Registers (MTRRs) may be used to control diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt index cf08c9fff3cd..54944c71b819 100644 --- a/Documentation/x86/pat.txt +++ b/Documentation/x86/pat.txt @@ -12,7 +12,7 @@ virtual addresses. PAT allows for different types of memory attributes. The most commonly used ones that will be supported at this time are Write-back, Uncached, -Write-combined and Uncached Minus. +Write-combined, Write-through and Uncached Minus. PAT APIs @@ -34,16 +34,23 @@ ioremap | -- | UC- | UC- | | | | | ioremap_cache | -- | WB | WB | | | | | +ioremap_uc | -- | UC | UC | + | | | | ioremap_nocache | -- | UC- | UC- | | | | | ioremap_wc | -- | -- | WC | | | | | +ioremap_wt | -- | -- | WT | + | | | | set_memory_uc | UC- | -- | -- | set_memory_wb | | | | | | | | set_memory_wc | WC | -- | -- | set_memory_wb | | | | | | | | +set_memory_wt | WT | -- | -- | + set_memory_wb | | | | + | | | | pci sysfs resource | -- | -- | UC- | | | | | pci sysfs resource_wc | -- | -- | WC | @@ -102,7 +109,38 @@ wants to export a RAM region, it has to do set_memory_uc() or set_memory_wc() as step 0 above and also track the usage of those pages and use set_memory_wb() before the page is freed to free pool. - +MTRR effects on PAT / non-PAT systems +------------------------------------- + +The following table provides the effects of using write-combining MTRRs when +using ioremap*() calls on x86 for both non-PAT and PAT systems. Ideally +mtrr_add() usage will be phased out in favor of arch_phys_wc_add() which will +be a no-op on PAT enabled systems. The region over which a arch_phys_wc_add() +is made, should already have been ioremapped with WC attributes or PAT entries, +this can be done by using ioremap_wc() / set_memory_wc(). Devices which +combine areas of IO memory desired to remain uncacheable with areas where +write-combining is desirable should consider use of ioremap_uc() followed by +set_memory_wc() to white-list effective write-combined areas. Such use is +nevertheless discouraged as the effective memory type is considered +implementation defined, yet this strategy can be used as last resort on devices +with size-constrained regions where otherwise MTRR write-combining would +otherwise not be effective. + +---------------------------------------------------------------------- +MTRR Non-PAT PAT Linux ioremap value Effective memory type +---------------------------------------------------------------------- + Non-PAT | PAT + PAT + |PCD + ||PWT + ||| +WC 000 WB _PAGE_CACHE_MODE_WB WC | WC +WC 001 WC _PAGE_CACHE_MODE_WC WC* | WC +WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS WC* | UC +WC 011 UC _PAGE_CACHE_MODE_UC UC | UC +---------------------------------------------------------------------- + +(*) denotes implementation defined and is discouraged Notes: @@ -115,8 +153,8 @@ can be more restrictive, in case of any existing aliasing for that address. For example: If there is an existing uncached mapping, a new ioremap_wc can return uncached mapping in place of write-combine requested. -set_memory_[uc|wc] and set_memory_wb should be used in pairs, where driver will -first make a region uc or wc and switch it back to wb after use. +set_memory_[uc|wc|wt] and set_memory_wb should be used in pairs, where driver +will first make a region uc, wc or wt and switch it back to wb after use. Over time writes to /proc/mtrr will be deprecated in favor of using PAT based interfaces. Users writing to /proc/mtrr are suggested to use above interfaces. @@ -124,7 +162,7 @@ interfaces. Users writing to /proc/mtrr are suggested to use above interfaces. Drivers should use ioremap_[uc|wc] to access PCI BARs with [uc|wc] access types. -Drivers should use set_memory_[uc|wc] to set access type for RAM ranges. +Drivers should use set_memory_[uc|wc|wt] to set access type for RAM ranges. PAT debugging diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index 5223479291a2..68ed3114c363 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -31,6 +31,9 @@ Machine check (e.g. BIOS or hardware monitoring applications), conflicting with OS's error handling, and you cannot deactivate the agent, then this option will be a help. + mce=no_lmce + Do not opt-in to Local MCE delivery. Use legacy method + to broadcast MCEs. mce=bootlog Enable logging of machine checks left over from booting. Disabled by default on AMD because some BIOS leave bogus ones. diff --git a/Documentation/zh_CN/gpio.txt b/Documentation/zh_CN/gpio.txt index d5b8f01833f4..bce972521065 100644 --- a/Documentation/zh_CN/gpio.txt +++ b/Documentation/zh_CN/gpio.txt @@ -638,9 +638,6 @@ GPIO 控制器的路径类似 /sys/class/gpio/gpiochip42/ (对于从#42 GPIO int gpio_export_link(struct device *dev, const char *name, unsigned gpio) - /* 改变 sysfs 中的一个 GPIO 节点的极性 */ - int gpio_sysfs_set_active_low(unsigned gpio, int value); - 在一个内核驱动申请一个 GPIO 之后,它可以通过 gpio_export()使其在 sysfs 接口中可见。该驱动可以控制信号方向是否可修改。这有助于防止用户空间代码无意间 破坏重要的系统状态。 @@ -651,8 +648,3 @@ GPIO 控制器的路径类似 /sys/class/gpio/gpiochip42/ (对于从#42 GPIO 在 GPIO 被导出之后,gpio_export_link()允许在 sysfs 文件系统的任何地方 创建一个到这个 GPIO sysfs 节点的符号链接。这样驱动就可以通过一个描述性的 名字,在 sysfs 中他们所拥有的设备下提供一个(到这个 GPIO sysfs 节点的)接口。 - -驱动可以使用 gpio_sysfs_set_active_low() 来在用户空间隐藏电路板之间 -GPIO 线的极性差异。这个仅对 sysfs 接口起作用。极性的改变可以在 gpio_export() -前后进行,且之前使能的轮询操作(poll(2))支持(上升或下降沿)将会被重新配置来遵循 -这个设置。 |