author | Linus Torvalds <torvalds@linux-foundation.org> | 2017-02-22 18:51:29 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2017-02-22 18:51:29 -0800 |
commit | c1aac62f36c1e37ee81c9e09ee9ee733eef05dcb (patch) | |
tree | b400b92c44faf7da37d37138145e895a55eaa4cc /Documentation | |
parent | fd7e9a88348472521d999434ee02f25735c7dadf (diff) | |
parent | bd8562626c8e170691d6457fe4e8dfb45607a48d (diff) | |
download | linux-c1aac62f36c1e37ee81c9e09ee9ee733eef05dcb.tar.bz2 |
Merge tag 'docs-4.11' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
"A slightly quieter cycle for documentation this time around.

Three more DocBook template files have been converted to RST; only 21
to go. There are various build improvements and the usual array of
documentation improvements and fixes"

* tag 'docs-4.11' of git://git.lwn.net/linux: (44 commits)
docs / driver-api: Fix structure references in device_link.rst
PM / docs: Fix structure references in device.rst
Add a target to check broken external links in the Documentation
Documentation: Fix linux-api list typo
Documentation: DocBook/Makefile comment typo
Improve sparse documentation
Documentation: make Makefile.sphinx no-ops quieter
Documentation: DMA-ISA-LPC.txt
Documentation: input: fix path to input code definitions
docs: Remove the copyright year from conf.py
docs: Fix a warning in the Korean HOWTO.rst translation
PM / sleep / docs: Convert PM notifiers document to reST
PM / core / docs: Convert sleep states API document to reST
PM / core: Update kerneldoc comments in pm.h
doc-rst: Fix recursive make invocation from macros
doc-rst: Delete output of failed dot-SVG conversion
doc-rst: Break shell command sequences on failure
Documentation/sphinx: make targets independent of Sphinx work for HAVE_SPHINX=0
doc-rst: fixed cleandoc target when used with O=dir
Documentation/sphinx: prevent generation of .pyc files in the source tree
...
Diffstat (limited to 'Documentation')
53 files changed, 3151 insertions, 3420 deletions
diff --git a/Documentation/DMA-ISA-LPC.txt b/Documentation/DMA-ISA-LPC.txt
index b1a19835e907..c41331398752 100644
--- a/Documentation/DMA-ISA-LPC.txt
+++ b/Documentation/DMA-ISA-LPC.txt
@@ -42,7 +42,7 @@ requirements you pass the flag GFP_DMA to kmalloc.
 
 Unfortunately the memory available for ISA DMA is scarce so unless you
 allocate the memory during boot-up it's a good idea to also pass
-__GFP_REPEAT and __GFP_NOWARN to make the allocater try a bit harder.
+__GFP_REPEAT and __GFP_NOWARN to make the allocator try a bit harder.
 
 (This scarcity also means that you should allocate the buffer as
 early as possible and not release it until the driver is unloaded.)
diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile
index 5fd8f5effd0c..60a17b7da834 100644
--- a/Documentation/DocBook/Makefile
+++ b/Documentation/DocBook/Makefile
@@ -13,7 +13,7 @@ DOCBOOKS := z8530book.xml \
 	    gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
 	    genericirq.xml s390-drivers.xml scsi.xml \
 	    sh.xml regulator.xml w1.xml \
-	    writing_musb_glue_layer.xml iio.xml
+	    writing_musb_glue_layer.xml
 
 ifeq ($(DOCBOOKS),)
 
@@ -71,6 +71,7 @@ installmandocs: mandocs
 # no-op for the DocBook toolchain
 epubdocs:
 latexdocs:
+linkcheckdocs:
 
 ###
 #External programs used
@@ -272,6 +273,6 @@ cleandocs:
 	$(Q)rm -rf $(call objectify, $(clean-dirs))
 
 # Declare the contents of the .PHONY variable as phony. We keep that
-# information in a variable se we can use it in if_changed and friends.
+# information in a variable so we can use it in if_changed and friends.
.PHONY: $(PHONY) diff --git a/Documentation/DocBook/deviceiobook.tmpl b/Documentation/DocBook/deviceiobook.tmpl deleted file mode 100644 index 54199a0dcf9a..000000000000 --- a/Documentation/DocBook/deviceiobook.tmpl +++ /dev/null @@ -1,323 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" - "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> - -<book id="DoingIO"> - <bookinfo> - <title>Bus-Independent Device Accesses</title> - - <authorgroup> - <author> - <firstname>Matthew</firstname> - <surname>Wilcox</surname> - <affiliation> - <address> - <email>matthew@wil.cx</email> - </address> - </affiliation> - </author> - </authorgroup> - - <authorgroup> - <author> - <firstname>Alan</firstname> - <surname>Cox</surname> - <affiliation> - <address> - <email>alan@lxorguk.ukuu.org.uk</email> - </address> - </affiliation> - </author> - </authorgroup> - - <copyright> - <year>2001</year> - <holder>Matthew Wilcox</holder> - </copyright> - - <legalnotice> - <para> - This documentation is free software; you can redistribute - it and/or modify it under the terms of the GNU General Public - License as published by the Free Software Foundation; either - version 2 of the License, or (at your option) any later - version. - </para> - - <para> - This program is distributed in the hope that it will be - useful, but WITHOUT ANY WARRANTY; without even the implied - warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. - See the GNU General Public License for more details. - </para> - - <para> - You should have received a copy of the GNU General Public - License along with this program; if not, write to the Free - Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, - MA 02111-1307 USA - </para> - - <para> - For more details see the file COPYING in the source - distribution of Linux. 
- </para> - </legalnotice> - </bookinfo> - -<toc></toc> - - <chapter id="intro"> - <title>Introduction</title> - <para> - Linux provides an API which abstracts performing IO across all busses - and devices, allowing device drivers to be written independently of - bus type. - </para> - </chapter> - - <chapter id="bugs"> - <title>Known Bugs And Assumptions</title> - <para> - None. - </para> - </chapter> - - <chapter id="mmio"> - <title>Memory Mapped IO</title> - <sect1 id="getting_access_to_the_device"> - <title>Getting Access to the Device</title> - <para> - The most widely supported form of IO is memory mapped IO. - That is, a part of the CPU's address space is interpreted - not as accesses to memory, but as accesses to a device. Some - architectures define devices to be at a fixed address, but most - have some method of discovering devices. The PCI bus walk is a - good example of such a scheme. This document does not cover how - to receive such an address, but assumes you are starting with one. - Physical addresses are of type unsigned long. - </para> - - <para> - This address should not be used directly. Instead, to get an - address suitable for passing to the accessor functions described - below, you should call <function>ioremap</function>. - An address suitable for accessing the device will be returned to you. - </para> - - <para> - After you've finished using the device (say, in your module's - exit routine), call <function>iounmap</function> in order to return - the address space to the kernel. Most architectures allocate new - address space each time you call <function>ioremap</function>, and - they can run out unless you call <function>iounmap</function>. - </para> - </sect1> - - <sect1 id="accessing_the_device"> - <title>Accessing the device</title> - <para> - The part of the interface most used by drivers is reading and - writing memory-mapped registers on the device. 
Linux provides - interfaces to read and write 8-bit, 16-bit, 32-bit and 64-bit - quantities. Due to a historical accident, these are named byte, - word, long and quad accesses. Both read and write accesses are - supported; there is no prefetch support at this time. - </para> - - <para> - The functions are named <function>readb</function>, - <function>readw</function>, <function>readl</function>, - <function>readq</function>, <function>readb_relaxed</function>, - <function>readw_relaxed</function>, <function>readl_relaxed</function>, - <function>readq_relaxed</function>, <function>writeb</function>, - <function>writew</function>, <function>writel</function> and - <function>writeq</function>. - </para> - - <para> - Some devices (such as framebuffers) would like to use larger - transfers than 8 bytes at a time. For these devices, the - <function>memcpy_toio</function>, <function>memcpy_fromio</function> - and <function>memset_io</function> functions are provided. - Do not use memset or memcpy on IO addresses; they - are not guaranteed to copy data in order. - </para> - - <para> - The read and write functions are defined to be ordered. That is the - compiler is not permitted to reorder the I/O sequence. When the - ordering can be compiler optimised, you can use <function> - __readb</function> and friends to indicate the relaxed ordering. Use - this with care. - </para> - - <para> - While the basic functions are defined to be synchronous with respect - to each other and ordered with respect to each other the busses the - devices sit on may themselves have asynchronicity. In particular many - authors are burned by the fact that PCI bus writes are posted - asynchronously. A driver author must issue a read from the same - device to ensure that writes have occurred in the specific cases the - author cares. This kind of property cannot be hidden from driver - writers in the API. 
In some cases, the read used to flush the device - may be expected to fail (if the card is resetting, for example). In - that case, the read should be done from config space, which is - guaranteed to soft-fail if the card doesn't respond. - </para> - - <para> - The following is an example of flushing a write to a device when - the driver would like to ensure the write's effects are visible prior - to continuing execution. - </para> - -<programlisting> -static inline void -qla1280_disable_intrs(struct scsi_qla_host *ha) -{ - struct device_reg *reg; - - reg = ha->iobase; - /* disable risc and host interrupts */ - WRT_REG_WORD(&reg->ictrl, 0); - /* - * The following read will ensure that the above write - * has been received by the device before we return from this - * function. - */ - RD_REG_WORD(&reg->ictrl); - ha->flags.ints_enabled = 0; -} -</programlisting> - - <para> - In addition to write posting, on some large multiprocessing systems - (e.g. SGI Challenge, Origin and Altix machines) posted writes won't - be strongly ordered coming from different CPUs. Thus it's important - to properly protect parts of your driver that do memory-mapped writes - with locks and use the <function>mmiowb</function> to make sure they - arrive in the order intended. Issuing a regular <function>readX - </function> will also ensure write ordering, but should only be used - when the driver has to be sure that the write has actually arrived - at the device (not that it's simply ordered with respect to other - writes), since a full <function>readX</function> is a relatively - expensive operation. - </para> - - <para> - Generally, one should use <function>mmiowb</function> prior to - releasing a spinlock that protects regions using <function>writeb - </function> or similar functions that aren't surrounded by <function> - readb</function> calls, which will ensure ordering and flushing. 
The - following pseudocode illustrates what might occur if write ordering - isn't guaranteed via <function>mmiowb</function> or one of the - <function>readX</function> functions. - </para> - -<programlisting> -CPU A: spin_lock_irqsave(&dev_lock, flags) -CPU A: ... -CPU A: writel(newval, ring_ptr); -CPU A: spin_unlock_irqrestore(&dev_lock, flags) - ... -CPU B: spin_lock_irqsave(&dev_lock, flags) -CPU B: writel(newval2, ring_ptr); -CPU B: ... -CPU B: spin_unlock_irqrestore(&dev_lock, flags) -</programlisting> - - <para> - In the case above, newval2 could be written to ring_ptr before - newval. Fixing it is easy though: - </para> - -<programlisting> -CPU A: spin_lock_irqsave(&dev_lock, flags) -CPU A: ... -CPU A: writel(newval, ring_ptr); -CPU A: mmiowb(); /* ensure no other writes beat us to the device */ -CPU A: spin_unlock_irqrestore(&dev_lock, flags) - ... -CPU B: spin_lock_irqsave(&dev_lock, flags) -CPU B: writel(newval2, ring_ptr); -CPU B: ... -CPU B: mmiowb(); -CPU B: spin_unlock_irqrestore(&dev_lock, flags) -</programlisting> - - <para> - See tg3.c for a real world example of how to use <function>mmiowb - </function> - </para> - - <para> - PCI ordering rules also guarantee that PIO read responses arrive - after any outstanding DMA writes from that bus, since for some devices - the result of a <function>readb</function> call may signal to the - driver that a DMA transaction is complete. In many cases, however, - the driver may want to indicate that the next - <function>readb</function> call has no relation to any previous DMA - writes performed by the device. The driver can use - <function>readb_relaxed</function> for these cases, although only - some platforms will honor the relaxed semantics. Using the relaxed - read functions will provide significant performance benefits on - platforms that support it. The qla2xxx driver provides examples - of how to use <function>readX_relaxed</function>. 
In many cases, - a majority of the driver's <function>readX</function> calls can - safely be converted to <function>readX_relaxed</function> calls, since - only a few will indicate or depend on DMA completion. - </para> - </sect1> - - </chapter> - - <chapter id="port_space_accesses"> - <title>Port Space Accesses</title> - <sect1 id="port_space_explained"> - <title>Port Space Explained</title> - - <para> - Another form of IO commonly supported is Port Space. This is a - range of addresses separate to the normal memory address space. - Access to these addresses is generally not as fast as accesses - to the memory mapped addresses, and it also has a potentially - smaller address space. - </para> - - <para> - Unlike memory mapped IO, no preparation is required - to access port space. - </para> - - </sect1> - <sect1 id="accessing_port_space"> - <title>Accessing Port Space</title> - <para> - Accesses to this space are provided through a set of functions - which allow 8-bit, 16-bit and 32-bit accesses; also - known as byte, word and long. These functions are - <function>inb</function>, <function>inw</function>, - <function>inl</function>, <function>outb</function>, - <function>outw</function> and <function>outl</function>. - </para> - - <para> - Some variants are provided for these functions. Some devices - require that accesses to their ports are slowed down. This - functionality is provided by appending a <function>_p</function> - to the end of the function. There are also equivalents to memcpy. - The <function>ins</function> and <function>outs</function> - functions copy bytes, words or longs to the given port. 
- </para> - </sect1> - - </chapter> - - <chapter id="pubfunctions"> - <title>Public Functions Provided</title> -!Iarch/x86/include/asm/io.h -!Elib/pci_iomap.c - </chapter> - -</book> diff --git a/Documentation/DocBook/iio.tmpl b/Documentation/DocBook/iio.tmpl deleted file mode 100644 index e2ab6a1f223e..000000000000 --- a/Documentation/DocBook/iio.tmpl +++ /dev/null @@ -1,697 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" - "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> - -<book id="iioid"> - <bookinfo> - <title>Industrial I/O driver developer's guide </title> - - <authorgroup> - <author> - <firstname>Daniel</firstname> - <surname>Baluta</surname> - <affiliation> - <address> - <email>daniel.baluta@intel.com</email> - </address> - </affiliation> - </author> - </authorgroup> - - <copyright> - <year>2015</year> - <holder>Intel Corporation</holder> - </copyright> - - <legalnotice> - <para> - This documentation is free software; you can redistribute - it and/or modify it under the terms of the GNU General Public - License version 2. - </para> - </legalnotice> - </bookinfo> - - <toc></toc> - - <chapter id="intro"> - <title>Introduction</title> - <para> - The main purpose of the Industrial I/O subsystem (IIO) is to provide - support for devices that in some sense perform either analog-to-digital - conversion (ADC) or digital-to-analog conversion (DAC) or both. The aim - is to fill the gap between the somewhat similar hwmon and input - subsystems. - Hwmon is directed at low sample rate sensors used to monitor and - control the system itself, like fan speed control or temperature - measurement. Input is, as its name suggests, focused on human interaction - input devices (keyboard, mouse, touchscreen). In some cases there is - considerable overlap between these and IIO. 
- </para> - <para> - Devices that fall into this category include: - <itemizedlist> - <listitem> - analog to digital converters (ADCs) - </listitem> - <listitem> - accelerometers - </listitem> - <listitem> - capacitance to digital converters (CDCs) - </listitem> - <listitem> - digital to analog converters (DACs) - </listitem> - <listitem> - gyroscopes - </listitem> - <listitem> - inertial measurement units (IMUs) - </listitem> - <listitem> - color and light sensors - </listitem> - <listitem> - magnetometers - </listitem> - <listitem> - pressure sensors - </listitem> - <listitem> - proximity sensors - </listitem> - <listitem> - temperature sensors - </listitem> - </itemizedlist> - Usually these sensors are connected via SPI or I2C. A common use case of the - sensors devices is to have combined functionality (e.g. light plus proximity - sensor). - </para> - </chapter> - <chapter id='iiosubsys'> - <title>Industrial I/O core</title> - <para> - The Industrial I/O core offers: - <itemizedlist> - <listitem> - a unified framework for writing drivers for many different types of - embedded sensors. - </listitem> - <listitem> - a standard interface to user space applications manipulating sensors. - </listitem> - </itemizedlist> - The implementation can be found under <filename> - drivers/iio/industrialio-*</filename> - </para> - <sect1 id="iiodevice"> - <title> Industrial I/O devices </title> - -!Finclude/linux/iio/iio.h iio_dev -!Fdrivers/iio/industrialio-core.c iio_device_alloc -!Fdrivers/iio/industrialio-core.c iio_device_free -!Fdrivers/iio/industrialio-core.c iio_device_register -!Fdrivers/iio/industrialio-core.c iio_device_unregister - - <para> - An IIO device usually corresponds to a single hardware sensor and it - provides all the information needed by a driver handling a device. - Let's first have a look at the functionality embedded in an IIO - device then we will show how a device driver makes use of an IIO - device. 
- </para> - <para> - There are two ways for a user space application to interact - with an IIO driver. - <itemizedlist> - <listitem> - <filename>/sys/bus/iio/iio:deviceX/</filename>, this - represents a hardware sensor and groups together the data - channels of the same chip. - </listitem> - <listitem> - <filename>/dev/iio:deviceX</filename>, character device node - interface used for buffered data transfer and for events information - retrieval. - </listitem> - </itemizedlist> - </para> - A typical IIO driver will register itself as an I2C or SPI driver and will - create two routines, <function> probe </function> and <function> remove - </function>. At <function>probe</function>: - <itemizedlist> - <listitem>call <function>iio_device_alloc</function>, which allocates memory - for an IIO device. - </listitem> - <listitem> initialize IIO device fields with driver specific information - (e.g. device name, device channels). - </listitem> - <listitem>call <function> iio_device_register</function>, this registers the - device with the IIO core. After this call the device is ready to accept - requests from user space applications. - </listitem> - </itemizedlist> - At <function>remove</function>, we free the resources allocated in - <function>probe</function> in reverse order: - <itemizedlist> - <listitem><function>iio_device_unregister</function>, unregister the device - from the IIO core. - </listitem> - <listitem><function>iio_device_free</function>, free the memory allocated - for the IIO device. - </listitem> - </itemizedlist> - - <sect2 id="iioattr"> <title> IIO device sysfs interface </title> - <para> - Attributes are sysfs files used to expose chip info and also allowing - applications to set various configuration parameters. For device - with index X, attributes can be found under - <filename>/sys/bus/iio/iio:deviceX/ </filename> directory. - Common attributes are: - <itemizedlist> - <listitem><filename>name</filename>, description of the physical - chip. 
- </listitem> - <listitem><filename>dev</filename>, shows the major:minor pair - associated with <filename>/dev/iio:deviceX</filename> node. - </listitem> - <listitem><filename>sampling_frequency_available</filename>, - available discrete set of sampling frequency values for - device. - </listitem> - </itemizedlist> - Available standard attributes for IIO devices are described in the - <filename>Documentation/ABI/testing/sysfs-bus-iio </filename> file - in the Linux kernel sources. - </para> - </sect2> - <sect2 id="iiochannel"> <title> IIO device channels </title> -!Finclude/linux/iio/iio.h iio_chan_spec structure. - <para> - An IIO device channel is a representation of a data channel. An - IIO device can have one or multiple channels. For example: - <itemizedlist> - <listitem> - a thermometer sensor has one channel representing the - temperature measurement. - </listitem> - <listitem> - a light sensor with two channels indicating the measurements in - the visible and infrared spectrum. - </listitem> - <listitem> - an accelerometer can have up to 3 channels representing - acceleration on X, Y and Z axes. - </listitem> - </itemizedlist> - An IIO channel is described by the <type> struct iio_chan_spec - </type>. A thermometer driver for the temperature sensor in the - example above would have to describe its channel as follows: - <programlisting> - static const struct iio_chan_spec temp_channel[] = { - { - .type = IIO_TEMP, - .info_mask_separate = BIT(IIO_CHAN_INFO_PROCESSED), - }, - }; - - </programlisting> - Channel sysfs attributes exposed to userspace are specified in - the form of <emphasis>bitmasks</emphasis>. 
Depending on their - shared info, attributes can be set in one of the following masks: - <itemizedlist> - <listitem><emphasis>info_mask_separate</emphasis>, attributes will - be specific to this channel</listitem> - <listitem><emphasis>info_mask_shared_by_type</emphasis>, - attributes are shared by all channels of the same type</listitem> - <listitem><emphasis>info_mask_shared_by_dir</emphasis>, attributes - are shared by all channels of the same direction </listitem> - <listitem><emphasis>info_mask_shared_by_all</emphasis>, - attributes are shared by all channels</listitem> - </itemizedlist> - When there are multiple data channels per channel type we have two - ways to distinguish between them: - <itemizedlist> - <listitem> set <emphasis> .modified</emphasis> field of <type> - iio_chan_spec</type> to 1. Modifiers are specified using - <emphasis>.channel2</emphasis> field of the same - <type>iio_chan_spec</type> structure and are used to indicate a - physically unique characteristic of the channel such as its direction - or spectral response. For example, a light sensor can have two channels, - one for infrared light and one for both infrared and visible light. - </listitem> - <listitem> set <emphasis>.indexed </emphasis> field of - <type>iio_chan_spec</type> to 1. In this case the channel is - simply another instance with an index specified by the - <emphasis>.channel</emphasis> field. 
- </listitem> - </itemizedlist> - Here is how we can make use of the channel's modifiers: - <programlisting> - static const struct iio_chan_spec light_channels[] = { - { - .type = IIO_INTENSITY, - .modified = 1, - .channel2 = IIO_MOD_LIGHT_IR, - .info_mask_separate = BIT(IIO_CHAN_INFO_RAW), - .info_mask_shared = BIT(IIO_CHAN_INFO_SAMP_FREQ), - }, - { - .type = IIO_INTENSITY, - .modified = 1, - .channel2 = IIO_MOD_LIGHT_BOTH, - .info_mask_separate = BIT(IIO_CHAN_INFO_RAW), - .info_mask_shared = BIT(IIO_CHAN_INFO_SAMP_FREQ), - }, - { - .type = IIO_LIGHT, - .info_mask_separate = BIT(IIO_CHAN_INFO_PROCESSED), - .info_mask_shared = BIT(IIO_CHAN_INFO_SAMP_FREQ), - }, - - } - </programlisting> - This channel's definition will generate two separate sysfs files - for raw data retrieval: - <itemizedlist> - <listitem> - <filename>/sys/bus/iio/iio:deviceX/in_intensity_ir_raw</filename> - </listitem> - <listitem> - <filename>/sys/bus/iio/iio:deviceX/in_intensity_both_raw</filename> - </listitem> - </itemizedlist> - one file for processed data: - <itemizedlist> - <listitem> - <filename>/sys/bus/iio/iio:deviceX/in_illuminance_input - </filename> - </listitem> - </itemizedlist> - and one shared sysfs file for sampling frequency: - <itemizedlist> - <listitem> - <filename>/sys/bus/iio/iio:deviceX/sampling_frequency. - </filename> - </listitem> - </itemizedlist> - </para> - <para> - Here is how we can make use of the channel's indexing: - <programlisting> - static const struct iio_chan_spec light_channels[] = { - { - .type = IIO_VOLTAGE, - .indexed = 1, - .channel = 0, - .info_mask_separate = BIT(IIO_CHAN_INFO_RAW), - }, - { - .type = IIO_VOLTAGE, - .indexed = 1, - .channel = 1, - .info_mask_separate = BIT(IIO_CHAN_INFO_RAW), - }, - } - </programlisting> - This will generate two separate attributes files for raw data - retrieval: - <itemizedlist> - <listitem> - <filename>/sys/bus/iio/devices/iio:deviceX/in_voltage0_raw</filename>, - representing voltage measurement for channel 0. 
- </listitem> - <listitem> - <filename>/sys/bus/iio/devices/iio:deviceX/in_voltage1_raw</filename>, - representing voltage measurement for channel 1. - </listitem> - </itemizedlist> - </para> - </sect2> - </sect1> - - <sect1 id="iiobuffer"> <title> Industrial I/O buffers </title> -!Finclude/linux/iio/buffer.h iio_buffer -!Edrivers/iio/industrialio-buffer.c - - <para> - The Industrial I/O core offers a way for continuous data capture - based on a trigger source. Multiple data channels can be read at once - from <filename>/dev/iio:deviceX</filename> character device node, - thus reducing the CPU load. - </para> - - <sect2 id="iiobuffersysfs"> - <title>IIO buffer sysfs interface </title> - <para> - An IIO buffer has an associated attributes directory under <filename> - /sys/bus/iio/iio:deviceX/buffer/</filename>. Here are the existing - attributes: - <itemizedlist> - <listitem> - <emphasis>length</emphasis>, the total number of data samples - (capacity) that can be stored by the buffer. - </listitem> - <listitem> - <emphasis>enable</emphasis>, activate buffer capture. - </listitem> - </itemizedlist> - - </para> - </sect2> - <sect2 id="iiobuffersetup"> <title> IIO buffer setup </title> - <para>The meta information associated with a channel reading - placed in a buffer is called a <emphasis> scan element </emphasis>. - The important bits configuring scan elements are exposed to - userspace applications via the <filename> - /sys/bus/iio/iio:deviceX/scan_elements/</filename> directory. This - file contains attributes of the following form: - <itemizedlist> - <listitem><emphasis>enable</emphasis>, used for enabling a channel. - If and only if its attribute is non zero, then a triggered capture - will contain data samples for this channel. - </listitem> - <listitem><emphasis>type</emphasis>, description of the scan element - data storage within the buffer and hence the form in which it is - read from user space. 
Format is <emphasis> - [be|le]:[s|u]bits/storagebitsXrepeat[>>shift] </emphasis>. - <itemizedlist> - <listitem> <emphasis>be</emphasis> or <emphasis>le</emphasis>, specifies - big or little endian. - </listitem> - <listitem> - <emphasis>s </emphasis>or <emphasis>u</emphasis>, specifies if - signed (2's complement) or unsigned. - </listitem> - <listitem><emphasis>bits</emphasis>, is the number of valid data - bits. - </listitem> - <listitem><emphasis>storagebits</emphasis>, is the number of bits - (after padding) that it occupies in the buffer. - </listitem> - <listitem> - <emphasis>shift</emphasis>, if specified, is the shift that needs - to be applied prior to masking out unused bits. - </listitem> - <listitem> - <emphasis>repeat</emphasis>, specifies the number of bits/storagebits - repetitions. When the repeat element is 0 or 1, then the repeat - value is omitted. - </listitem> - </itemizedlist> - </listitem> - </itemizedlist> - For example, a driver for a 3-axis accelerometer with 12 bit - resolution where data is stored in two 8-bits registers as - follows: - <programlisting> - 7 6 5 4 3 2 1 0 - +---+---+---+---+---+---+---+---+ - |D3 |D2 |D1 |D0 | X | X | X | X | (LOW byte, address 0x06) - +---+---+---+---+---+---+---+---+ - - 7 6 5 4 3 2 1 0 - +---+---+---+---+---+---+---+---+ - |D11|D10|D9 |D8 |D7 |D6 |D5 |D4 | (HIGH byte, address 0x07) - +---+---+---+---+---+---+---+---+ - </programlisting> - - will have the following scan element type for each axis: - <programlisting> - $ cat /sys/bus/iio/devices/iio:device0/scan_elements/in_accel_y_type - le:s12/16>>4 - </programlisting> - A user space application will interpret data samples read from the - buffer as two byte little endian signed data, that needs a 4 bits - right shift before masking out the 12 valid bits of data. 
- </para> - <para> - For implementing buffer support a driver should initialize the following - fields in <type>iio_chan_spec</type> definition: - <programlisting> - struct iio_chan_spec { - /* other members */ - int scan_index - struct { - char sign; - u8 realbits; - u8 storagebits; - u8 shift; - u8 repeat; - enum iio_endian endianness; - } scan_type; - }; - </programlisting> - The driver implementing the accelerometer described above will - have the following channel definition: - <programlisting> - struct struct iio_chan_spec accel_channels[] = { - { - .type = IIO_ACCEL, - .modified = 1, - .channel2 = IIO_MOD_X, - /* other stuff here */ - .scan_index = 0, - .scan_type = { - .sign = 's', - .realbits = 12, - .storagebits = 16, - .shift = 4, - .endianness = IIO_LE, - }, - } - /* similar for Y (with channel2 = IIO_MOD_Y, scan_index = 1) - * and Z (with channel2 = IIO_MOD_Z, scan_index = 2) axis - */ - } - </programlisting> - </para> - <para> - Here <emphasis> scan_index </emphasis> defines the order in which - the enabled channels are placed inside the buffer. Channels with a lower - scan_index will be placed before channels with a higher index. Each - channel needs to have a unique scan_index. - </para> - <para> - Setting scan_index to -1 can be used to indicate that the specific - channel does not support buffered capture. In this case no entries will - be created for the channel in the scan_elements directory. - </para> - </sect2> - </sect1> - - <sect1 id="iiotrigger"> <title> Industrial I/O triggers </title> -!Finclude/linux/iio/trigger.h iio_trigger -!Edrivers/iio/industrialio-trigger.c - <para> - In many situations it is useful for a driver to be able to - capture data based on some external event (trigger) as opposed - to periodically polling for data. An IIO trigger can be provided - by a device driver that also has an IIO device based on hardware - generated events (e.g. 
data ready or threshold exceeded) or - provided by a separate driver from an independent interrupt - source (e.g. GPIO line connected to some external system, timer - interrupt or user space writing a specific file in sysfs). A - trigger may initiate data capture for a number of sensors and - also it may be completely unrelated to the sensor itself. - </para> - - <sect2 id="iiotrigsysfs"> <title> IIO trigger sysfs interface </title> - There are two locations in sysfs related to triggers: - <itemizedlist> - <listitem><filename>/sys/bus/iio/devices/triggerY</filename>, - this file is created once an IIO trigger is registered with - the IIO core and corresponds to trigger with index Y. Because - triggers can be very different depending on type there are few - standard attributes that we can describe here: - <itemizedlist> - <listitem> - <emphasis>name</emphasis>, trigger name that can be later - used for association with a device. - </listitem> - <listitem> - <emphasis>sampling_frequency</emphasis>, some timer based - triggers use this attribute to specify the frequency for - trigger calls. - </listitem> - </itemizedlist> - </listitem> - <listitem> - <filename>/sys/bus/iio/devices/iio:deviceX/trigger/</filename>, this - directory is created once the device supports a triggered - buffer. We can associate a trigger with our device by writing - the trigger's name in the <filename>current_trigger</filename> file. - </listitem> - </itemizedlist> - </sect2> - - <sect2 id="iiotrigattr"> <title> IIO trigger setup</title> - - <para> - Let's see a simple example of how to setup a trigger to be used - by a driver. 
-
-      <programlisting>
-      struct iio_trigger_ops trigger_ops = {
-              .set_trigger_state = sample_trigger_state,
-              .validate_device = sample_validate_device,
-      }
-
-      struct iio_trigger *trig;
-
-      /* first, allocate memory for our trigger */
-      trig = iio_trigger_alloc(dev, "trig-%s-%d", name, idx);
-
-      /* setup trigger operations field */
-      trig->ops = &trigger_ops;
-
-      /* now register the trigger with the IIO core */
-      iio_trigger_register(trig);
-      </programlisting>
-    </para>
-    </sect2>
-
-    <sect2 id="iiotrigsetup"> <title> IIO trigger ops</title>
-!Finclude/linux/iio/trigger.h iio_trigger_ops
-    <para>
-      Notice that a trigger has a set of operations attached:
-      <itemizedlist>
-        <listitem>
-          <function>set_trigger_state</function>, switch the trigger on/off
-          on demand.
-        </listitem>
-        <listitem>
-          <function>validate_device</function>, function to validate the
-          device when the current trigger gets changed.
-        </listitem>
-      </itemizedlist>
-    </para>
-    </sect2>
-  </sect1>
-  <sect1 id="iiotriggered_buffer">
-    <title> Industrial I/O triggered buffers </title>
-    <para>
-      Now that we know what buffers and triggers are let's see how they
-      work together.
-    </para>
-    <sect2 id="iiotrigbufsetup"> <title> IIO triggered buffer setup</title>
-!Edrivers/iio/buffer/industrialio-triggered-buffer.c
-!Finclude/linux/iio/iio.h iio_buffer_setup_ops
-
-
-    <para>
-      A typical triggered buffer setup looks like this:
-      <programlisting>
-      const struct iio_buffer_setup_ops sensor_buffer_setup_ops = {
-              .preenable = sensor_buffer_preenable,
-              .postenable = sensor_buffer_postenable,
-              .postdisable = sensor_buffer_postdisable,
-              .predisable = sensor_buffer_predisable,
-      };
-
-      irqreturn_t sensor_iio_pollfunc(int irq, void *p)
-      {
-              pf->timestamp = iio_get_time_ns((struct indio_dev *)p);
-              return IRQ_WAKE_THREAD;
-      }
-
-      irqreturn_t sensor_trigger_handler(int irq, void *p)
-      {
-              u16 buf[8];
-              int i = 0;
-
-              /* read data for each active channel */
-              for_each_set_bit(bit, active_scan_mask, masklength)
-                      buf[i++] = sensor_get_data(bit)
-
-              iio_push_to_buffers_with_timestamp(indio_dev, buf, timestamp);
-
-              iio_trigger_notify_done(trigger);
-              return IRQ_HANDLED;
-      }
-
-      /* setup triggered buffer, usually in probe function */
-      iio_triggered_buffer_setup(indio_dev, sensor_iio_polfunc,
-                                 sensor_trigger_handler,
-                                 sensor_buffer_setup_ops);
-      </programlisting>
-    </para>
-    The important things to notice here are:
-    <itemizedlist>
-      <listitem><function> iio_buffer_setup_ops</function>, the buffer setup
-      functions to be called at predefined points in the buffer configuration
-      sequence (e.g. before enable, after disable). If not specified, the
-      IIO core uses the default <type>iio_triggered_buffer_setup_ops</type>.
-      </listitem>
-      <listitem><function>sensor_iio_pollfunc</function>, the function that
-      will be used as top half of poll function. It should do as little
-      processing as possible, because it runs in interrupt context. The most
-      common operation is recording of the current timestamp and for this reason
-      one can use the IIO core defined <function>iio_pollfunc_store_time
-      </function> function.
-      </listitem>
-      <listitem><function>sensor_trigger_handler</function>, the function that
-      will be used as bottom half of the poll function. This runs in the
-      context of a kernel thread and all the processing takes place here.
-      It usually reads data from the device and stores it in the internal
-      buffer together with the timestamp recorded in the top half.
-      </listitem>
-    </itemizedlist>
-    </sect2>
-  </sect1>
-  </chapter>
-  <chapter id='iioresources'>
-    <title> Resources </title>
-      IIO core may change during time so the best documentation to read is the
-      source code. There are several locations where you should look:
-      <itemizedlist>
-        <listitem>
-          <filename>drivers/iio/</filename>, contains the IIO core plus
-          and directories for each sensor type (e.g. accel, magnetometer,
-          etc.)
-        </listitem>
-        <listitem>
-          <filename>include/linux/iio/</filename>, contains the header
-          files, nice to read for the internal kernel interfaces.
-        </listitem>
-        <listitem>
-          <filename>include/uapi/linux/iio/</filename>, contains files to be
-          used by user space applications.
-        </listitem>
-        <listitem>
-          <filename>tools/iio/</filename>, contains tools for rapidly
-          testing buffers, events and device creation.
-        </listitem>
-        <listitem>
-          <filename>drivers/staging/iio/</filename>, contains code for some
-          drivers or experimental features that are not yet mature enough
-          to be moved out.
-        </listitem>
-      </itemizedlist>
-    <para>
-      Besides the code, there are some good online documentation sources:
-      <itemizedlist>
-        <listitem>
-          <ulink url="http://marc.info/?l=linux-iio"> Industrial I/O mailing
-          list </ulink>
-        </listitem>
-        <listitem>
-          <ulink url="http://wiki.analog.com/software/linux/docs/iio/iio">
-          Analog Device IIO wiki page </ulink>
-        </listitem>
-        <listitem>
-          <ulink url="https://fosdem.org/2015/schedule/event/iiosdr/">
-          Using the Linux IIO framework for SDR, Lars-Peter Clausen's
-          presentation at FOSDEM </ulink>
-        </listitem>
-      </itemizedlist>
-    </para>
-  </chapter>
-</book>
-
-<!--
-vim: softtabstop=2:shiftwidth=2:expandtab:textwidth=72
--->
diff --git a/Documentation/DocBook/regulator.tmpl b/Documentation/DocBook/regulator.tmpl
deleted file mode 100644
index 3b08a085d2c7..000000000000
--- a/Documentation/DocBook/regulator.tmpl
+++ /dev/null
@@ -1,304 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
-        "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
-
-<book id="regulator-api">
- <bookinfo>
-  <title>Voltage and current regulator API</title>
-
-  <authorgroup>
-   <author>
-    <firstname>Liam</firstname>
-    <surname>Girdwood</surname>
-    <affiliation>
-     <address>
-      <email>lrg@slimlogic.co.uk</email>
-     </address>
-    </affiliation>
-   </author>
-   <author>
-    <firstname>Mark</firstname>
-    <surname>Brown</surname>
-    <affiliation>
-     <orgname>Wolfson Microelectronics</orgname>
-     <address>
-      <email>broonie@opensource.wolfsonmicro.com</email>
-     </address>
-    </affiliation>
-   </author>
-  </authorgroup>
-
-  <copyright>
-   <year>2007-2008</year>
-   <holder>Wolfson Microelectronics</holder>
-  </copyright>
-  <copyright>
-   <year>2008</year>
-   <holder>Liam Girdwood</holder>
-  </copyright>
-
-  <legalnotice>
-   <para>
-     This documentation is free software; you can redistribute
-     it and/or modify it under the terms of the GNU General Public
-     License version 2 as published by the Free Software Foundation.
-   </para>
-
-   <para>
-     This program is distributed in the hope that it will be
-     useful, but WITHOUT ANY WARRANTY; without even the implied
-     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-     See the GNU General Public License for more details.
-   </para>
-
-   <para>
-     You should have received a copy of the GNU General Public
-     License along with this program; if not, write to the Free
-     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
-     MA 02111-1307 USA
-   </para>
-
-   <para>
-     For more details see the file COPYING in the source
-     distribution of Linux.
-   </para>
-  </legalnotice>
- </bookinfo>
-
-<toc></toc>
-
-  <chapter id="intro">
-    <title>Introduction</title>
-    <para>
-        This framework is designed to provide a standard kernel
-        interface to control voltage and current regulators.
-    </para>
-    <para>
-        The intention is to allow systems to dynamically control
-        regulator power output in order to save power and prolong
-        battery life. This applies to both voltage regulators (where
-        voltage output is controllable) and current sinks (where current
-        limit is controllable).
-    </para>
-    <para>
-        Note that additional (and currently more complete) documentation
-        is available in the Linux kernel source under
-        <filename>Documentation/power/regulator</filename>.
-    </para>
-
-    <sect1 id="glossary">
-       <title>Glossary</title>
-       <para>
-        The regulator API uses a number of terms which may not be
-        familiar:
-       </para>
-       <glossary>
-
-         <glossentry>
-           <glossterm>Regulator</glossterm>
-           <glossdef>
-             <para>
-        Electronic device that supplies power to other devices. Most
-        regulators can enable and disable their output and some can also
-        control their output voltage or current.
-             </para>
-           </glossdef>
-         </glossentry>
-
-         <glossentry>
-           <glossterm>Consumer</glossterm>
-           <glossdef>
-             <para>
-        Electronic device which consumes power provided by a regulator.
-        These may either be static, requiring only a fixed supply, or
-        dynamic, requiring active management of the regulator at
-        runtime.
-             </para>
-           </glossdef>
-         </glossentry>
-
-         <glossentry>
-           <glossterm>Power Domain</glossterm>
-           <glossdef>
-             <para>
-        The electronic circuit supplied by a given regulator, including
-        the regulator and all consumer devices. The configuration of
-        the regulator is shared between all the components in the
-        circuit.
-             </para>
-           </glossdef>
-         </glossentry>
-
-         <glossentry>
-           <glossterm>Power Management Integrated Circuit</glossterm>
-           <acronym>PMIC</acronym>
-           <glossdef>
-             <para>
-        An IC which contains numerous regulators and often also other
-        subsystems. In an embedded system the primary PMIC is often
-        equivalent to a combination of the PSU and southbridge in a
-        desktop system.
-             </para>
-           </glossdef>
-         </glossentry>
-       </glossary>
-     </sect1>
-  </chapter>
-
-  <chapter id="consumer">
-     <title>Consumer driver interface</title>
-     <para>
-       This offers a similar API to the kernel clock framework.
-       Consumer drivers use <link
-       linkend='API-regulator-get'>get</link> and <link
-       linkend='API-regulator-put'>put</link> operations to acquire and
-       release regulators. Functions are
-       provided to <link linkend='API-regulator-enable'>enable</link>
-       and <link linkend='API-regulator-disable'>disable</link> the
-       regulator and to get and set the runtime parameters of the
-       regulator.
-     </para>
-     <para>
-       When requesting regulators consumers use symbolic names for their
-       supplies, such as "Vcc", which are mapped into actual regulator
-       devices by the machine interface.
-     </para>
-     <para>
-        A stub version of this API is provided when the regulator
-        framework is not in use in order to minimise the need to use
-        ifdefs.
-     </para>
-
-     <sect1 id="consumer-enable">
-       <title>Enabling and disabling</title>
-       <para>
-         The regulator API provides reference counted enabling and
-         disabling of regulators. Consumer devices use the <function><link
-         linkend='API-regulator-enable'>regulator_enable</link></function>
-         and <function><link
-         linkend='API-regulator-disable'>regulator_disable</link>
-         </function> functions to enable and disable regulators. Calls
-         to the two functions must be balanced.
-       </para>
-       <para>
-         Note that since multiple consumers may be using a regulator and
-         machine constraints may not allow the regulator to be disabled
-         there is no guarantee that calling
-         <function>regulator_disable</function> will actually cause the
-         supply provided by the regulator to be disabled. Consumer
-         drivers should assume that the regulator may be enabled at all
-         times.
-       </para>
-     </sect1>
-
-     <sect1 id="consumer-config">
-       <title>Configuration</title>
-       <para>
-         Some consumer devices may need to be able to dynamically
-         configure their supplies. For example, MMC drivers may need to
-         select the correct operating voltage for their cards. This may
-         be done while the regulator is enabled or disabled.
-       </para>
-       <para>
-         The <function><link
-         linkend='API-regulator-set-voltage'>regulator_set_voltage</link>
-         </function> and <function><link
-         linkend='API-regulator-set-current-limit'
-         >regulator_set_current_limit</link>
-         </function> functions provide the primary interface for this.
-         Both take ranges of voltages and currents, supporting drivers
-         that do not require a specific value (eg, CPU frequency scaling
-         normally permits the CPU to use a wider range of supply
-         voltages at lower frequencies but does not require that the
-         supply voltage be lowered). Where an exact value is required
-         both minimum and maximum values should be identical.
-       </para>
-     </sect1>
-
-     <sect1 id="consumer-callback">
-       <title>Callbacks</title>
-       <para>
-         Callbacks may also be <link
-         linkend='API-regulator-register-notifier'>registered</link>
-         for events such as regulation failures.
-       </para>
-     </sect1>
-   </chapter>
-
-   <chapter id="driver">
-     <title>Regulator driver interface</title>
-     <para>
-       Drivers for regulator chips <link
-       linkend='API-regulator-register'>register</link> the regulators
-       with the regulator core, providing operations structures to the
-       core. A <link
-       linkend='API-regulator-notifier-call-chain'>notifier</link> interface
-       allows error conditions to be reported to the core.
-     </para>
-     <para>
-       Registration should be triggered by explicit setup done by the
-       platform, supplying a <link
-       linkend='API-struct-regulator-init-data'>struct
-       regulator_init_data</link> for the regulator containing
-       <link linkend='machine-constraint'>constraint</link> and
-       <link linkend='machine-supply'>supply</link> information.
-     </para>
-   </chapter>
-
-   <chapter id="machine">
-     <title>Machine interface</title>
-     <para>
-       This interface provides a way to define how regulators are
-       connected to consumers on a given system and what the valid
-       operating parameters are for the system.
-     </para>
-
-     <sect1 id="machine-supply">
-       <title>Supplies</title>
-       <para>
-         Regulator supplies are specified using <link
-         linkend='API-struct-regulator-consumer-supply'>struct
-         regulator_consumer_supply</link>. This is done at
-         <link linkend='driver'>driver registration
-         time</link> as part of the machine constraints.
-       </para>
-     </sect1>
-
-     <sect1 id="machine-constraint">
-       <title>Constraints</title>
-       <para>
-         As well as defining the connections the machine interface
-         also provides constraints defining the operations that
-         clients are allowed to perform and the parameters that may be
-         set. This is required since generally regulator devices will
-         offer more flexibility than it is safe to use on a given
-         system, for example supporting higher supply voltages than the
-         consumers are rated for.
-       </para>
-       <para>
-         This is done at <link linkend='driver'>driver
-         registration time</link> by providing a <link
-         linkend='API-struct-regulation-constraints'>struct
-         regulation_constraints</link>.
-       </para>
-       <para>
-         The constraints may also specify an initial configuration for the
-         regulator in the constraints, which is particularly useful for
-         use with static consumers.
-       </para>
-     </sect1>
-  </chapter>
-
-  <chapter id="api">
-    <title>API reference</title>
-    <para>
-      Due to limitations of the kernel documentation framework and the
-      existing layout of the source code the entire regulator API is
-      documented here.
-    </para>
-!Iinclude/linux/regulator/consumer.h
-!Iinclude/linux/regulator/machine.h
-!Iinclude/linux/regulator/driver.h
-!Edrivers/regulator/core.c
-  </chapter>
-</book>
diff --git a/Documentation/Makefile.sphinx b/Documentation/Makefile.sphinx
index 707c65337ebf..bcf529f6cf9b 100644
--- a/Documentation/Makefile.sphinx
+++ b/Documentation/Makefile.sphinx
@@ -43,7 +43,7 @@ ALLSPHINXOPTS = $(KERNELDOC_CONF) $(PAPEROPT_$(PAPER)) $(SPHINXOPTS)
 I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .

 # commands; the 'cmd' from scripts/Kbuild.include is not *loopable*
-loop_cmd = $(echo-cmd) $(cmd_$(1))
+loop_cmd = $(echo-cmd) $(cmd_$(1)) || exit;

 # $2 sphinx builder e.g. "html"
 # $3 name of the build subfolder / e.g. "media", used as:
@@ -54,7 +54,8 @@ loop_cmd = $(echo-cmd) $(cmd_$(1))
 # e.g. "media" for the linux-tv book-set at ./Documentation/media

 quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
-      cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/media $2;\
+      cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/media $2 && \
+	PYTHONDONTWRITEBYTECODE=1 \
 	BUILDDIR=$(abspath $(BUILDDIR)) SPHINX_CONF=$(abspath $(srctree)/$(src)/$5/$(SPHINX_CONF)) \
 	$(SPHINXBUILD) \
 	-b $2 \
@@ -63,13 +64,16 @@ quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
 	-D version=$(KERNELVERSION) -D release=$(KERNELRELEASE) \
 	$(ALLSPHINXOPTS) \
 	$(abspath $(srctree)/$(src)/$5) \
-	$(abspath $(BUILDDIR)/$3/$4);
+	$(abspath $(BUILDDIR)/$3/$4)

 htmldocs:
-	@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,html,$(var),,$(var)))
+	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,html,$(var),,$(var)))
+
+linkcheckdocs:
+	@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,linkcheck,$(var),,$(var)))

 latexdocs:
-	@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,latex,$(var),latex,$(var)))
+	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,latex,$(var),latex,$(var)))

 ifeq ($(HAVE_PDFLATEX),0)

@@ -80,27 +84,34 @@ pdfdocs:
 else # HAVE_PDFLATEX

 pdfdocs: latexdocs
-	$(foreach var,$(SPHINXDIRS), $(MAKE) PDFLATEX=$(PDFLATEX) LATEXOPTS="$(LATEXOPTS)" -C $(BUILDDIR)/$(var)/latex;)
+	$(foreach var,$(SPHINXDIRS), $(MAKE) PDFLATEX=$(PDFLATEX) LATEXOPTS="$(LATEXOPTS)" -C $(BUILDDIR)/$(var)/latex || exit;)

 endif # HAVE_PDFLATEX

 epubdocs:
-	@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,epub,$(var),epub,$(var)))
+	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,epub,$(var),epub,$(var)))

 xmldocs:
-	@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,xml,$(var),xml,$(var)))
+	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,xml,$(var),xml,$(var)))
+
+endif # HAVE_SPHINX
+
+# The following targets are independent of HAVE_SPHINX, and the rules should
+# work or silently pass without Sphinx.

 # no-ops for the Sphinx toolchain
 sgmldocs:
+	@:
 psdocs:
+	@:
 mandocs:
+	@:
 installmandocs:
+	@:

 cleandocs:
 	$(Q)rm -rf $(BUILDDIR)
-	$(Q)$(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) -C Documentation/media clean
-
-endif # HAVE_SPHINX
+	$(Q)$(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/media clean

 dochelp:
 	@echo ' Linux kernel internal documentation in different formats (Sphinx):'
@@ -109,6 +120,7 @@ dochelp:
 	@echo ' pdfdocs - PDF'
 	@echo ' epubdocs - EPUB'
 	@echo ' xmldocs - XML'
+	@echo ' linkcheckdocs - check for broken external links (will connect to external hosts)'
 	@echo ' cleandocs - clean all generated files'
 	@echo
 	@echo ' make SPHINXDIRS="s1 s2" [target] Generate only docs of folder s1, s2'
diff --git a/Documentation/admin-guide/README.rst b/Documentation/admin-guide/README.rst
index 1b6dfb2b3adb..697a00ccec25 100644
--- a/Documentation/admin-guide/README.rst
+++ b/Documentation/admin-guide/README.rst
@@ -17,7 +17,7 @@ What is Linux?
   loading, shared copy-on-write executables, proper memory management,
   and multistack networking including IPv4 and IPv6.

-  It is distributed under the GNU General Public License - see the
+  It is distributed under the GNU General Public License v2 - see the
   accompanying COPYING file for more details.

 On what hardware does it run?
@@ -236,7 +236,7 @@ Configuring the kernel

    - Having unnecessary drivers will make the kernel bigger, and can
      under some circumstances lead to problems: probing for a
-     nonexistent controller card may confuse your other controllers
+     nonexistent controller card may confuse your other controllers.

    - A kernel with math-emulation compiled in will still use the
      coprocessor if one is present: the math emulation will just
diff --git a/Documentation/admin-guide/dynamic-debug-howto.rst b/Documentation/admin-guide/dynamic-debug-howto.rst
index 88adcfdf5b2b..12278a926370 100644
--- a/Documentation/admin-guide/dynamic-debug-howto.rst
+++ b/Documentation/admin-guide/dynamic-debug-howto.rst
@@ -93,9 +93,9 @@ Command Language Reference
 At the lexical level, a command comprises a sequence of words separated
 by spaces or tabs. So these are all equivalent::

-  nullarbor:~ # echo -c 'file svcsock.c line 1603 +p' >
+  nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
 				<debugfs>/dynamic_debug/control
-  nullarbor:~ # echo -c ' file svcsock.c line 1603 +p ' >
+  nullarbor:~ # echo -n ' file svcsock.c line 1603 +p ' >
 				<debugfs>/dynamic_debug/control
   nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
 				<debugfs>/dynamic_debug/control
diff --git a/Documentation/block/pr.txt b/Documentation/block/pr.txt
index d3eb1ca65051..ac9b8e70e64b 100644
--- a/Documentation/block/pr.txt
+++ b/Documentation/block/pr.txt
@@ -90,7 +90,7 @@ and thus removes any access restriction implied by it.
 4. IOC_PR_PREEMPT

 This ioctl command releases the existing reservation referred to by
-old_key and replaces it with a a new reservation of type for the
+old_key and replaces it with a new reservation of type for the
 reservation key new_key.

diff --git a/Documentation/cgroup-v1/cpusets.txt b/Documentation/cgroup-v1/cpusets.txt
index e5ac5da86682..8402dd6de8df 100644
--- a/Documentation/cgroup-v1/cpusets.txt
+++ b/Documentation/cgroup-v1/cpusets.txt
@@ -615,7 +615,7 @@ to allocate a page of memory for that task.

 If a cpuset has its 'cpuset.cpus' modified, then each task in that cpuset
 will have its allowed CPU placement changed immediately. Similarly,
-if a task's pid is written to another cpusets 'cpuset.tasks' file, then its
+if a task's pid is written to another cpuset's 'tasks' file, then its
 allowed CPU placement is changed immediately. If such a task had been
 bound to some subset of its cpuset using the sched_setaffinity() call,
 the task will be allowed to run on any CPU allowed in its new cpuset,
diff --git a/Documentation/conf.py b/Documentation/conf.py
index 1ac958c0333d..f6823cf01275 100644
--- a/Documentation/conf.py
+++ b/Documentation/conf.py
@@ -58,7 +58,7 @@ master_doc = 'index'

 # General information about the project.
 project = 'The Linux Kernel'
-copyright = '2016, The kernel development community'
+copyright = 'The kernel development community'
 author = 'The kernel development community'

 # The version info for the project you're documenting, acts as replacement for
diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
new file mode 100644
index 000000000000..4a50ab7817f7
--- /dev/null
+++ b/Documentation/core-api/cpu_hotplug.rst
@@ -0,0 +1,372 @@
+=========================
+CPU hotplug in the Kernel
+=========================
+
+:Date: December, 2016
+:Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
+          Rusty Russell <rusty@rustcorp.com.au>,
+          Srivatsa Vaddagiri <vatsa@in.ibm.com>,
+          Ashok Raj <ashok.raj@intel.com>,
+          Joel Schopp <jschopp@austin.ibm.com>
+
+Introduction
+============
+
+Modern advances in system architectures have introduced advanced error
+reporting and correction capabilities in processors. There are couple OEMS that
+support NUMA hardware which are hot pluggable as well, where physical node
+insertion and removal require support for CPU hotplug.
+
+Such advances require CPUs available to a kernel to be removed either for
+provisioning reasons, or for RAS purposes to keep an offending CPU off
+system execution path. Hence the need for CPU hotplug support in the
+Linux kernel.
+
+A more novel use of CPU-hotplug support is its use today in suspend resume
+support for SMP. Dual-core and HT support makes even a laptop run SMP kernels
+which didn't support these methods.
+
+
+Command Line Switches
+=====================
+``maxcpus=n``
+  Restrict boot time CPUs to *n*. Say if you have fourV CPUs, using
+  ``maxcpus=2`` will only boot two. You can choose to bring the
+  other CPUs later online.
+
+``nr_cpus=n``
+  Restrict the total amount CPUs the kernel will support. If the number
+  supplied here is lower than the number of physically available CPUs than
+  those CPUs can not be brought online later.
+
+``additional_cpus=n``
+  Use this to limit hotpluggable CPUs. This option sets
+  ``cpu_possible_mask = cpu_present_mask + additional_cpus``
+
+  This option is limited to the IA64 architecture.
+
+``possible_cpus=n``
+  This option sets ``possible_cpus`` bits in ``cpu_possible_mask``.
+
+  This option is limited to the X86 and S390 architecture.
+
+``cede_offline={"off","on"}``
+  Use this option to disable/enable putting offlined processors to an extended
+  ``H_CEDE`` state on supported pseries platforms. If nothing is specified,
+  ``cede_offline`` is set to "on".
+
+  This option is limited to the PowerPC architecture.
+
+``cpu0_hotplug``
+  Allow to shutdown CPU0.
+
+  This option is limited to the X86 architecture.
+
+CPU maps
+========
+
+``cpu_possible_mask``
+  Bitmap of possible CPUs that can ever be available in the
+  system. This is used to allocate some boot time memory for per_cpu variables
+  that aren't designed to grow/shrink as CPUs are made available or removed.
+  Once set during boot time discovery phase, the map is static, i.e no bits
+  are added or removed anytime. Trimming it accurately for your system needs
+  upfront can save some boot time memory.
+
+``cpu_online_mask``
+  Bitmap of all CPUs currently online. Its set in ``__cpu_up()``
+  after a CPU is available for kernel scheduling and ready to receive
+  interrupts from devices. Its cleared when a CPU is brought down using
+  ``__cpu_disable()``, before which all OS services including interrupts are
+  migrated to another target CPU.
+
+``cpu_present_mask``
+  Bitmap of CPUs currently present in the system. Not all
+  of them may be online. When physical hotplug is processed by the relevant
+  subsystem (e.g ACPI) can change and new bit either be added or removed
+  from the map depending on the event is hot-add/hot-remove. There are currently
+  no locking rules as of now. Typical usage is to init topology during boot,
+  at which time hotplug is disabled.
+
+You really don't need to manipulate any of the system CPU maps. They should
+be read-only for most use. When setting up per-cpu resources almost always use
+``cpu_possible_mask`` or ``for_each_possible_cpu()`` to iterate. To macro
+``for_each_cpu()`` can be used to iterate over a custom CPU mask.
+
+Never use anything other than ``cpumask_t`` to represent bitmap of CPUs.
+
+
+Using CPU hotplug
+=================
+The kernel option *CONFIG_HOTPLUG_CPU* needs to be enabled. It is currently
+available on multiple architectures including ARM, MIPS, PowerPC and X86. The
+configuration is done via the sysfs interface: ::
+
+ $ ls -lh /sys/devices/system/cpu
+ total 0
+ drwxr-xr-x  9 root root    0 Dec 21 16:33 cpu0
+ drwxr-xr-x  9 root root    0 Dec 21 16:33 cpu1
+ drwxr-xr-x  9 root root    0 Dec 21 16:33 cpu2
+ drwxr-xr-x  9 root root    0 Dec 21 16:33 cpu3
+ drwxr-xr-x  9 root root    0 Dec 21 16:33 cpu4
+ drwxr-xr-x  9 root root    0 Dec 21 16:33 cpu5
+ drwxr-xr-x  9 root root    0 Dec 21 16:33 cpu6
+ drwxr-xr-x  9 root root    0 Dec 21 16:33 cpu7
+ drwxr-xr-x  2 root root    0 Dec 21 16:33 hotplug
+ -r--r--r--  1 root root 4.0K Dec 21 16:33 offline
+ -r--r--r--  1 root root 4.0K Dec 21 16:33 online
+ -r--r--r--  1 root root 4.0K Dec 21 16:33 possible
+ -r--r--r--  1 root root 4.0K Dec 21 16:33 present
+
+The files *offline*, *online*, *possible*, *present* represent the CPU masks.
+Each CPU folder contains an *online* file which controls the logical on (1) and
+off (0) state. To logically shutdown CPU4: ::
+
+ $ echo 0 > /sys/devices/system/cpu/cpu4/online
+  smpboot: CPU 4 is now offline
+
+Once the CPU is shutdown, it will be removed from */proc/interrupts*,
+*/proc/cpuinfo* and should also not be shown visible by the *top* command. To
+bring CPU4 back online: ::
+
+ $ echo 1 > /sys/devices/system/cpu/cpu4/online
+ smpboot: Booting Node 0 Processor 4 APIC 0x1
+
+The CPU is usable again. This should work on all CPUs. CPU0 is often special
+and excluded from CPU hotplug. On X86 the kernel option
+*CONFIG_BOOTPARAM_HOTPLUG_CPU0* has to be enabled in order to be able to
+shutdown CPU0. Alternatively the kernel command option *cpu0_hotplug* can be
+used. Some known dependencies of CPU0:
+
+* Resume from hibernate/suspend. Hibernate/suspend will fail if CPU0 is offline.
+* PIC interrupts. CPU0 can't be removed if a PIC interrupt is detected.
+
+Please let Fenghua Yu <fenghua.yu@intel.com> know if you find any dependencies
+on CPU0.
+
+The CPU hotplug coordination
+============================
+
+The offline case
+----------------
+Once a CPU has been logically shutdown the teardown callbacks of registered
+hotplug states will be invoked, starting with ``CPUHP_ONLINE`` and terminating
+at state ``CPUHP_OFFLINE``. This includes:
+
+* If tasks are frozen due to a suspend operation then *cpuhp_tasks_frozen*
+  will be set to true.
+* All processes are migrated away from this outgoing CPU to new CPUs.
+  The new CPU is chosen from each process' current cpuset, which may be
+  a subset of all online CPUs.
+* All interrupts targeted to this CPU are migrated to a new CPU
+* timers are also migrated to a new CPU
+* Once all services are migrated, kernel calls an arch specific routine
+  ``__cpu_disable()`` to perform arch specific cleanup.
+
+Using the hotplug API
+---------------------
+It is possible to receive notifications once a CPU is offline or onlined. This
+might be important to certain drivers which need to perform some kind of setup
+or clean up functions based on the number of available CPUs: ::
+
+  #include <linux/cpuhotplug.h>
+
+  ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "X/Y:online",
+                          Y_online, Y_prepare_down);
+
+*X* is the subsystem and *Y* the particular driver. The *Y_online* callback
+will be invoked during registration on all online CPUs. If an error
+occurs during the online callback the *Y_prepare_down* callback will be
+invoked on all CPUs on which the online callback was previously invoked.
+After registration completed, the *Y_online* callback will be invoked
+once a CPU is brought online and *Y_prepare_down* will be invoked when a
+CPU is shutdown. All resources which were previously allocated in
+*Y_online* should be released in *Y_prepare_down*.
+The return value *ret* is negative if an error occurred during the
+registration process. Otherwise a positive value is returned which
+contains the allocated hotplug for dynamically allocated states
+(*CPUHP_AP_ONLINE_DYN*). It will return zero for predefined states.
+
+The callback can be remove by invoking ``cpuhp_remove_state()``. In case of a
+dynamically allocated state (*CPUHP_AP_ONLINE_DYN*) use the returned state.
+During the removal of a hotplug state the teardown callback will be invoked.
+
+Multiple instances
+~~~~~~~~~~~~~~~~~~
+If a driver has multiple instances and each instance needs to perform the
+callback independently then it is likely that a ''multi-state'' should be used.
+First a multi-state state needs to be registered: ::
+
+  ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "X/Y:online,
+                                Y_online, Y_prepare_down);
+  Y_hp_online = ret;
+
+The ``cpuhp_setup_state_multi()`` behaves similar to ``cpuhp_setup_state()``
+except it prepares the callbacks for a multi state and does not invoke
+the callbacks. This is a one time setup.
+Once a new instance is allocated, you need to register this new instance: ::
+
+  ret = cpuhp_state_add_instance(Y_hp_online, &d->node);
+
+This function will add this instance to your previously allocated
+*Y_hp_online* state and invoke the previously registered callback
+(*Y_online*) on all online CPUs. The *node* element is a ``struct
+hlist_node`` member of your per-instance data structure.
+
+On removal of the instance: ::
+  cpuhp_state_remove_instance(Y_hp_online, &d->node)
+
+should be invoked which will invoke the teardown callback on all online
+CPUs.
+
+Manual setup
+~~~~~~~~~~~~
+Usually it is handy to invoke setup and teardown callbacks on registration or
+removal of a state because usually the operation needs to performed once a CPU
+goes online (offline) and during initial setup (shutdown) of the driver. However
+each registration and removal function is also available with a ``_nocalls``
+suffix which does not invoke the provided callbacks if the invocation of the
+callbacks is not desired. During the manual setup (or teardown) the functions
+``get_online_cpus()`` and ``put_online_cpus()`` should be used to inhibit CPU
+hotplug operations.
+
+
+The ordering of the events
+--------------------------
+The hotplug states are defined in ``include/linux/cpuhotplug.h``:
+
+* The states *CPUHP_OFFLINE* … *CPUHP_AP_OFFLINE* are invoked before the
+  CPU is up.
+* The states *CPUHP_AP_OFFLINE* … *CPUHP_AP_ONLINE* are invoked
+  just the after the CPU has been brought up. The interrupts are off and
+  the scheduler is not yet active on this CPU. Starting with *CPUHP_AP_OFFLINE*
+  the callbacks are invoked on the target CPU.
+* The states between *CPUHP_AP_ONLINE_DYN* and *CPUHP_AP_ONLINE_DYN_END* are
+  reserved for the dynamic allocation.
+* The states are invoked in the reverse order on CPU shutdown starting with
+  *CPUHP_ONLINE* and stopping at *CPUHP_OFFLINE*. Here the callbacks are
+  invoked on the CPU that will be shutdown until *CPUHP_AP_OFFLINE*.
+
+A dynamically allocated state via *CPUHP_AP_ONLINE_DYN* is often enough.
+However if an earlier invocation during the bring up or shutdown is required
+then an explicit state should be acquired. An explicit state might also be
+required if the hotplug event requires specific ordering in respect to
+another hotplug event.
+
+Testing of hotplug states
+=========================
+One way to verify whether a custom state is working as expected or not is to
+shutdown a CPU and then put it online again. It is also possible to put the CPU
+to certain state (for instance *CPUHP_AP_ONLINE*) and then go back to
+*CPUHP_ONLINE*. This would simulate an error one state after *CPUHP_AP_ONLINE*
+which would lead to rollback to the online state.
+
+All registered states are enumerated in ``/sys/devices/system/cpu/hotplug/states``: ::
+
+ $ tail /sys/devices/system/cpu/hotplug/states
+ 138: mm/vmscan:online
+ 139: mm/vmstat:online
+ 140: lib/percpu_cnt:online
+ 141: acpi/cpu-drv:online
+ 142: base/cacheinfo:online
+ 143: virtio/net:online
+ 144: x86/mce:online
+ 145: printk:online
+ 168: sched:active
+ 169: online
+
+To rollback CPU4 to ``lib/percpu_cnt:online`` and back online just issue: ::
+
+  $ cat /sys/devices/system/cpu/cpu4/hotplug/state
+  169
+  $ echo 140 > /sys/devices/system/cpu/cpu4/hotplug/target
+  $ cat /sys/devices/system/cpu/cpu4/hotplug/state
+  140
+
+It is important to note that the teardown callbac of state 140 have been
+invoked. And now get back online: ::
+
+  $ echo 169 > /sys/devices/system/cpu/cpu4/hotplug/target
+  $ cat /sys/devices/system/cpu/cpu4/hotplug/state
+  169
+
+With trace events enabled, the individual steps are visible, too: ::
+
+  #  TASK-PID   CPU#    TIMESTAMP  FUNCTION
+  #     | |       |        |         |
+      bash-394  [001]  22.976: cpuhp_enter: cpu: 0004 target: 140 step: 169 (cpuhp_kick_ap_work)
+   cpuhp/4-31   [004]  22.977: cpuhp_enter: cpu: 0004 target: 140 step: 168 (sched_cpu_deactivate)
+   cpuhp/4-31   [004]  22.990: cpuhp_exit:  cpu: 0004  state: 168 step: 168 ret: 0
+   cpuhp/4-31   [004]  22.991: cpuhp_enter: cpu: 0004 target: 140 step: 144 (mce_cpu_pre_down)
+   cpuhp/4-31   [004]  22.992: cpuhp_exit:  cpu: 0004  state: 144 step: 144 ret: 0
+   cpuhp/4-31   [004]  22.993: cpuhp_multi_enter: cpu: 0004 target: 140 step: 143 (virtnet_cpu_down_prep)
+   cpuhp/4-31   [004]  22.994: cpuhp_exit:  cpu: 0004  state: 143 step: 143 ret: 0
+   cpuhp/4-31   [004]  22.995: cpuhp_enter: cpu: 0004 target: 140 step: 142 (cacheinfo_cpu_pre_down)
+   cpuhp/4-31   [004]  22.996: cpuhp_exit:  cpu: 0004  state: 142 step: 142 ret: 0
+      bash-394  [001]  22.997: cpuhp_exit:  cpu: 0004  state: 140 step: 169 ret: 0
+      bash-394  [005]  95.540: cpuhp_enter: cpu: 0004 target: 169 step: 140 (cpuhp_kick_ap_work)
+   cpuhp/4-31   [004]  95.541: cpuhp_enter: cpu: 0004 target: 169 step: 141 (acpi_soft_cpu_online)
+   cpuhp/4-31   [004]  95.542: cpuhp_exit:  cpu: 0004  state: 141 step: 141 ret: 0
+   cpuhp/4-31   [004]  95.543: cpuhp_enter: cpu: 0004 target: 169 step: 142 (cacheinfo_cpu_online)
+   cpuhp/4-31   [004]  95.544: cpuhp_exit:  cpu: 0004  state: 142 step: 142 ret: 0
+   cpuhp/4-31   [004]  95.545: cpuhp_multi_enter: cpu: 0004 target: 169 step: 143 (virtnet_cpu_online)
+   cpuhp/4-31   [004]  95.546: cpuhp_exit:  cpu: 0004  state: 143 step: 143 ret: 0
+   cpuhp/4-31   [004]  95.547: cpuhp_enter: cpu: 0004 target: 169 step: 144 (mce_cpu_online)
+   cpuhp/4-31   [004]  95.548: cpuhp_exit:  cpu: 0004  state: 144 step: 144 ret: 0
+   cpuhp/4-31   [004]  95.549: cpuhp_enter: cpu: 0004 target: 169 step: 145 (console_cpu_notify)
+   cpuhp/4-31   [004]  95.550: cpuhp_exit:  cpu: 0004  state: 145 step: 145 ret: 0
+   cpuhp/4-31   [004]  95.551: cpuhp_enter: cpu: 0004 target: 169 step: 168 (sched_cpu_activate)
+   cpuhp/4-31   [004]  95.552: cpuhp_exit:  cpu: 0004  state: 168 step: 168 ret: 0
+      bash-394  [005]  95.553: cpuhp_exit:  cpu: 0004  state: 169 step: 140 ret: 0
+
+As it an be seen, CPU4 went down until timestamp 22.996 and then back up until
+95.552. All invoked callbacks including their return codes are visible in the
+trace.
+
+Architecture's requirements
+===========================
+The following functions and configurations are required:
+
+``CONFIG_HOTPLUG_CPU``
+  This entry needs to be enabled in Kconfig
+
+``__cpu_up()``
+  Arch interface to bring up a CPU
+
+``__cpu_disable()``
+  Arch interface to shutdown a CPU, no more interrupts can be handled by the
+  kernel after the routine returns. This includes the shutdown of the timer.
+
+``__cpu_die()``
+  This actually supposed to ensure death of the CPU. Actually look at some
+  example code in other arch that implement CPU hotplug. The processor is taken
+  down from the ``idle()`` loop for that specific architecture. ``__cpu_die()``
+  typically waits for some per_cpu state to be set, to ensure the processor dead
+  routine is called to be sure positively.
+
+User Space Notification
+=======================
+After CPU successfully onlined or offline udev events are sent. A udev rule like: ::
+
+  SUBSYSTEM=="cpu", DRIVERS=="processor", DEVPATH=="/devices/system/cpu/*", RUN+="the_hotplug_receiver.sh"
+
+will receive all events. A script like: ::
+
+  #!/bin/sh
+
+  if [ "${ACTION}" = "offline" ]
+  then
+    echo "CPU ${DEVPATH##*/} offline"
+
+  elif [ "${ACTION}" = "online" ]
+  then
+    echo "CPU ${DEVPATH##*/} online"
+
+  fi
+
+can process the event further.
+
+Kernel Inline Documentations Reference
+======================================
+
+.. kernel-doc:: include/linux/cpuhotplug.h
diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index 2872ca1a52f1..0d93d8089136 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -13,6 +13,7 @@ Core utilities
    assoc_array
    atomic_ops
+   cpu_hotplug
    local_ops
    workqueue
diff --git a/Documentation/cpu-freq/user-guide.txt b/Documentation/cpu-freq/user-guide.txt
index 107f6fdd7d14..391da64e9492 100644
--- a/Documentation/cpu-freq/user-guide.txt
+++ b/Documentation/cpu-freq/user-guide.txt
@@ -82,7 +82,9 @@ UltraSPARC-III
 -------
 Several "PowerBook" and "iBook2" notebooks are supported.
-
+The following POWER processors are supported in powernv mode:
+POWER8
+POWER9
 1.5 SuperH
 ----------
diff --git a/Documentation/cpu-hotplug.txt b/Documentation/cpu-hotplug.txt
deleted file mode 100644
index d02e8a451872..000000000000
--- a/Documentation/cpu-hotplug.txt
+++ /dev/null
@@ -1,452 +0,0 @@
-                CPU hotplug Support in Linux(tm) Kernel
-
-                Maintainers:
-                CPU Hotplug Core:
-                        Rusty Russell <rusty@rustcorp.com.au>
-                        Srivatsa Vaddagiri <vatsa@in.ibm.com>
-                i386:
-                        Zwane Mwaikambo <zwanem@gmail.com>
-                ppc64:
-                        Nathan Lynch <nathanl@austin.ibm.com>
-                        Joel Schopp <jschopp@austin.ibm.com>
-                ia64/x86_64:
-                        Ashok Raj <ashok.raj@intel.com>
-                s390:
-                        Heiko Carstens <heiko.carstens@de.ibm.com>
-
-Authors: Ashok Raj <ashok.raj@intel.com>
-Lots of feedback: Nathan Lynch <nathanl@austin.ibm.com>,
-             Joel Schopp <jschopp@austin.ibm.com>
-
-Introduction
-
-Modern advances in system architectures have introduced advanced error
-reporting and correction capabilities in processors. CPU architectures permit
-partitioning support, where compute resources of a single CPU could be made
-available to virtual machine environments. There are couple OEMS that
-support NUMA hardware which are hot pluggable as well, where physical
-node insertion and removal require support for CPU hotplug.
-
-Such advances require CPUs available to a kernel to be removed either for
-provisioning reasons, or for RAS purposes to keep an offending CPU off
-system execution path. Hence the need for CPU hotplug support in the
-Linux kernel.
-
-A more novel use of CPU-hotplug support is its use today in suspend
-resume support for SMP. Dual-core and HT support makes even
-a laptop run SMP kernels which didn't support these methods. SMP support
-for suspend/resume is a work in progress.
-
-General Stuff about CPU Hotplug
---------------------------------
-
-Command Line Switches
----------------------
-maxcpus=n    Restrict boot time cpus to n. Say if you have 4 cpus, using
-             maxcpus=2 will only boot 2. You can choose to bring the
-             other cpus later online, read FAQ's for more info.
-
-additional_cpus=n (*)   Use this to limit hotpluggable cpus. This option sets
-                        cpu_possible_mask = cpu_present_mask + additional_cpus
-
-cede_offline={"off","on"}  Use this option to disable/enable putting offlined
-                           processors to an extended H_CEDE state on
-                           supported pseries platforms.
-                           If nothing is specified,
-                           cede_offline is set to "on".
-
-(*) Option valid only for following architectures
-- ia64
-
-ia64 uses the number of disabled local apics in ACPI tables MADT to
-determine the number of potentially hot-pluggable cpus. The implementation
-should only rely on this to count the # of cpus, but *MUST* not rely
-on the apicid values in those tables for disabled apics. In the event
-BIOS doesn't mark such hot-pluggable cpus as disabled entries, one could
-use this parameter "additional_cpus=x" to represent those cpus in the
-cpu_possible_mask.
-
-possible_cpus=n         [s390,x86_64] use this to set hotpluggable cpus.
-                        This option sets possible_cpus bits in
-                        cpu_possible_mask. Thus keeping the numbers of bits set
-                        constant even if the machine gets rebooted.
-
-CPU maps and such
------------------
-[More on cpumaps and primitive to manipulate, please check
-include/linux/cpumask.h that has more descriptive text.]
-
-cpu_possible_mask: Bitmap of possible CPUs that can ever be available in the
-system. This is used to allocate some boot time memory for per_cpu variables
-that aren't designed to grow/shrink as CPUs are made available or removed.
-Once set during boot time discovery phase, the map is static, i.e no bits
-are added or removed anytime. Trimming it accurately for your system needs
-upfront can save some boot time memory. See below for how we use heuristics
-in x86_64 case to keep this under check.
-
-cpu_online_mask: Bitmap of all CPUs currently online. It's set in __cpu_up()
-after a CPU is available for kernel scheduling and ready to receive
-interrupts from devices. It's cleared when a CPU is brought down using
-__cpu_disable(), before which all OS services including interrupts are
-migrated to another target CPU.
-
-cpu_present_mask: Bitmap of CPUs currently present in the system. Not all
-of them may be online. When physical hotplug is processed by the relevant
-subsystem (e.g ACPI) can change and new bit either be added or removed
-from the map depending on the event is hot-add/hot-remove. There are currently
-no locking rules as of now. Typical usage is to init topology during boot,
-at which time hotplug is disabled.
-
-You really dont need to manipulate any of the system cpu maps. They should
-be read-only for most use. When setting up per-cpu resources almost always use
-cpu_possible_mask/for_each_possible_cpu() to iterate.
-
-Never use anything other than cpumask_t to represent bitmap of CPUs.
-
-        #include <linux/cpumask.h>
-
-        for_each_possible_cpu     - Iterate over cpu_possible_mask
-        for_each_online_cpu       - Iterate over cpu_online_mask
-        for_each_present_cpu      - Iterate over cpu_present_mask
-        for_each_cpu(x,mask)      - Iterate over some random collection of cpu mask.
-
-        #include <linux/cpu.h>
-        get_online_cpus() and put_online_cpus():
-
-The above calls are used to inhibit cpu hotplug operations. While the
-cpu_hotplug.refcount is non zero, the cpu_online_mask will not change.
-If you merely need to avoid cpus going away, you could also use
-preempt_disable() and preempt_enable() for those sections.
-Just remember the critical section cannot call any
-function that can sleep or schedule this process away. The preempt_disable()
-will work as long as stop_machine_run() is used to take a cpu down.
-
-CPU Hotplug - Frequently Asked Questions.
-
-Q: How to enable my kernel to support CPU hotplug?
-A: When doing make defconfig, Enable CPU hotplug support
-
-   "Processor type and Features" -> Support for Hotpluggable CPUs
-
-Make sure that you have CONFIG_SMP turned on as well.
-
-You would need to enable CONFIG_HOTPLUG_CPU for SMP suspend/resume support
-as well.
-
-Q: What architectures support CPU hotplug?
-A: As of 2.6.14, the following architectures support CPU hotplug.
-
-i386 (Intel), ppc, ppc64, parisc, s390, ia64 and x86_64
-
-Q: How to test if hotplug is supported on the newly built kernel?
-A: You should now notice an entry in sysfs.
-
-Check if sysfs is mounted, using the "mount" command. You should notice
-an entry as shown below in the output.
-
-        ....
-        none on /sys type sysfs (rw)
-        ....
-
-If this is not mounted, do the following.
-
-         #mkdir /sys
-         #mount -t sysfs sys /sys
-
-Now you should see entries for all present cpu, the following is an example
-in a 8-way system.
-
-        #pwd
-        #/sys/devices/system/cpu
-        #ls -l
-        total 0
-        drwxr-xr-x  10 root root 0 Sep 19 07:44 .
-        drwxr-xr-x  13 root root 0 Sep 19 07:45 ..
-        drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu0
-        drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu1
-        drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu2
-        drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu3
-        drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu4
-        drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu5
-        drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu6
-        drwxr-xr-x   3 root root 0 Sep 19 07:48 cpu7
-
-Under each directory you would find an "online" file which is the control
-file to logically online/offline a processor.
-
-Q: Does hot-add/hot-remove refer to physical add/remove of cpus?
-A: The usage of hot-add/remove may not be very consistently used in the code.
-CONFIG_HOTPLUG_CPU enables logical online/offline capability in the kernel.
-To support physical addition/removal, one would need some BIOS hooks and
-the platform should have something like an attention button in PCI hotplug.
-CONFIG_ACPI_HOTPLUG_CPU enables ACPI support for physical add/remove of CPUs.
-
-Q: How do I logically offline a CPU?
-A: Do the following.
-
-        #echo 0 > /sys/devices/system/cpu/cpuX/online
-
-Once the logical offline is successful, check
-
-        #cat /proc/interrupts
-
-You should now not see the CPU that you removed. Also online file will report
-the state as 0 when a CPU is offline and 1 when it's online.
-
-        #To display the current cpu state.
-        #cat /sys/devices/system/cpu/cpuX/online
-
-Q: Why can't I remove CPU0 on some systems?
-A: Some architectures may have some special dependency on a certain CPU.
-
-For e.g in IA64 platforms we have ability to send platform interrupts to the
-OS. a.k.a Corrected Platform Error Interrupts (CPEI). In current ACPI
-specifications, we didn't have a way to change the target CPU. Hence if the
-current ACPI version doesn't support such re-direction, we disable that CPU
-by making it not-removable.
-
-In such cases you will also notice that the online file is missing under cpu0.
-
-Q: Is CPU0 removable on X86?
-A: Yes. If kernel is compiled with CONFIG_BOOTPARAM_HOTPLUG_CPU0=y, CPU0 is
-removable by default. Otherwise, CPU0 is also removable by kernel option
-cpu0_hotplug.
-
-But some features depend on CPU0. Two known dependencies are:
-
-1. Resume from hibernate/suspend depends on CPU0. Hibernate/suspend will fail if
-CPU0 is offline and you need to online CPU0 before hibernate/suspend can
-continue.
-2. PIC interrupts also depend on CPU0. CPU0 can't be removed if a PIC interrupt
-is detected.
-
-It's said poweroff/reboot may depend on CPU0 on some machines although I haven't
-seen any poweroff/reboot failure so far after CPU0 is offline on a few tested
-machines.
-
-Please let me know if you know or see any other dependencies of CPU0.
-
-If the dependencies are under your control, you can turn on CPU0 hotplug feature
-either by CONFIG_BOOTPARAM_HOTPLUG_CPU0 or by kernel parameter cpu0_hotplug.
-
---Fenghua Yu <fenghua.yu@intel.com>
-
-Q: How do I find out if a particular CPU is not removable?
-A: Depending on the implementation, some architectures may show this by the
-absence of the "online" file. This is done if it can be determined ahead of
-time that this CPU cannot be removed.
-
-In some situations, this can be a run time check, i.e if you try to remove the
-last CPU, this will not be permitted. You can find such failures by
-investigating the return value of the "echo" command.
-
-Q: What happens when a CPU is being logically offlined?
-A: The following happen, listed in no particular order :-)
-
-- A notification is sent to in-kernel registered modules by sending an event
-  CPU_DOWN_PREPARE or CPU_DOWN_PREPARE_FROZEN, depending on whether or not the
-  CPU is being offlined while tasks are frozen due to a suspend operation in
-  progress
-- All processes are migrated away from this outgoing CPU to new CPUs.
-  The new CPU is chosen from each process' current cpuset, which may be
-  a subset of all online CPUs.
-- All interrupts targeted to this CPU are migrated to a new CPU
-- timers/bottom half/task lets are also migrated to a new CPU
-- Once all services are migrated, kernel calls an arch specific routine
-  __cpu_disable() to perform arch specific cleanup.
-- Once this is successful, an event for successful cleanup is sent by an event
-  CPU_DEAD (or CPU_DEAD_FROZEN if tasks are frozen due to a suspend while the
-  CPU is being offlined).
-
-  "It is expected that each service cleans up when the CPU_DOWN_PREPARE
-  notifier is called, when CPU_DEAD is called it's expected there is nothing
-  running on behalf of this CPU that was offlined"
-
-Q: If I have some kernel code that needs to be aware of CPU arrival and
-   departure, how to i arrange for proper notification?
-A: This is what you would need in your kernel code to receive notifications.
-
-        #include <linux/cpu.h>
-        static int foobar_cpu_callback(struct notifier_block *nfb,
-                                       unsigned long action, void *hcpu)
-        {
-                unsigned int cpu = (unsigned long)hcpu;
-
-                switch (action) {
-                case CPU_ONLINE:
-                case CPU_ONLINE_FROZEN:
-                        foobar_online_action(cpu);
-                        break;
-                case CPU_DEAD:
-                case CPU_DEAD_FROZEN:
-                        foobar_dead_action(cpu);
-                        break;
-                }
-                return NOTIFY_OK;
-        }
-
-        static struct notifier_block foobar_cpu_notifier =
-        {
-           .notifier_call = foobar_cpu_callback,
-        };
-
-You need to call register_cpu_notifier() from your init function.
-Init functions could be of two types:
-1. early init (init function called when only the boot processor is online).
-2. late init (init function called _after_ all the CPUs are online).
-
-For the first case, you should add the following to your init function
-
-        register_cpu_notifier(&foobar_cpu_notifier);
-
-For the second case, you should add the following to your init function
-
-        register_hotcpu_notifier(&foobar_cpu_notifier);
-
-You can fail PREPARE notifiers if something doesn't work to prepare resources.
-This will stop the activity and send a following CANCELED event back.
-
-CPU_DEAD should not be failed, its just a goodness indication, but bad
-things will happen if a notifier in path sent a BAD notify code.
-
-Q: I don't see my action being called for all CPUs already up and running?
-A: Yes, CPU notifiers are called only when new CPUs are on-lined or offlined.
-   If you need to perform some action for each CPU already in the system, then
-   do this:
-
-        for_each_online_cpu(i) {
-                foobar_cpu_callback(&foobar_cpu_notifier, CPU_UP_PREPARE, i);
-                foobar_cpu_callback(&foobar_cpu_notifier, CPU_ONLINE, i);
-        }
-
-   However, if you want to register a hotplug callback, as well as perform
-   some initialization for CPUs that are already online, then do this:
-
-   Version 1: (Correct)
-   ---------
-
-        cpu_notifier_register_begin();
-
-                for_each_online_cpu(i) {
-                        foobar_cpu_callback(&foobar_cpu_notifier,
-                                            CPU_UP_PREPARE, i);
-                        foobar_cpu_callback(&foobar_cpu_notifier,
-                                            CPU_ONLINE, i);
-                }
-
-        /* Note the use of the double underscored version of the API */
-        __register_cpu_notifier(&foobar_cpu_notifier);
-
-        cpu_notifier_register_done();
-
-   Note that the following code is *NOT* the right way to achieve this,
-   because it is prone to an ABBA deadlock between the cpu_add_remove_lock
-   and the cpu_hotplug.lock.
-
-   Version 2: (Wrong!)
-   ---------
-
-        get_online_cpus();
-
-                for_each_online_cpu(i) {
-                        foobar_cpu_callback(&foobar_cpu_notifier,
-                                            CPU_UP_PREPARE, i);
-                        foobar_cpu_callback(&foobar_cpu_notifier,
-                                            CPU_ONLINE, i);
-                }
-
-        register_cpu_notifier(&foobar_cpu_notifier);
-
-        put_online_cpus();
-
-    So always use the first version shown above when you want to register
-    callbacks as well as initialize the already online CPUs.
-
-
-Q: If I would like to develop CPU hotplug support for a new architecture,
-   what do I need at a minimum?
-A: The following are what is required for CPU hotplug infrastructure to work
-   correctly.
-
-    - Make sure you have an entry in Kconfig to enable CONFIG_HOTPLUG_CPU
-    - __cpu_up()        - Arch interface to bring up a CPU
-    - __cpu_disable()   - Arch interface to shutdown a CPU, no more interrupts
-                          can be handled by the kernel after the routine
-                          returns. Including local APIC timers etc are
-                          shutdown.
-     - __cpu_die()      - This actually supposed to ensure death of the CPU.
-                          Actually look at some example code in other arch
-                          that implement CPU hotplug. The processor is taken
-                          down from the idle() loop for that specific
-                          architecture. __cpu_die() typically waits for some
-                          per_cpu state to be set, to ensure the processor
-                          dead routine is called to be sure positively.
-
-Q: I need to ensure that a particular CPU is not removed when there is some
-   work specific to this CPU in progress.
-A: There are two ways. If your code can be run in interrupt context, use
-   smp_call_function_single(), otherwise use work_on_cpu(). Note that
-   work_on_cpu() is slow, and can fail due to out of memory:
-
-        int my_func_on_cpu(int cpu)
-        {
-                int err;
-                get_online_cpus();
-                if (!cpu_online(cpu))
-                        err = -EINVAL;
-                else
-#if NEEDS_BLOCKING
-                        err = work_on_cpu(cpu, __my_func_on_cpu, NULL);
-#else
-                        smp_call_function_single(cpu, __my_func_on_cpu, &err,
-                                                 true);
-#endif
-                put_online_cpus();
-                return err;
-        }
-
-Q: How do we determine how many CPUs are available for hotplug.
-A: There is no clear spec defined way from ACPI that can give us that
-   information today. Based on some input from Natalie of Unisys,
-   that the ACPI MADT (Multiple APIC Description Tables) marks those possible
-   CPUs in a system with disabled status.
-
-   Andi implemented some simple heuristics that count the number of disabled
-   CPUs in MADT as hotpluggable CPUS. In the case there are no disabled CPUS
-   we assume 1/2 the number of CPUs currently present can be hotplugged.
-
-   Caveat: ACPI MADT can only provide 256 entries in systems with only ACPI 2.0c
-   or earlier ACPI version supported, because the apicid field in MADT is only
-   8 bits. From ACPI 3.0, this limitation was removed since the apicid field
-   was extended to 32 bits with x2APIC introduced.
-
-User Space Notification
-
-Hotplug support for devices is common in Linux today. Its being used today to
-support automatic configuration of network, usb and pci devices. A hotplug
-event can be used to invoke an agent script to perform the configuration task.
-
-You can add /etc/hotplug/cpu.agent to handle hotplug notification user space
-scripts.
-
-        #!/bin/bash
-        # $Id: cpu.agent
-        # Kernel hotplug params include:
-        #ACTION=%s [online or offline]
-        #DEVPATH=%s
-        #
-        cd /etc/hotplug
-        . ./hotplug.functions
-
-        case $ACTION in
-                online)
-                        echo `date` ":cpu.agent" add cpu >> /tmp/hotplug.txt
-                        ;;
-                offline)
-                        echo `date` ":cpu.agent" remove cpu >>/tmp/hotplug.txt
-                        ;;
-                *)
-                        debug_mesg CPU $ACTION event not supported
-        exit 1
-        ;;
-        esac
diff --git a/Documentation/dev-tools/sparse.rst b/Documentation/dev-tools/sparse.rst
index 78aa00a604a0..ffdcc97f6f5a 100644
--- a/Documentation/dev-tools/sparse.rst
+++ b/Documentation/dev-tools/sparse.rst
@@ -103,3 +103,9 @@ have already built it.
 The optional make variable CF can be used to pass arguments to sparse. The build system passes -Wbitwise to sparse automatically.
+
+Checking RCU annotations
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+RCU annotations are not checked by default. To enable RCU annotation
+checks, include -DCONFIG_SPARSE_RCU_POINTER in your CF flags.
diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index a23edccd2059..77b92221f951 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -116,9 +116,11 @@ crc32table.h*
 cscope.*
 defkeymap.c
 devlist.h*
+devicetable-offsets.h
 dnotify_test
 docproc
 dslm
+dtc
 elf2ecoff
 elfconfig.h*
 evergreen_reg_safe.h
@@ -153,8 +155,8 @@ keywords.c
 ksym.c*
 ksym.h*
 kxgettext
-lex.c
-lex.*.c
+*lex.c
+*lex.*.c
 linux
 logo_*.c
 logo_*_clut224.c
@@ -215,6 +217,7 @@ series
 setup
 setup.bin
 setup.elf
+sortextable
 sImage
 sm_tbl*
 split-include
diff --git a/Documentation/driver-api/device-io.rst b/Documentation/driver-api/device-io.rst
new file mode 100644
index 000000000000..b00b23903078
--- /dev/null
+++ b/Documentation/driver-api/device-io.rst
@@ -0,0 +1,201 @@
+.. Copyright 2001 Matthew Wilcox
+..
+..     This documentation is free software; you can redistribute
+..     it and/or modify it under the terms of the GNU General Public
+..     License as published by the Free Software Foundation; either
+..     version 2 of the License, or (at your option) any later
+..     version.
+
+===============================
+Bus-Independent Device Accesses
+===============================
+
+:Author: Matthew Wilcox
+:Author: Alan Cox
+
+Introduction
+============
+
+Linux provides an API which abstracts performing IO across all busses
+and devices, allowing device drivers to be written independently of bus
+type.
+
+Memory Mapped IO
+================
+
+Getting Access to the Device
+----------------------------
+
+The most widely supported form of IO is memory mapped IO. That is, a
+part of the CPU's address space is interpreted not as accesses to
+memory, but as accesses to a device. Some architectures define devices
+to be at a fixed address, but most have some method of discovering
+devices. The PCI bus walk is a good example of such a scheme. This
+document does not cover how to receive such an address, but assumes you
+are starting with one. Physical addresses are of type unsigned long.
+
+This address should not be used directly. Instead, to get an address
+suitable for passing to the accessor functions described below, you
+should call :c:func:`ioremap()`. An address suitable for accessing
+the device will be returned to you.
+
+After you've finished using the device (say, in your module's exit
+routine), call :c:func:`iounmap()` in order to return the address
+space to the kernel. Most architectures allocate new address space each
+time you call :c:func:`ioremap()`, and they can run out unless you
+call :c:func:`iounmap()`.
+
+Accessing the device
+--------------------
+
+The part of the interface most used by drivers is reading and writing
+memory-mapped registers on the device. Linux provides interfaces to read
+and write 8-bit, 16-bit, 32-bit and 64-bit quantities. Due to a
+historical accident, these are named byte, word, long and quad accesses.
+Both read and write accesses are supported; there is no prefetch support
+at this time.
+
+The functions are named readb(), readw(), readl(), readq(),
+readb_relaxed(), readw_relaxed(), readl_relaxed(), readq_relaxed(),
+writeb(), writew(), writel() and writeq().
+
+Some devices (such as framebuffers) would like to use larger transfers than
+8 bytes at a time. For these devices, the :c:func:`memcpy_toio()`,
+:c:func:`memcpy_fromio()` and :c:func:`memset_io()` functions are
+provided. Do not use memset or memcpy on IO addresses; they are not
+guaranteed to copy data in order.
+
+The read and write functions are defined to be ordered. That is the
+compiler is not permitted to reorder the I/O sequence. When the ordering
+can be compiler optimised, you can use __readb() and friends to
+indicate the relaxed ordering. Use this with care.
+
+While the basic functions are defined to be synchronous with respect to
+each other and ordered with respect to each other the busses the devices
+sit on may themselves have asynchronicity. In particular many authors
+are burned by the fact that PCI bus writes are posted asynchronously. A
+driver author must issue a read from the same device to ensure that
+writes have occurred in the specific cases the author cares. This kind
+of property cannot be hidden from driver writers in the API. In some
+cases, the read used to flush the device may be expected to fail (if the
+card is resetting, for example). In that case, the read should be done
+from config space, which is guaranteed to soft-fail if the card doesn't
+respond.
+
+The following is an example of flushing a write to a device when the
+driver would like to ensure the write's effects are visible prior to
+continuing execution::
+
+    static inline void
+    qla1280_disable_intrs(struct scsi_qla_host *ha)
+    {
+        struct device_reg *reg;
+
+        reg = ha->iobase;
+        /* disable risc and host interrupts */
+        WRT_REG_WORD(&reg->ictrl, 0);
+        /*
+         * The following read will ensure that the above write
+         * has been received by the device before we return from this
+         * function.
+         */
+        RD_REG_WORD(&reg->ictrl);
+        ha->flags.ints_enabled = 0;
+    }
+
+In addition to write posting, on some large multiprocessing systems
+(e.g. SGI Challenge, Origin and Altix machines) posted writes won't be
+strongly ordered coming from different CPUs. Thus it's important to
+properly protect parts of your driver that do memory-mapped writes with
+locks and use the :c:func:`mmiowb()` to make sure they arrive in the
+order intended. Issuing a regular readX() will also ensure write ordering,
+but should only be used when the
+driver has to be sure that the write has actually arrived at the device
+(not that it's simply ordered with respect to other writes), since a
+full readX() is a relatively expensive operation.
+
+Generally, one should use :c:func:`mmiowb()` prior to releasing a spinlock
+that protects regions using :c:func:`writeb()` or similar functions that
+aren't surrounded by readb() calls, which will ensure ordering
+and flushing. The following pseudocode illustrates what might occur if
+write ordering isn't guaranteed via :c:func:`mmiowb()` or one of the
+readX() functions::
+
+    CPU A:  spin_lock_irqsave(&dev_lock, flags)
+    CPU A:  ...
+    CPU A:  writel(newval, ring_ptr);
+    CPU A:  spin_unlock_irqrestore(&dev_lock, flags)
+            ...
+    CPU B:  spin_lock_irqsave(&dev_lock, flags)
+    CPU B:  writel(newval2, ring_ptr);
+    CPU B:  ...
+    CPU B:  spin_unlock_irqrestore(&dev_lock, flags)
+
+In the case above, newval2 could be written to ring_ptr before newval.
+Fixing it is easy though::
+
+    CPU A:  spin_lock_irqsave(&dev_lock, flags)
+    CPU A:  ...
+    CPU A:  writel(newval, ring_ptr);
+    CPU A:  mmiowb(); /* ensure no other writes beat us to the device */
+    CPU A:  spin_unlock_irqrestore(&dev_lock, flags)
+            ...
+    CPU B:  spin_lock_irqsave(&dev_lock, flags)
+    CPU B:  writel(newval2, ring_ptr);
+    CPU B:  ...
+    CPU B:  mmiowb();
+    CPU B:  spin_unlock_irqrestore(&dev_lock, flags)
+
+See tg3.c for a real world example of how to use :c:func:`mmiowb()`
+
+PCI ordering rules also guarantee that PIO read responses arrive after any
+outstanding DMA writes from that bus, since for some devices the result of
+a readb() call may signal to the driver that a DMA transaction is
+complete. In many cases, however, the driver may want to indicate that the
+next readb() call has no relation to any previous DMA writes
+performed by the device. The driver can use readb_relaxed() for
+these cases, although only some platforms will honor the relaxed
+semantics. Using the relaxed read functions will provide significant
+performance benefits on platforms that support it. The qla2xxx driver
+provides examples of how to use readX_relaxed(). In many cases, a majority
+of the driver's readX() calls can safely be converted to readX_relaxed()
+calls, since only a few will indicate or depend on DMA completion.
+
+Port Space Accesses
+===================
+
+Port Space Explained
+--------------------
+
+Another form of IO commonly supported is Port Space. This is a range of
+addresses separate to the normal memory address space. Access to these
+addresses is generally not as fast as accesses to the memory mapped
+addresses, and it also has a potentially smaller address space.
+
+Unlike memory mapped IO, no preparation is required to access port
+space.
+
+Accessing Port Space
+--------------------
+
+Accesses to this space are provided through a set of functions which
+allow 8-bit, 16-bit and 32-bit accesses; also known as byte, word and
+long. These functions are :c:func:`inb()`, :c:func:`inw()`,
+:c:func:`inl()`, :c:func:`outb()`, :c:func:`outw()` and
+:c:func:`outl()`.
+
+Some variants are provided for these functions. Some devices require
+that accesses to their ports are slowed down. This functionality is
+provided by appending a ``_p`` to the end of the function.
+There are also equivalents to memcpy. The :c:func:`ins()` and
+:c:func:`outs()` functions copy bytes, words or longs to the given
+port.
+
+Public Functions Provided
+=========================
+
+.. kernel-doc:: arch/x86/include/asm/io.h
+   :internal:
+
+.. kernel-doc:: lib/pci_iomap.c
+   :export:
diff --git a/Documentation/driver-api/device_link.rst b/Documentation/driver-api/device_link.rst
index 5f5713448703..70e328e16aad 100644
--- a/Documentation/driver-api/device_link.rst
+++ b/Documentation/driver-api/device_link.rst
@@ -1,3 +1,6 @@
+.. |struct dev_pm_domain| replace:: :c:type:`struct dev_pm_domain <dev_pm_domain>`
+.. |struct generic_pm_domain| replace:: :c:type:`struct generic_pm_domain <generic_pm_domain>`
+
 ============
 Device links
 ============
@@ -120,12 +123,11 @@ Examples
   is the same as if the MMU was the parent of the master device.
   The fact that both devices share the same power domain would normally
-  suggest usage of a :c:type:`struct dev_pm_domain` or :c:type:`struct
-  generic_pm_domain`, however these are not independent devices that
-  happen to share a power switch, but rather the MMU device serves the
-  busmaster device and is useless without it. A device link creates a
-  synthetic hierarchical relationship between the devices and is thus
-  more apt.
+  suggest usage of a |struct dev_pm_domain| or |struct generic_pm_domain|,
+  however these are not independent devices that happen to share a power
+  switch, but rather the MMU device serves the busmaster device and is
+  useless without it. A device link creates a synthetic hierarchical
+  relationship between the devices and is thus more apt.
 * A Thunderbolt host controller comprises a number of PCIe hotplug ports
   and an NHI device to manage the PCIe switch. On resume from system sleep,
@@ -157,7 +159,7 @@ Examples
 Alternatives
 ============
-* A :c:type:`struct dev_pm_domain` can be used to override the bus,
+* A |struct dev_pm_domain| can be used to override the bus,
   class or device type callbacks. It is intended for devices sharing
   a single on/off switch, however it does not guarantee a specific
   suspend/resume ordering, this needs to be implemented separately.
@@ -166,7 +168,7 @@ Alternatives
   suspended. Furthermore it cannot be used to enforce a specific shutdown
   ordering or a driver presence dependency.
-* A :c:type:`struct generic_pm_domain` is a lot more heavyweight than a
+* A |struct generic_pm_domain| is a lot more heavyweight than a
   device link and does not allow for shutdown ordering or driver presence
   dependencies. It also cannot be used on ACPI systems.
diff --git a/Documentation/driver-api/iio/buffers.rst b/Documentation/driver-api/iio/buffers.rst new file mode 100644 index 000000000000..02c99a6bee18 --- /dev/null +++ b/Documentation/driver-api/iio/buffers.rst @@ -0,0 +1,125 @@ +======= +Buffers +======= + +* struct :c:type:`iio_buffer` — general buffer structure +* :c:func:`iio_validate_scan_mask_onehot` — Validates that exactly one channel + is selected +* :c:func:`iio_buffer_get` — Grab a reference to the buffer +* :c:func:`iio_buffer_put` — Release the reference to the buffer + +The Industrial I/O core offers a way for continuous data capture based on a +trigger source. Multiple data channels can be read at once from the +:file:`/dev/iio:device{X}` character device node, thus reducing the CPU load. + +IIO buffer sysfs interface +========================== +An IIO buffer has an associated attributes directory under +:file:`/sys/bus/iio/iio:device{X}/buffer/*`. Here are some of the existing +attributes: + +* :file:`length`, the total number of data samples (capacity) that can be + stored by the buffer. +* :file:`enable`, activate buffer capture. + +IIO buffer setup +================ + +The meta information associated with a channel reading placed in a buffer is +called a scan element. The important bits configuring scan elements are +exposed to userspace applications via the +:file:`/sys/bus/iio/iio:device{X}/scan_elements/*` directory. This directory contains +attributes of the following form: + +* :file:`enable`, used for enabling a channel. If and only if its attribute + is non *zero*, then a triggered capture will contain data samples for this + channel. +* :file:`type`, description of the scan element data storage within the buffer + and hence the form in which it is read from user space. + Format is [be|le]:[s|u]bits/storagebitsXrepeat[>>shift]. + * *be* or *le*, specifies big or little endian. + * *s* or *u*, specifies if signed (2's complement) or unsigned. + * *bits*, is the number of valid data bits. 
+ * *storagebits*, is the number of bits (after padding) that it occupies in the + buffer. + * *shift*, if specified, is the shift that needs to be applied prior to + masking out unused bits. + * *repeat*, specifies the number of bits/storagebits repetitions. When the + repeat element is 0 or 1, then the repeat value is omitted. + +For example, a driver for a 3-axis accelerometer with 12 bit resolution where +data is stored in two 8-bits registers as follows:: + + 7 6 5 4 3 2 1 0 + +---+---+---+---+---+---+---+---+ + |D3 |D2 |D1 |D0 | X | X | X | X | (LOW byte, address 0x06) + +---+---+---+---+---+---+---+---+ + + 7 6 5 4 3 2 1 0 + +---+---+---+---+---+---+---+---+ + |D11|D10|D9 |D8 |D7 |D6 |D5 |D4 | (HIGH byte, address 0x07) + +---+---+---+---+---+---+---+---+ + +will have the following scan element type for each axis:: + + $ cat /sys/bus/iio/devices/iio:device0/scan_elements/in_accel_y_type + le:s12/16>>4 + +A user space application will interpret data samples read from the buffer as +two byte little endian signed data, that needs a 4 bits right shift before +masking out the 12 valid bits of data. 
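As an illustration of the interpretation described above, here is a minimal user-space sketch in C that decodes one sample of type ``le:s12/16>>4``. The helper name ``decode_le_s12_16_shift4`` is hypothetical (it is not part of the IIO ABI or of any library), and the sketch assumes the two bytes of the sample have already been read from the character device into ``raw``.

```c
#include <stdint.h>

/*
 * Decode one sample whose scan element type is le:s12/16>>4.
 * raw holds the two storage bytes in the order they appear in the buffer.
 * Hypothetical helper, for illustration only.
 */
static int decode_le_s12_16_shift4(const uint8_t raw[2])
{
	/* assemble the 16 storage bits; little endian means low byte first */
	unsigned int v = (unsigned int)raw[0] | ((unsigned int)raw[1] << 8);

	/* apply the 4-bit right shift, then mask out the 12 valid bits */
	int bits = (int)((v >> 4) & 0x0fffu);

	/* sign-extend from bit 11, since the data is 2's complement */
	return (bits ^ 0x800) - 0x800;
}
```

With this layout, the bytes ``0xF0 0xFF`` decode to -1 and the bytes ``0xF0 0x7F`` decode to 2047, the largest positive 12-bit value. The four low bits of the first byte are padding and do not affect the result.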
+ +For implementing buffer support a driver should initialize the following +fields in the iio_chan_spec definition:: + + struct iio_chan_spec { + /* other members */ + int scan_index; + struct { + char sign; + u8 realbits; + u8 storagebits; + u8 shift; + u8 repeat; + enum iio_endian endianness; + } scan_type; + }; + +The driver implementing the accelerometer described above will have the +following channel definition:: + + struct iio_chan_spec accel_channels[] = { + { + .type = IIO_ACCEL, + .modified = 1, + .channel2 = IIO_MOD_X, + /* other stuff here */ + .scan_index = 0, + .scan_type = { + .sign = 's', + .realbits = 12, + .storagebits = 16, + .shift = 4, + .endianness = IIO_LE, + }, + }, + /* similar for Y (with channel2 = IIO_MOD_Y, scan_index = 1) + * and Z (with channel2 = IIO_MOD_Z, scan_index = 2) axis + */ + }; + +Here **scan_index** defines the order in which the enabled channels are placed +inside the buffer. Channels with a lower **scan_index** will be placed before +channels with a higher index. Each channel needs to have a unique +**scan_index**. + +Setting **scan_index** to -1 can be used to indicate that the specific channel +does not support buffered capture. In this case no entries will be created for +the channel in the scan_elements directory. + +More details +============ +.. kernel-doc:: include/linux/iio/buffer.h +.. kernel-doc:: drivers/iio/industrialio-buffer.c + :export: + diff --git a/Documentation/driver-api/iio/core.rst b/Documentation/driver-api/iio/core.rst new file mode 100644 index 000000000000..9a34ae03b679 --- /dev/null +++ b/Documentation/driver-api/iio/core.rst @@ -0,0 +1,182 @@ +============= +Core elements +============= + +The Industrial I/O core offers a unified framework for writing drivers for +many different types of embedded sensors, and a standard interface to user space +applications manipulating sensors. 
The implementation can be found under +:file:`drivers/iio/industrialio-*`. + +Industrial I/O Devices +---------------------- + +* struct :c:type:`iio_dev` - industrial I/O device +* :c:func:`iio_device_alloc()` - allocate an :c:type:`iio_dev` from a driver +* :c:func:`iio_device_free()` - free an :c:type:`iio_dev` from a driver +* :c:func:`iio_device_register()` - register a device with the IIO subsystem +* :c:func:`iio_device_unregister()` - unregister a device from the IIO + subsystem + +An IIO device usually corresponds to a single hardware sensor and it +provides all the information needed by a driver handling a device. +Let's first have a look at the functionality embedded in an IIO device, +then we will show how a device driver makes use of an IIO device. + +There are two ways for a user space application to interact with an IIO driver. + +1. :file:`/sys/bus/iio/iio:device{X}/`, this represents a hardware sensor + and groups together the data channels of the same chip. +2. :file:`/dev/iio:device{X}`, character device node interface used for + buffered data transfer and for event information retrieval. + +A typical IIO driver will register itself as an :doc:`I2C <../i2c>` or +:doc:`SPI <../spi>` driver and will create two routines, probe and remove. + +At probe: + +1. Call :c:func:`iio_device_alloc()`, which allocates memory for an IIO device. +2. Initialize IIO device fields with driver specific information (e.g. + device name, device channels). +3. Call :c:func:`iio_device_register()`, this registers the device with the + IIO core. After this call the device is ready to accept requests from user + space applications. + +At remove, we free the resources allocated in probe in reverse order: + +1. :c:func:`iio_device_unregister()`, unregister the device from the IIO core. +2. :c:func:`iio_device_free()`, free the memory allocated for the IIO device. 
+ +IIO device sysfs interface +========================== + +Attributes are sysfs files used to expose chip info and also allowing +applications to set various configuration parameters. For device with +index X, attributes can be found under /sys/bus/iio/iio:deviceX/ directory. +Common attributes are: + +* :file:`name`, description of the physical chip. +* :file:`dev`, shows the major:minor pair associated with + :file:`/dev/iio:deviceX` node. +* :file:`sampling_frequency_available`, available discrete set of sampling + frequency values for device. +* Available standard attributes for IIO devices are described in the + :file:`Documentation/ABI/testing/sysfs-bus-iio` file in the Linux kernel + sources. + +IIO device channels +=================== + +struct :c:type:`iio_chan_spec` - specification of a single channel + +An IIO device channel is a representation of a data channel. An IIO device can +have one or multiple channels. For example: + +* a thermometer sensor has one channel representing the temperature measurement. +* a light sensor with two channels indicating the measurements in the visible + and infrared spectrum. +* an accelerometer can have up to 3 channels representing acceleration on X, Y + and Z axes. + +An IIO channel is described by the struct :c:type:`iio_chan_spec`. +A thermometer driver for the temperature sensor in the example above would +have to describe its channel as follows:: + + static const struct iio_chan_spec temp_channel[] = { + { + .type = IIO_TEMP, + .info_mask_separate = BIT(IIO_CHAN_INFO_PROCESSED), + }, + }; + +Channel sysfs attributes exposed to userspace are specified in the form of +bitmasks. 
Depending on their shared info, attributes can be set in one of the +following masks: + +* **info_mask_separate**, attributes will be specific to + this channel +* **info_mask_shared_by_type**, attributes are shared by all channels of the + same type +* **info_mask_shared_by_dir**, attributes are shared by all channels of the same + direction +* **info_mask_shared_by_all**, attributes are shared by all channels + +When there are multiple data channels per channel type we have two ways to +distinguish between them: + +* set **.modified** field of :c:type:`iio_chan_spec` to 1. Modifiers are + specified using **.channel2** field of the same :c:type:`iio_chan_spec` + structure and are used to indicate a physically unique characteristic of the + channel such as its direction or spectral response. For example, a light + sensor can have two channels, one for infrared light and one for both + infrared and visible light. +* set **.indexed** field of :c:type:`iio_chan_spec` to 1. In this case the + channel is simply another instance with an index specified by the **.channel** + field. 
+ +Here is how we can make use of the channel's modifiers:: + + static const struct iio_chan_spec light_channels[] = { + { + .type = IIO_INTENSITY, + .modified = 1, + .channel2 = IIO_MOD_LIGHT_IR, + .info_mask_separate = BIT(IIO_CHAN_INFO_RAW), + .info_mask_shared_by_all = BIT(IIO_CHAN_INFO_SAMP_FREQ), + }, + { + .type = IIO_INTENSITY, + .modified = 1, + .channel2 = IIO_MOD_LIGHT_BOTH, + .info_mask_separate = BIT(IIO_CHAN_INFO_RAW), + .info_mask_shared_by_all = BIT(IIO_CHAN_INFO_SAMP_FREQ), + }, + { + .type = IIO_LIGHT, + .info_mask_separate = BIT(IIO_CHAN_INFO_PROCESSED), + .info_mask_shared_by_all = BIT(IIO_CHAN_INFO_SAMP_FREQ), + }, + }; + +This channel's definition will generate two separate sysfs files for raw data +retrieval: + +* :file:`/sys/bus/iio/iio:device{X}/in_intensity_ir_raw` +* :file:`/sys/bus/iio/iio:device{X}/in_intensity_both_raw` + +one file for processed data: + +* :file:`/sys/bus/iio/iio:device{X}/in_illuminance_input` + +and one shared sysfs file for sampling frequency: + +* :file:`/sys/bus/iio/iio:device{X}/sampling_frequency`. + +Here is how we can make use of the channel's indexing:: + + static const struct iio_chan_spec light_channels[] = { + { + .type = IIO_VOLTAGE, + .indexed = 1, + .channel = 0, + .info_mask_separate = BIT(IIO_CHAN_INFO_RAW), + }, + { + .type = IIO_VOLTAGE, + .indexed = 1, + .channel = 1, + .info_mask_separate = BIT(IIO_CHAN_INFO_RAW), + }, + }; + +This will generate two separate attribute files for raw data retrieval: + +* :file:`/sys/bus/iio/devices/iio:device{X}/in_voltage0_raw`, representing + voltage measurement for channel 0. +* :file:`/sys/bus/iio/devices/iio:device{X}/in_voltage1_raw`, representing + voltage measurement for channel 1. + +More details +============ +.. kernel-doc:: include/linux/iio/iio.h +.. 
kernel-doc:: drivers/iio/industrialio-core.c + :export: diff --git a/Documentation/driver-api/iio/index.rst b/Documentation/driver-api/iio/index.rst new file mode 100644 index 000000000000..e5c3922d1b6f --- /dev/null +++ b/Documentation/driver-api/iio/index.rst @@ -0,0 +1,17 @@ +.. include:: <isonum.txt> + +Industrial I/O +============== + +**Copyright** |copy| 2015 Intel Corporation + +Contents: + +.. toctree:: + :maxdepth: 2 + + intro + core + buffers + triggers + triggered-buffers diff --git a/Documentation/driver-api/iio/intro.rst b/Documentation/driver-api/iio/intro.rst new file mode 100644 index 000000000000..3653fbd57069 --- /dev/null +++ b/Documentation/driver-api/iio/intro.rst @@ -0,0 +1,33 @@ +.. include:: <isonum.txt> + +============ +Introduction +============ + +The main purpose of the Industrial I/O subsystem (IIO) is to provide support +for devices that in some sense perform either +analog-to-digital conversion (ADC) or digital-to-analog conversion (DAC) +or both. The aim is to fill the gap between the somewhat similar hwmon and +:doc:`input <../input>` subsystems. Hwmon is directed at low sample rate +sensors used to monitor and control the system itself, like fan speed control +or temperature measurement. :doc:`Input <../input>` is, as its name suggests, +focused on human interaction input devices (keyboard, mouse, touchscreen). +In some cases there is considerable overlap between these and IIO. + +Devices that fall into this category include: + +* analog to digital converters (ADCs) +* accelerometers +* capacitance to digital converters (CDCs) +* digital to analog converters (DACs) +* gyroscopes +* inertial measurement units (IMUs) +* color and light sensors +* magnetometers +* pressure sensors +* proximity sensors +* temperature sensors + +Usually these sensors are connected via :doc:`SPI <../spi>` or +:doc:`I2C <../i2c>`. A common use case of the sensors devices is to have +combined functionality (e.g. light plus proximity sensor). 
diff --git a/Documentation/driver-api/iio/triggered-buffers.rst b/Documentation/driver-api/iio/triggered-buffers.rst new file mode 100644 index 000000000000..0db12660cc90 --- /dev/null +++ b/Documentation/driver-api/iio/triggered-buffers.rst @@ -0,0 +1,69 @@ +================= +Triggered Buffers +================= + +Now that we know what buffers and triggers are, let's see how they work together. + +IIO triggered buffer setup +========================== + +* :c:func:`iio_triggered_buffer_setup` — Setup triggered buffer and pollfunc +* :c:func:`iio_triggered_buffer_cleanup` — Free resources allocated by + :c:func:`iio_triggered_buffer_setup` +* struct :c:type:`iio_buffer_setup_ops` — buffer setup related callbacks + +A typical triggered buffer setup looks like this:: + + const struct iio_buffer_setup_ops sensor_buffer_setup_ops = { + .preenable = sensor_buffer_preenable, + .postenable = sensor_buffer_postenable, + .postdisable = sensor_buffer_postdisable, + .predisable = sensor_buffer_predisable, + }; + + irqreturn_t sensor_iio_pollfunc(int irq, void *p) + { + struct iio_poll_func *pf = p; + + /* top half: only record the timestamp */ + pf->timestamp = iio_get_time_ns(pf->indio_dev); + return IRQ_WAKE_THREAD; + } + + irqreturn_t sensor_trigger_handler(int irq, void *p) + { + struct iio_poll_func *pf = p; + struct iio_dev *indio_dev = pf->indio_dev; + u16 buf[8]; + int bit, i = 0; + + /* read data for each active channel */ + for_each_set_bit(bit, indio_dev->active_scan_mask, + indio_dev->masklength) + buf[i++] = sensor_get_data(bit); + + iio_push_to_buffers_with_timestamp(indio_dev, buf, pf->timestamp); + + iio_trigger_notify_done(indio_dev->trig); + return IRQ_HANDLED; + } + + /* setup triggered buffer, usually in probe function */ + iio_triggered_buffer_setup(indio_dev, sensor_iio_pollfunc, + sensor_trigger_handler, + &sensor_buffer_setup_ops); + +The important things to notice here are: + +* :c:type:`iio_buffer_setup_ops`, the buffer setup functions to be called at + predefined points in the buffer configuration sequence (e.g. before enable, + after disable). If not specified, the IIO core uses the default + iio_triggered_buffer_setup_ops. 
+* **sensor_iio_pollfunc**, the function that will be used as top half of poll + function. It should do as little processing as possible, because it runs in + interrupt context. The most common operation is recording of the current + timestamp and for this reason one can use the IIO core defined + :c:func:`iio_pollfunc_store_time` function. +* **sensor_trigger_handler**, the function that will be used as bottom half of + the poll function. This runs in the context of a kernel thread and all the + processing takes place here. It usually reads data from the device and + stores it in the internal buffer together with the timestamp recorded in the + top half. + +More details +============ +.. kernel-doc:: drivers/iio/buffer/industrialio-triggered-buffer.c diff --git a/Documentation/driver-api/iio/triggers.rst b/Documentation/driver-api/iio/triggers.rst new file mode 100644 index 000000000000..f89d37e7dd82 --- /dev/null +++ b/Documentation/driver-api/iio/triggers.rst @@ -0,0 +1,80 @@ +======== +Triggers +======== + +* struct :c:type:`iio_trigger` — industrial I/O trigger device +* :c:func:`devm_iio_trigger_alloc` — Resource-managed iio_trigger_alloc +* :c:func:`devm_iio_trigger_free` — Resource-managed iio_trigger_free +* :c:func:`devm_iio_trigger_register` — Resource-managed iio_trigger_register +* :c:func:`devm_iio_trigger_unregister` — Resource-managed + iio_trigger_unregister +* :c:func:`iio_trigger_validate_own_device` — Check if a trigger and IIO + device belong to the same device + +In many situations it is useful for a driver to be able to capture data based +on some external event (trigger) as opposed to periodically polling for data. +An IIO trigger can be provided by a device driver that also has an IIO device +based on hardware generated events (e.g. data ready or threshold exceeded) or +provided by a separate driver from an independent interrupt source (e.g. 
GPIO +line connected to some external system, timer interrupt or user space writing +a specific file in sysfs). A trigger may initiate data capture for a number of +sensors and also it may be completely unrelated to the sensor itself. + +IIO trigger sysfs interface +=========================== + +There are two locations in sysfs related to triggers: + +* :file:`/sys/bus/iio/devices/trigger{Y}/*`, this directory is created once an + IIO trigger is registered with the IIO core and corresponds to trigger + with index Y. + Because triggers can be very different depending on type there are few + standard attributes that we can describe here: + + * :file:`name`, trigger name that can be later used for association with a + device. + * :file:`sampling_frequency`, some timer based triggers use this attribute to + specify the frequency for trigger calls. + +* :file:`/sys/bus/iio/devices/iio:device{X}/trigger/*`, this directory is + created once the device supports a triggered buffer. We can associate a + trigger with our device by writing the trigger's name in the + :file:`current_trigger` file. + +IIO trigger setup +================= + +Let's see a simple example of how to set up a trigger to be used by a driver:: + + struct iio_trigger_ops trigger_ops = { + .set_trigger_state = sample_trigger_state, + .validate_device = sample_validate_device, + }; + + struct iio_trigger *trig; + + /* first, allocate memory for our trigger */ + trig = iio_trigger_alloc(dev, "trig-%s-%d", name, idx); + + /* setup trigger operations field */ + trig->ops = &trigger_ops; + + /* now register the trigger with the IIO core */ + iio_trigger_register(trig); + +IIO trigger ops +=============== + +* struct :c:type:`iio_trigger_ops` — operations structure for an iio_trigger. + +Notice that a trigger has a set of operations attached: + +* :file:`set_trigger_state`, switch the trigger on/off on demand. +* :file:`validate_device`, function to validate the device when the current + trigger gets changed. 
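Returning to the sysfs interface described earlier in this document, associating a trigger with a device amounts to writing the trigger's name to the device's :file:`current_trigger` file. The following user-space C sketch shows one way to do that. The helper name ``iio_set_current_trigger`` and the example path in the comment are assumptions made for illustration; neither is provided by the kernel or by any library.

```c
#include <stdio.h>

/*
 * Write trigger_name to the current_trigger attribute found at path, e.g.
 * "/sys/bus/iio/devices/iio:device0/trigger/current_trigger".
 * Hypothetical helper, for illustration only.
 * Returns 0 on success and -1 on failure.
 */
static int iio_set_current_trigger(const char *path, const char *trigger_name)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;

	if (fputs(trigger_name, f) == EOF) {
		fclose(f);
		return -1;
	}

	/* buffered data is flushed on close, so check its result as well */
	return fclose(f) == 0 ? 0 : -1;
}
```

The string to write is the one exposed by the trigger's own :file:`name` attribute.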
+ +More details +============ +.. kernel-doc:: include/linux/iio/trigger.h +.. kernel-doc:: drivers/iio/industrialio-trigger.c + :export: diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst index dbd34c9c1d93..60db00d1532b 100644 --- a/Documentation/driver-api/index.rst +++ b/Documentation/driver-api/index.rst @@ -16,11 +16,15 @@ available subsections can be seen below. basics infrastructure + pm/index + device-io dma-buf device_link message-based sound frame-buffer + regulator + iio/index input usb spi diff --git a/Documentation/driver-api/pm/conf.py b/Documentation/driver-api/pm/conf.py new file mode 100644 index 000000000000..a89fac11272f --- /dev/null +++ b/Documentation/driver-api/pm/conf.py @@ -0,0 +1,10 @@ +# -*- coding: utf-8; mode: python -*- + +project = "Device Power Management" + +tags.add("subproject") + +latex_documents = [ + ('index', 'pm.tex', project, + 'The kernel development community', 'manual'), +] diff --git a/Documentation/driver-api/pm/devices.rst b/Documentation/driver-api/pm/devices.rst new file mode 100644 index 000000000000..bedd32388dac --- /dev/null +++ b/Documentation/driver-api/pm/devices.rst @@ -0,0 +1,736 @@ +.. |struct dev_pm_ops| replace:: :c:type:`struct dev_pm_ops <dev_pm_ops>` +.. |struct dev_pm_domain| replace:: :c:type:`struct dev_pm_domain <dev_pm_domain>` +.. |struct bus_type| replace:: :c:type:`struct bus_type <bus_type>` +.. |struct device_type| replace:: :c:type:`struct device_type <device_type>` +.. |struct class| replace:: :c:type:`struct class <class>` +.. |struct wakeup_source| replace:: :c:type:`struct wakeup_source <wakeup_source>` +.. |struct device| replace:: :c:type:`struct device <device>` + +============================== +Device Power Management Basics +============================== + +:: + + Copyright (c) 2010-2011 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. + Copyright (c) 2010 Alan Stern <stern@rowland.harvard.edu> + Copyright (c) 2016 Intel Corp., Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> + +Most of the code in Linux is device drivers, so most of the Linux power +management (PM) code is also driver-specific. Most drivers will do very +little; others, especially for platforms with small batteries (like cell +phones), will do a lot. + +This writeup gives an overview of how drivers interact with system-wide +power management goals, emphasizing the models and interfaces that are +shared by everything that hooks up to the driver model core. Read it as +background for the domain-specific work you'd do with any specific driver. + + +Two Models for Device Power Management +====================================== + +Drivers will use one or both of these models to put devices into low-power +states: + + System Sleep model: + + Drivers can enter low-power states as part of entering system-wide + low-power states like "suspend" (also known as "suspend-to-RAM"), or + (mostly for systems with disks) "hibernation" (also known as + "suspend-to-disk"). + + This is something that device, bus, and class drivers collaborate on + by implementing various role-specific suspend and resume methods to + cleanly power down hardware and software subsystems, then reactivate + them without loss of data. + + Some drivers can manage hardware wakeup events, which make the system + leave the low-power state. This feature may be enabled or disabled + using the relevant :file:`/sys/devices/.../power/wakeup` file (for + Ethernet drivers the ioctl interface used by ethtool may also be used + for this purpose); enabling it may cost some power usage, but let the + whole system enter low-power states more often. + + Runtime Power Management model: + + Devices may also be put into low-power states while the system is + running, independently of other power management activity in principle. 
+ However, devices are not generally independent of each other (for + example, a parent device cannot be suspended unless all of its child + devices have been suspended). Moreover, depending on the bus type the + device is on, it may be necessary to carry out some bus-specific + operations on the device for this purpose. Devices put into low power + states at run time may require special handling during system-wide power + transitions (suspend or hibernation). + + For these reasons not only the device driver itself, but also the + appropriate subsystem (bus type, device type or device class) driver and + the PM core are involved in runtime power management. As in the system + sleep power management case, they need to collaborate by implementing + various role-specific suspend and resume methods, so that the hardware + is cleanly powered down and reactivated without data or service loss. + +There's not a lot to be said about those low-power states except that they are +very system-specific, and often device-specific. Also, that if enough devices +have been put into low-power states (at runtime), the effect may be very similar +to entering some system-wide low-power state (system sleep) ... and that +synergies exist, so that several drivers using runtime PM might put the system +into a state where even deeper power saving options are available. + +Most suspended devices will have quiesced all I/O: no more DMA or IRQs (except +for wakeup events), no more data read or written, and requests from upstream +drivers are no longer accepted. A given bus or platform may have different +requirements though. + +Examples of hardware wakeup events include an alarm from a real time clock, +network wake-on-LAN packets, keyboard or mouse activity, and media insertion +or removal (for PCMCIA, MMC/SD, USB, and so on). 
+ +Interfaces for Entering System Sleep States +=========================================== + +There are programming interfaces provided for subsystems (bus type, device type, +device class) and device drivers to allow them to participate in the power +management of devices they are concerned with. These interfaces cover both +system sleep and runtime power management. + + +Device Power Management Operations +---------------------------------- + +Device power management operations, at the subsystem level as well as at the +device driver level, are implemented by defining and populating objects of type +|struct dev_pm_ops| defined in :file:`include/linux/pm.h`. The roles of the +methods included in it will be explained in what follows. For now, it should be +sufficient to remember that the last three methods are specific to runtime power +management while the remaining ones are used during system-wide power +transitions. + +There also is a deprecated "old" or "legacy" interface for power management +operations available at least for some subsystems. This approach does not use +|struct dev_pm_ops| objects and it is suitable only for implementing system +sleep power management methods in a limited way. Therefore it is not described +in this document, so please refer directly to the source code for more +information about it. + + +Subsystem-Level Methods +----------------------- + +The core methods to suspend and resume devices reside in +|struct dev_pm_ops| pointed to by the :c:member:`ops` member of +|struct dev_pm_domain|, or by the :c:member:`pm` member of |struct bus_type|, +|struct device_type| and |struct class|. They are mostly of interest to the +people writing infrastructure for platforms and buses, like PCI or USB, or +device type and device class drivers. They also are relevant to the writers of +device drivers whose subsystems (PM domains, device types, device classes and +bus types) don't provide all power management methods. 
+ +Bus drivers implement these methods as appropriate for the hardware and the +drivers using it; PCI works differently from USB, and so on. Not many people +write subsystem-level drivers; most driver code is a "device driver" that builds +on top of bus-specific framework code. + +For more information on these driver calls, see the description later; +they are called in phases for every device, respecting the parent-child +sequencing in the driver model tree. + + +:file:`/sys/devices/.../power/wakeup` files +------------------------------------------- + +All device objects in the driver model contain fields that control the handling +of system wakeup events (hardware signals that can force the system out of a +sleep state). These fields are initialized by bus or device driver code using +:c:func:`device_set_wakeup_capable()` and :c:func:`device_set_wakeup_enable()`, +defined in :file:`include/linux/pm_wakeup.h`. + +The :c:member:`power.can_wakeup` flag just records whether the device (and its +driver) can physically support wakeup events. The +:c:func:`device_set_wakeup_capable()` routine affects this flag. The +:c:member:`power.wakeup` field is a pointer to an object of type +|struct wakeup_source| used for controlling whether or not the device should use +its system wakeup mechanism and for notifying the PM core of system wakeup +events signaled by the device. This object is only present for wakeup-capable +devices (i.e. devices whose :c:member:`can_wakeup` flags are set) and is created +(or removed) by :c:func:`device_set_wakeup_capable()`. + +Whether or not a device is capable of issuing wakeup events is a hardware +matter, and the kernel is responsible for keeping track of it. By contrast, +whether or not a wakeup-capable device should issue wakeup events is a policy +decision, and it is managed by user space through a sysfs attribute: the +:file:`power/wakeup` file. 
User space can write the "enabled" or "disabled" +strings to it to indicate whether or not, respectively, the device is supposed +to signal system wakeup. This file is only present if the +:c:member:`power.wakeup` object exists for the given device and is created (or +removed) along with that object, by :c:func:`device_set_wakeup_capable()`. +Reads from the file will return the corresponding string. + +The initial value in the :file:`power/wakeup` file is "disabled" for the +majority of devices; the major exceptions are power buttons, keyboards, and +Ethernet adapters whose WoL (wake-on-LAN) feature has been set up with ethtool. +It should also default to "enabled" for devices that don't generate wakeup +requests on their own but merely forward wakeup requests from one bus to another +(like PCI Express ports). + +The :c:func:`device_may_wakeup()` routine returns true only if the +:c:member:`power.wakeup` object exists and the corresponding :file:`power/wakeup` +file contains the "enabled" string. This information is used by subsystems, +like the PCI bus type code, to see whether or not to enable the devices' wakeup +mechanisms. If device wakeup mechanisms are enabled or disabled directly by +drivers, they also should use :c:func:`device_may_wakeup()` to decide what to do +during a system sleep transition. Device drivers, however, are not expected to +call :c:func:`device_set_wakeup_enable()` directly in any case. + +It ought to be noted that system wakeup is conceptually different from "remote +wakeup" used by runtime power management, although it may be supported by the +same physical mechanism. Remote wakeup is a feature allowing devices in +low-power states to trigger specific interrupts to signal conditions in which +they should be put into the full-power state. Those interrupts may or may not +be used to signal system wakeup events, depending on the hardware design. On +some systems it is impossible to trigger them from system sleep states. 
In any +case, remote wakeup should always be enabled for runtime power management for +all devices and drivers that support it. + + +:file:`/sys/devices/.../power/control` files +-------------------------------------------- + +Each device in the driver model has a flag to control whether it is subject to +runtime power management. This flag, :c:member:`runtime_auto`, is initialized +by the bus type (or generally subsystem) code using :c:func:`pm_runtime_allow()` +or :c:func:`pm_runtime_forbid()`; the default is to allow runtime power +management. + +The setting can be adjusted by user space by writing either "on" or "auto" to +the device's :file:`power/control` sysfs file. Writing "auto" calls +:c:func:`pm_runtime_allow()`, setting the flag and allowing the device to be +runtime power-managed by its driver. Writing "on" calls +:c:func:`pm_runtime_forbid()`, clearing the flag, returning the device to full +power if it was in a low-power state, and preventing the +device from being runtime power-managed. User space can check the current value +of the :c:member:`runtime_auto` flag by reading that file. + +The device's :c:member:`runtime_auto` flag has no effect on the handling of +system-wide power transitions. In particular, the device can (and in the +majority of cases should and will) be put into a low-power state during a +system-wide transition to a sleep state even though its :c:member:`runtime_auto` +flag is clear. + +For more information about the runtime power management framework, refer to +:file:`Documentation/power/runtime_pm.txt`. + + +Calling Drivers to Enter and Leave System Sleep States +====================================================== + +When the system goes into a sleep state, each device's driver is asked to +suspend the device by putting it into a state compatible with the target +system state. That's usually some version of "off", but the details are +system-specific. 
Also, wakeup-enabled devices will usually stay partly +functional in order to wake the system. + +When the system leaves that low-power state, the device's driver is asked to +resume it by returning it to full power. The suspend and resume operations +always go together, and both are multi-phase operations. + +For simple drivers, suspend might quiesce the device using class code +and then turn its hardware as "off" as possible during suspend_noirq. The +matching resume calls would then completely reinitialize the hardware +before reactivating its class I/O queues. + +More power-aware drivers might prepare the devices for triggering system wakeup +events. + + +Call Sequence Guarantees +------------------------ + +To ensure that bridges and similar links needing to talk to a device are +available when the device is suspended or resumed, the device hierarchy is +walked in a bottom-up order to suspend devices. A top-down order is +used to resume those devices. + +The ordering of the device hierarchy is defined by the order in which devices +get registered: a child can never be registered, probed or resumed before +its parent; and can't be removed or suspended after that parent. + +The policy is that the device hierarchy should match hardware bus topology. +[Or at least the control bus, for devices which use multiple busses.] +In particular, this means that a device registration may fail if the parent of +the device is suspending (i.e. has been chosen by the PM core as the next +device to suspend) or has already suspended, as well as after all of the other +devices have been suspended. Device drivers must be prepared to cope with such +situations. + + +System Power Management Phases +------------------------------ + +Suspending or resuming the system is done in several phases. Different phases +are used for suspend-to-idle, shallow (standby), and deep ("suspend-to-RAM") +sleep states and the hibernation state ("suspend-to-disk"). 
Each phase involves +executing callbacks for every device before the next phase begins. Not all +buses or classes support all these callbacks and not all drivers use all the +callbacks. The various phases always run after tasks have been frozen and +before they are unfrozen. Furthermore, the ``*_noirq`` phases run at a time +when IRQ handlers have been disabled (except for those marked with the +IRQF_NO_SUSPEND flag). + +All phases use PM domain, bus, type, class or driver callbacks (that is, methods +defined in ``dev->pm_domain->ops``, ``dev->bus->pm``, ``dev->type->pm``, +``dev->class->pm`` or ``dev->driver->pm``). These callbacks are regarded by the +PM core as mutually exclusive. Moreover, PM domain callbacks always take +precedence over all of the other callbacks and, for example, type callbacks take +precedence over bus, class and driver callbacks. To be precise, the following +rules are used to determine which callback to execute in the given phase: + + 1. If ``dev->pm_domain`` is present, the PM core will choose the callback + provided by ``dev->pm_domain->ops`` for execution. + + 2. Otherwise, if both ``dev->type`` and ``dev->type->pm`` are present, the + callback provided by ``dev->type->pm`` will be chosen for execution. + + 3. Otherwise, if both ``dev->class`` and ``dev->class->pm`` are present, + the callback provided by ``dev->class->pm`` will be chosen for + execution. + + 4. Otherwise, if both ``dev->bus`` and ``dev->bus->pm`` are present, the + callback provided by ``dev->bus->pm`` will be chosen for execution. + +This allows PM domains and device types to override callbacks provided by bus +types or device classes if necessary. + +The PM domain, type, class and bus callbacks may in turn invoke device- or +driver-specific methods stored in ``dev->driver->pm``, but they don't have to do +that. 
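The four selection rules listed above can be illustrated with a small user-space model. This is only a sketch, not the PM core's code: ``struct device_model``, ``struct subsys_model``, ``enum pm_source`` and ``pick_pm_source()`` are hypothetical stand-ins invented for this example, and only the fields needed to show the precedence order are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct pm_ops_model {
	int (*suspend)(void);
};

struct pm_domain_model {
	struct pm_ops_model ops;	/* mirrors dev->pm_domain->ops */
};

struct subsys_model {			/* used for type, class and bus */
	const struct pm_ops_model *pm;
};

struct device_model {
	const struct pm_domain_model *pm_domain;
	const struct subsys_model *type;
	const struct subsys_model *cls;	/* models dev->class */
	const struct subsys_model *bus;
};

enum pm_source {
	PM_SRC_NONE,
	PM_SRC_DOMAIN,
	PM_SRC_TYPE,
	PM_SRC_CLASS,
	PM_SRC_BUS,
};

/* Apply rules 1-4 in order; the first matching rule wins. */
enum pm_source pick_pm_source(const struct device_model *dev)
{
	assert(dev != NULL);

	if (dev->pm_domain)			/* rule 1 */
		return PM_SRC_DOMAIN;
	if (dev->type && dev->type->pm)		/* rule 2 */
		return PM_SRC_TYPE;
	if (dev->cls && dev->cls->pm)		/* rule 3 */
		return PM_SRC_CLASS;
	if (dev->bus && dev->bus->pm)		/* rule 4 */
		return PM_SRC_BUS;
	return PM_SRC_NONE;			/* no subsystem callbacks */
}
```

In this model a device whose type has no ``pm`` pointer but whose class and bus both do is resolved to the class callbacks, and attaching a PM domain overrides all of them.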
+ +If the subsystem callback chosen for execution is not present, the PM core will +execute the corresponding method from the ``dev->driver->pm`` set instead if +there is one. + + +Entering System Suspend +----------------------- + +When the system goes into the freeze, standby or memory sleep state, +the phases are: ``prepare``, ``suspend``, ``suspend_late``, ``suspend_noirq``. + + 1. The ``prepare`` phase is meant to prevent races by preventing new + devices from being registered; the PM core would never know that all the + children of a device had been suspended if new children could be + registered at will. [By contrast, from the PM core's perspective, + devices may be unregistered at any time.] Unlike the other + suspend-related phases, during the ``prepare`` phase the device + hierarchy is traversed top-down. + + After the ``->prepare`` callback method returns, no new children may be + registered below the device. The method may also prepare the device or + driver in some way for the upcoming system power transition, but it + should not put the device into a low-power state. + + For devices supporting runtime power management, the return value of the + prepare callback can be used to indicate to the PM core that it may + safely leave the device in runtime suspend (if runtime-suspended + already), provided that all of the device's descendants are also left in + runtime suspend. Namely, if the prepare callback returns a positive + number and that happens for all of the descendants of the device too, + and all of them (including the device itself) are runtime-suspended, the + PM core will skip the ``suspend``, ``suspend_late`` and + ``suspend_noirq`` phases as well as all of the corresponding phases of + the subsequent device resume for all of these devices. In that case, + the ``->complete`` callback will be invoked directly after the + ``->prepare`` callback and is entirely responsible for putting the + device into a consistent state as appropriate. 
+ + Note that this direct-complete procedure applies even if the device is + disabled for runtime PM; only the runtime-PM status matters. It follows + that if a device has system-sleep callbacks but does not support runtime + PM, then its prepare callback must never return a positive value. This + is because all such devices are initially set to runtime-suspended with + runtime PM disabled. + + 2. The ``->suspend`` methods should quiesce the device to stop it from + performing I/O. They also may save the device registers and put it into + the appropriate low-power state, depending on the bus type the device is + on, and they may enable wakeup events. + + 3. For a number of devices it is convenient to split suspend into the + "quiesce device" and "save device state" phases, in which cases + ``suspend_late`` is meant to do the latter. It is always executed after + runtime power management has been disabled for the device in question. + + 4. The ``suspend_noirq`` phase occurs after IRQ handlers have been disabled, + which means that the driver's interrupt handler will not be called while + the callback method is running. The ``->suspend_noirq`` methods should + save the values of the device's registers that weren't saved previously + and finally put the device into the appropriate low-power state. + + The majority of subsystems and device drivers need not implement this + callback. However, bus types allowing devices to share interrupt + vectors, like PCI, generally need it; otherwise a driver might encounter + an error during the suspend phase by fielding a shared interrupt + generated by some other device after its own device had been set to low + power. + +At the end of these phases, drivers should have stopped all I/O transactions +(DMA, IRQs), saved enough state that they can re-initialize or restore previous +state (as needed by the hardware), and placed the device into a low-power state. 
+On many platforms they will gate off one or more clock sources; sometimes they +will also switch off power supplies or reduce voltages. [Drivers supporting +runtime PM may already have performed some or all of these steps.] + +If :c:func:`device_may_wakeup(dev)` returns ``true``, the device should be +prepared for generating hardware wakeup signals to trigger a system wakeup event +when the system is in the sleep state. For example, :c:func:`enable_irq_wake()` +might identify GPIO signals hooked up to a switch or other external hardware, +and :c:func:`pci_enable_wake()` does something similar for the PCI PME signal. + +If any of these callbacks returns an error, the system won't enter the desired +low-power state. Instead, the PM core will unwind its actions by resuming all +the devices that were suspended. + + +Leaving System Suspend +---------------------- + +When resuming from freeze, standby or memory sleep, the phases are: +``resume_noirq``, ``resume_early``, ``resume``, ``complete``. + + 1. The ``->resume_noirq`` callback methods should perform any actions + needed before the driver's interrupt handlers are invoked. This + generally means undoing the actions of the ``suspend_noirq`` phase. If + the bus type permits devices to share interrupt vectors, like PCI, the + method should bring the device and its driver into a state in which the + driver can recognize if the device is the source of incoming interrupts, + if any, and handle them correctly. + + For example, the PCI bus type's ``->pm.resume_noirq()`` puts the device + into the full-power state (D0 in the PCI terminology) and restores the + standard configuration registers of the device. Then it calls the + device driver's ``->pm.resume_noirq()`` method to perform device-specific + actions. + + 2. The ``->resume_early`` methods should prepare devices for the execution + of the resume methods. This generally involves undoing the actions of + the preceding ``suspend_late`` phase. + + 3. 
The ``->resume`` methods should bring the device back to its operating + state, so that it can perform normal I/O. This generally involves + undoing the actions of the ``suspend`` phase. + + 4. The ``complete`` phase should undo the actions of the ``prepare`` phase. + For this reason, unlike the other resume-related phases, during the + ``complete`` phase the device hierarchy is traversed bottom-up. + + Note, however, that new children may be registered below the device as + soon as the ``->resume`` callbacks occur; it's not necessary to wait + until the ``complete`` phase with that. + + Moreover, if the preceding ``->prepare`` callback returned a positive + number, the device may have been left in runtime suspend throughout the + whole system suspend and resume (the ``suspend``, ``suspend_late``, + ``suspend_noirq`` phases of system suspend and the ``resume_noirq``, + ``resume_early``, ``resume`` phases of system resume may have been + skipped for it). In that case, the ``->complete`` callback is entirely + responsible for putting the device into a consistent state after system + suspend if necessary. [For example, it may need to queue up a runtime + resume request for the device for this purpose.] To check if that is + the case, the ``->complete`` callback can consult the device's + ``power.direct_complete`` flag. Namely, if that flag is set when the + ``->complete`` callback is being run, it has been called directly after + the preceding ``->prepare`` and special actions may be required + to make the device work correctly afterward. + +At the end of these phases, drivers should be as functional as they were before +suspending: I/O can be performed using DMA and IRQs, and the relevant clocks are +gated on. + +However, the details here may again be platform-specific. For example, +some systems support multiple "run" states, and the mode in effect at +the end of resume might not be the one which preceded suspension. 
+That means availability of certain clocks or power supplies changed, +which could easily affect how a driver works. + +Drivers need to be able to handle hardware which has been reset since all of the +suspend methods were called, for example by complete reinitialization. +This may be the hardest part, and the one most protected by NDA'd documents +and chip errata. It's simplest if the hardware state hasn't changed since +the suspend was carried out, but that can only be guaranteed if the target +system sleep entered was suspend-to-idle. For the other system sleep states +that may not be the case (and usually isn't for ACPI-defined system sleep +states, like S3). + +Drivers must also be prepared to notice that the device has been removed +while the system was powered down, whenever that's physically possible. +PCMCIA, MMC, USB, Firewire, SCSI, and even IDE are common examples of busses +where common Linux platforms will see such removal. Details of how drivers +will notice and handle such removals are currently bus-specific, and often +involve a separate thread. + +These callbacks may return an error value, but the PM core will ignore such +errors since there's nothing it can do about them other than printing them in +the system log. + + +Entering Hibernation +-------------------- + +Hibernating the system is more complicated than putting it into sleep states, +because it involves creating and saving a system image. Therefore there are +more phases for hibernation, with a different set of callbacks. These phases +always run after tasks have been frozen and enough memory has been freed. + +The general procedure for hibernation is to quiesce all devices ("freeze"), +create an image of the system memory while everything is stable, reactivate all +devices ("thaw"), write the image to permanent storage, and finally shut down +the system ("power off"). 
The phases used to accomplish this are: ``prepare``, +``freeze``, ``freeze_late``, ``freeze_noirq``, ``thaw_noirq``, ``thaw_early``, +``thaw``, ``complete``, ``prepare``, ``poweroff``, ``poweroff_late``, +``poweroff_noirq``. + + 1. The ``prepare`` phase is discussed in the "Entering System Suspend" + section above. + + 2. The ``->freeze`` methods should quiesce the device so that it doesn't + generate IRQs or DMA, and they may need to save the values of device + registers. However the device does not have to be put in a low-power + state, and to save time it's best not to do so. Also, the device should + not be prepared to generate wakeup events. + + 3. The ``freeze_late`` phase is analogous to the ``suspend_late`` phase + described earlier, except that the device should not be put into a + low-power state and should not be allowed to generate wakeup events. + + 4. The ``freeze_noirq`` phase is analogous to the ``suspend_noirq`` phase + discussed earlier, except again that the device should not be put into + a low-power state and should not be allowed to generate wakeup events. + +At this point the system image is created. All devices should be inactive and +the contents of memory should remain undisturbed while this happens, so that the +image forms an atomic snapshot of the system state. + + 5. The ``thaw_noirq`` phase is analogous to the ``resume_noirq`` phase + discussed earlier. The main difference is that its methods can assume + the device is in the same state as at the end of the ``freeze_noirq`` + phase. + + 6. The ``thaw_early`` phase is analogous to the ``resume_early`` phase + described above. Its methods should undo the actions of the preceding + ``freeze_late``, if necessary. + + 7. The ``thaw`` phase is analogous to the ``resume`` phase discussed + earlier. Its methods should bring the device back to an operating + state, so that it can be used for saving the image if necessary. + + 8. 
The ``complete`` phase is discussed in the "Leaving System Suspend" + section above. + +At this point the system image is saved, and the devices then need to be +prepared for the upcoming system shutdown. This is much like suspending them +before putting the system into the suspend-to-idle, shallow or deep sleep state, +and the phases are similar. + + 9. The ``prepare`` phase is discussed above. + + 10. The ``poweroff`` phase is analogous to the ``suspend`` phase. + + 11. The ``poweroff_late`` phase is analogous to the ``suspend_late`` phase. + + 12. The ``poweroff_noirq`` phase is analogous to the ``suspend_noirq`` phase. + +The ``->poweroff``, ``->poweroff_late`` and ``->poweroff_noirq`` callbacks +should do essentially the same things as the ``->suspend``, ``->suspend_late`` +and ``->suspend_noirq`` callbacks, respectively. The only notable difference is +that they need not store the device register values, because the registers +should already have been stored during the ``freeze``, ``freeze_late`` or +``freeze_noirq`` phases. + + +Leaving Hibernation +------------------- + +Resuming from hibernation is, again, more complicated than resuming from a sleep +state in which the contents of main memory are preserved, because it requires +a system image to be loaded into memory and the pre-hibernation memory contents +to be restored before control can be passed back to the image kernel. + +Although in principle the image might be loaded into memory and the +pre-hibernation memory contents restored by the boot loader, in practice this +can't be done because boot loaders aren't smart enough and there is no +established protocol for passing the necessary information. So instead, the +boot loader loads a fresh instance of the kernel, called "the restore kernel", +into memory and passes control to it in the usual way. Then the restore kernel +reads the system image, restores the pre-hibernation memory contents, and passes +control to the image kernel. 
Thus two different kernel instances are involved +in resuming from hibernation. In fact, the restore kernel may be completely +different from the image kernel: a different configuration and even a different +version. This has important consequences for device drivers and their +subsystems. + +To be able to load the system image into memory, the restore kernel needs to +include at least a subset of device drivers allowing it to access the storage +medium containing the image, although it doesn't need to include all of the +drivers present in the image kernel. After the image has been loaded, the +devices managed by the boot kernel need to be prepared for passing control back +to the image kernel. This is very similar to the initial steps involved in +creating a system image, and it is accomplished in the same way, using +``prepare``, ``freeze``, and ``freeze_noirq`` phases. However, the devices +affected by these phases are only those having drivers in the restore kernel; +other devices will still be in whatever state the boot loader left them. + +Should the restoration of the pre-hibernation memory contents fail, the restore +kernel would go through the "thawing" procedure described above, using the +``thaw_noirq``, ``thaw_early``, ``thaw``, and ``complete`` phases, and then +continue running normally. This happens only rarely. Most often the +pre-hibernation memory contents are restored successfully and control is passed +to the image kernel, which then becomes responsible for bringing the system back +to the working state. + +To achieve this, the image kernel must restore the devices' pre-hibernation +functionality. The operation is much like waking up from a sleep state (with +the memory contents preserved), although it involves different phases: +``restore_noirq``, ``restore_early``, ``restore``, ``complete``. + + 1. The ``restore_noirq`` phase is analogous to the ``resume_noirq`` phase. + + 2. 
The ``restore_early`` phase is analogous to the ``resume_early`` phase. + + 3. The ``restore`` phase is analogous to the ``resume`` phase. + + 4. The ``complete`` phase is discussed above. + +The main difference from ``resume[_early|_noirq]`` is that +``restore[_early|_noirq]`` must assume the device has been accessed and +reconfigured by the boot loader or the restore kernel. Consequently, the state +of the device may be different from the state remembered from the ``freeze``, +``freeze_late`` and ``freeze_noirq`` phases. The device may even need to be +reset and completely re-initialized. In many cases this difference doesn't +matter, so the ``->resume[_early|_noirq]`` and ``->restore[_early|_noirq]`` +method pointers can be set to the same routines. Nevertheless, different +callback pointers are used in case there is a situation where it actually does +matter. + + +Power Management Notifiers +========================== + +There are some operations that cannot be carried out by the power management +callbacks discussed above, because the callbacks occur too late or too early. +To handle these cases, subsystems and device drivers may register power +management notifiers that are called before tasks are frozen and after they have +been thawed. Generally speaking, the PM notifiers are suitable for performing +actions that either require user space to be available, or at least won't +interfere with user space. + +For details refer to :doc:`notifiers`. + + +Device Low-Power (suspend) States +================================= + +Device low-power states aren't standard. One device might only handle +"on" and "off", while another might support a dozen different versions of +"on" (how many engines are active?), plus a state that gets back to "on" +faster than from a full "off". + +Some buses define rules about what different suspend states mean. 
PCI +gives one example: after the suspend sequence completes, a non-legacy +PCI device may not perform DMA or issue IRQs, and any wakeup events it +issues would be issued through the PME# bus signal. Plus, there are +several PCI-standard device states, some of which are optional. + +In contrast, integrated system-on-chip processors often use IRQs as the +wakeup event sources (so drivers would call :c:func:`enable_irq_wake`) and +might be able to treat DMA completion as a wakeup event (sometimes DMA can stay +active too, it'd only be the CPU and some peripherals that sleep). + +Some details here may be platform-specific. Systems may have devices that +can be fully active in certain sleep states, such as an LCD display that's +refreshed using DMA while most of the system is sleeping lightly ... and +its frame buffer might even be updated by a DSP or other non-Linux CPU while +the Linux control processor stays idle. + +Moreover, the specific actions taken may depend on the target system state. +One target system state might allow a given device to be very operational; +another might require a hard shut down with re-initialization on resume. +And two different target systems might use the same device in different +ways; the aforementioned LCD might be active in one product's "standby", +but a different product using the same SOC might work differently. + + +Device Power Management Domains +=============================== + +Sometimes devices share reference clocks or other power resources. In those +cases it generally is not possible to put devices into low-power states +individually. Instead, a set of devices sharing a power resource can be put +into a low-power state together at the same time by turning off the shared +power resource. Of course, they also need to be put into the full-power state +together, by turning the shared power resource on. A set of devices with this +property is often referred to as a power domain. 
A power domain may also be +nested inside another power domain. The nested domain is referred to as the +sub-domain of the parent domain. + +Support for power domains is provided through the :c:member:`pm_domain` field of +|struct device|. This field is a pointer to an object of type +|struct dev_pm_domain|, defined in :file:`include/linux/pm.h`, providing a set +of power management callbacks analogous to the subsystem-level and device driver +callbacks that are executed for the given device during all power transitions, +instead of the respective subsystem-level callbacks. Specifically, if a +device's :c:member:`pm_domain` pointer is not NULL, the ``->suspend()`` callback +from the object pointed to by it will be executed instead of its subsystem's +(e.g. bus type's) ``->suspend()`` callback and analogously for all of the +remaining callbacks. In other words, power management domain callbacks, if +defined for the given device, always take precedence over the callbacks provided +by the device's subsystem (e.g. bus type). + +The support for device power management domains is only relevant to platforms +needing to use the same device driver power management callbacks in many +different power domain configurations and wanting to avoid incorporating the +support for power domains into subsystem-level callbacks, for example by +modifying the platform bus type. Other platforms need not implement it or take +it into account in any way. + +Devices may be defined as IRQ-safe which indicates to the PM core that their +runtime PM callbacks may be invoked with disabled interrupts (see +:file:`Documentation/power/runtime_pm.txt` for more information). If an +IRQ-safe device belongs to a PM domain, the runtime PM of the domain will be +disallowed, unless the domain itself is defined as IRQ-safe. However, it +makes sense to define a PM domain as IRQ-safe only if all the devices in it +are IRQ-safe. 
Moreover, if an IRQ-safe domain has a parent domain, the runtime +PM of the parent is only allowed if the parent itself is IRQ-safe too with the +additional restriction that all child domains of an IRQ-safe parent must also +be IRQ-safe. + + +Runtime Power Management +======================== + +Many devices are able to dynamically power down while the system is still +running. This feature is useful for devices that are not being used, and +can offer significant power savings on a running system. These devices +often support a range of runtime power states, which might use names such +as "off", "sleep", "idle", "active", and so on. Those states will in some +cases (like PCI) be partially constrained by the bus the device uses, and will +usually include hardware states that are also used in system sleep states. + +A system-wide power transition can be started while some devices are in low +power states due to runtime power management. The system sleep PM callbacks +should recognize such situations and react to them appropriately, but the +necessary actions are subsystem-specific. + +In some cases the decision may be made at the subsystem level while in other +cases the device driver may be left to decide. In some cases it may be +desirable to leave a suspended device in that state during a system-wide power +transition, but in other cases the device must be put back into the full-power +state temporarily, for example so that its system wakeup capability can be +disabled. This all depends on the hardware and the design of the subsystem and +device driver in question. + +During system-wide resume from a sleep state it's easiest to put devices into +the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`. +Refer to that document for more information regarding this particular issue as +well as for information on the device runtime power management framework in +general. 
diff --git a/Documentation/driver-api/pm/index.rst b/Documentation/driver-api/pm/index.rst new file mode 100644 index 000000000000..2f6d0e9cf6b7 --- /dev/null +++ b/Documentation/driver-api/pm/index.rst @@ -0,0 +1,16 @@ +======================= +Device Power Management +======================= + +.. toctree:: + + devices + notifiers + types + +.. only:: subproject and html + + Indices + ======= + + * :ref:`genindex` diff --git a/Documentation/driver-api/pm/notifiers.rst b/Documentation/driver-api/pm/notifiers.rst new file mode 100644 index 000000000000..62f860026992 --- /dev/null +++ b/Documentation/driver-api/pm/notifiers.rst @@ -0,0 +1,70 @@ +============================= +Suspend/Hibernation Notifiers +============================= + +:: + + Copyright (c) 2016 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com> + +There are some operations that subsystems or drivers may want to carry out +before hibernation/suspend or after restore/resume, but they require the system +to be fully functional, so the drivers' and subsystems' ``->suspend()`` and +``->resume()`` or even ``->prepare()`` and ``->complete()`` callbacks are not +suitable for this purpose. + +For example, device drivers may want to upload firmware to their devices after +resume/restore, but they cannot do it by calling :c:func:`request_firmware()` +from their ``->resume()`` or ``->complete()`` callback routines (user land +processes are frozen at these points). The solution may be to load the firmware +into memory before processes are frozen and upload it from there in the +``->resume()`` routine. A suspend/hibernation notifier may be used for that. + +Subsystems or drivers having such needs can register suspend notifiers that +will be called upon the following events by the PM core: + +``PM_HIBERNATION_PREPARE`` + The system is going to hibernate, tasks will be frozen immediately. 
This + is different from ``PM_SUSPEND_PREPARE`` below, because in this case + additional work is done between the notifiers and the invocation of PM + callbacks for the "freeze" transition. + +``PM_POST_HIBERNATION`` + The system memory state has been restored from a hibernation image or an + error occurred during hibernation. Device restore callbacks have been + executed and tasks have been thawed. + +``PM_RESTORE_PREPARE`` + The system is going to restore a hibernation image. If all goes well, + the restored image kernel will issue a ``PM_POST_HIBERNATION`` + notification. + +``PM_POST_RESTORE`` + An error occurred during restore from hibernation. Device restore + callbacks have been executed and tasks have been thawed. + +``PM_SUSPEND_PREPARE`` + The system is preparing for suspend. + +``PM_POST_SUSPEND`` + The system has just resumed or an error occurred during suspend. Device + resume callbacks have been executed and tasks have been thawed. + +It is generally assumed that whatever the notifiers do for +``PM_HIBERNATION_PREPARE``, should be undone for ``PM_POST_HIBERNATION``. +Analogously, operations carried out for ``PM_SUSPEND_PREPARE`` should be +reversed for ``PM_POST_SUSPEND``. + +Moreover, if one of the notifiers fails for the ``PM_HIBERNATION_PREPARE`` or +``PM_SUSPEND_PREPARE`` event, the notifiers that have already succeeded for that +event will be called for ``PM_POST_HIBERNATION`` or ``PM_POST_SUSPEND``, +respectively. + +The hibernation and suspend notifiers are called with :c:data:`pm_mutex` held. +They are defined in the usual way, but their last argument is meaningless (it is +always NULL). + +To register and/or unregister a suspend notifier use +:c:func:`register_pm_notifier()` and :c:func:`unregister_pm_notifier()`, +respectively (both defined in :file:`include/linux/suspend.h`). If you don't +need to unregister the notifier, you can also use the :c:func:`pm_notifier()` +macro defined in :file:`include/linux/suspend.h`. 
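The pairing rule described above (whatever is done for a ``*_PREPARE`` event is undone for the matching ``PM_POST_*`` event) can be sketched as a notifier callback. This is a user-space model only, so that it builds outside the kernel: the ``struct notifier_block`` layout, the event constants and the ``NOTIFY_*`` return codes are stand-ins for the definitions in ``include/linux/notifier.h`` and ``include/linux/suspend.h``, their numeric values are placeholders, and ``fw_cached`` and ``example_pm_notify()`` are hypothetical names.

```c
#include <stddef.h>

/* Stand-ins so the sketch builds in user space. A real driver gets these
 * from <linux/notifier.h> and <linux/suspend.h>; the values are placeholders. */
enum {
    PM_HIBERNATION_PREPARE = 1,
    PM_POST_HIBERNATION,
    PM_SUSPEND_PREPARE,
    PM_POST_SUSPEND,
    PM_RESTORE_PREPARE,
    PM_POST_RESTORE,
};
enum { NOTIFY_DONE = 0, NOTIFY_OK = 1 };

struct notifier_block {
    int (*notifier_call)(struct notifier_block *nb,
                         unsigned long event, void *unused);
};

/* Hypothetical driver state: is a firmware image cached in memory? */
int fw_cached;

/* The third argument is ignored: the text above says it is always NULL. */
int example_pm_notify(struct notifier_block *nb,
                      unsigned long event, void *unused)
{
    (void)nb;
    (void)unused;

    switch (event) {
    case PM_HIBERNATION_PREPARE:
    case PM_SUSPEND_PREPARE:
        fw_cached = 1;  /* cache firmware before tasks are frozen */
        return NOTIFY_OK;
    case PM_POST_HIBERNATION:
    case PM_POST_SUSPEND:
        fw_cached = 0;  /* undo what was done for the PREPARE event */
        return NOTIFY_OK;
    default:
        return NOTIFY_DONE;  /* events this sketch does not handle */
    }
}

struct notifier_block example_pm_nb = {
    .notifier_call = example_pm_notify,
};
```

In a driver, a block like ``example_pm_nb`` would be passed to :c:func:`register_pm_notifier()` at initialization and to :c:func:`unregister_pm_notifier()` on teardown.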
diff --git a/Documentation/driver-api/pm/types.rst b/Documentation/driver-api/pm/types.rst new file mode 100644 index 000000000000..3ebdecc54104 --- /dev/null +++ b/Documentation/driver-api/pm/types.rst @@ -0,0 +1,5 @@ +================================== +Device Power Management Data Types +================================== + +.. kernel-doc:: include/linux/pm.h diff --git a/Documentation/driver-api/regulator.rst b/Documentation/driver-api/regulator.rst new file mode 100644 index 000000000000..520da0a5251d --- /dev/null +++ b/Documentation/driver-api/regulator.rst @@ -0,0 +1,170 @@ +.. Copyright 2007-2008 Wolfson Microelectronics + +.. This documentation is free software; you can redistribute +.. it and/or modify it under the terms of the GNU General Public +.. License version 2 as published by the Free Software Foundation. + +================================= +Voltage and current regulator API +================================= + +:Author: Liam Girdwood +:Author: Mark Brown + +Introduction +============ + +This framework is designed to provide a standard kernel interface to +control voltage and current regulators. + +The intention is to allow systems to dynamically control regulator power +output in order to save power and prolong battery life. This applies to +both voltage regulators (where voltage output is controllable) and +current sinks (where current limit is controllable). + +Note that additional (and currently more complete) documentation is +available in the Linux kernel source under +``Documentation/power/regulator``. + +Glossary +-------- + +The regulator API uses a number of terms which may not be familiar: + +Regulator + + Electronic device that supplies power to other devices. Most regulators + can enable and disable their output and some can also control their + output voltage or current. + +Consumer + + Electronic device which consumes power provided by a regulator. 
These + may either be static, requiring only a fixed supply, or dynamic, + requiring active management of the regulator at runtime. + +Power Domain + + The electronic circuit supplied by a given regulator, including the + regulator and all consumer devices. The configuration of the regulator + is shared between all the components in the circuit. + +Power Management Integrated Circuit (PMIC) + + An IC which contains numerous regulators and often also other + subsystems. In an embedded system the primary PMIC is often equivalent + to a combination of the PSU and southbridge in a desktop system. + +Consumer driver interface +========================= + +This offers a similar API to the kernel clock framework. Consumer +drivers use `get <#API-regulator-get>`__ and +`put <#API-regulator-put>`__ operations to acquire and release +regulators. Functions are provided to `enable <#API-regulator-enable>`__ +and `disable <#API-regulator-disable>`__ the regulator and to get and +set the runtime parameters of the regulator. + +When requesting regulators consumers use symbolic names for their +supplies, such as "Vcc", which are mapped into actual regulator devices +by the machine interface. + +A stub version of this API is provided when the regulator framework is +not in use in order to minimise the need to use ifdefs. + +Enabling and disabling +---------------------- + +The regulator API provides reference counted enabling and disabling of +regulators. Consumer devices use the :c:func:`regulator_enable()` and +:c:func:`regulator_disable()` functions to enable and disable +regulators. Calls to the two functions must be balanced. + +Note that since multiple consumers may be using a regulator and machine +constraints may not allow the regulator to be disabled there is no +guarantee that calling :c:func:`regulator_disable()` will actually +cause the supply provided by the regulator to be disabled. Consumer +drivers should assume that the regulator may be enabled at all times. 
+ +Configuration +------------- + +Some consumer devices may need to be able to dynamically configure their +supplies. For example, MMC drivers may need to select the correct +operating voltage for their cards. This may be done while the regulator +is enabled or disabled. + +The :c:func:`regulator_set_voltage()` and +:c:func:`regulator_set_current_limit()` functions provide the primary +interface for this. Both take ranges of voltages and currents, supporting +drivers that do not require a specific value (eg, CPU frequency scaling +normally permits the CPU to use a wider range of supply voltages at lower +frequencies but does not require that the supply voltage be lowered). Where +an exact value is required both minimum and maximum values should be +identical. + +Callbacks +--------- + +Callbacks may also be registered for events such as regulation failures. + +Regulator driver interface +========================== + +Drivers for regulator chips register the regulators with the regulator +core, providing operations structures to the core. A notifier interface +allows error conditions to be reported to the core. + +Registration should be triggered by explicit setup done by the platform, +supplying a struct :c:type:`regulator_init_data` for the regulator +containing constraint and supply information. + +Machine interface +================= + +This interface provides a way to define how regulators are connected to +consumers on a given system and what the valid operating parameters are +for the system. + +Supplies +-------- + +Regulator supplies are specified using struct +:c:type:`regulator_consumer_supply`. This is done at driver registration +time as part of the machine constraints. + +Constraints +----------- + +As well as defining the connections the machine interface also provides +constraints defining the operations that clients are allowed to perform +and the parameters that may be set. 
This is required since generally +regulator devices will offer more flexibility than it is safe to use on +a given system, for example supporting higher supply voltages than the +consumers are rated for. + +This is done at driver registration time` by providing a +struct :c:type:`regulation_constraints`. + +The constraints may also specify an initial configuration for the +regulator in the constraints, which is particularly useful for use with +static consumers. + +API reference +============= + +Due to limitations of the kernel documentation framework and the +existing layout of the source code the entire regulator API is +documented here. + +.. kernel-doc:: include/linux/regulator/consumer.h + :internal: + +.. kernel-doc:: include/linux/regulator/machine.h + :internal: + +.. kernel-doc:: include/linux/regulator/driver.h + :internal: + +.. kernel-doc:: drivers/regulator/core.c + :export: diff --git a/Documentation/hwmon/ds1621 b/Documentation/hwmon/ds1621 index f775e612f582..fa3407997795 100644 --- a/Documentation/hwmon/ds1621 +++ b/Documentation/hwmon/ds1621 @@ -117,10 +117,10 @@ support, which is achieved via the R0 and R1 config register bits, where: R0..R1 ------ - 0 0 => 9 bits, 0.5 degrees Celcius - 1 0 => 10 bits, 0.25 degrees Celcius - 0 1 => 11 bits, 0.125 degrees Celcius - 1 1 => 12 bits, 0.0625 degrees Celcius + 0 0 => 9 bits, 0.5 degrees Celsius + 1 0 => 10 bits, 0.25 degrees Celsius + 0 1 => 11 bits, 0.125 degrees Celsius + 1 1 => 12 bits, 0.0625 degrees Celsius Note: At initial device power-on, the default resolution is set to 12-bits. diff --git a/Documentation/index.rst b/Documentation/index.rst index cb5d77699c60..f6e641a54bbc 100644 --- a/Documentation/index.rst +++ b/Documentation/index.rst @@ -47,7 +47,7 @@ These books get into the details of how specific kernel subsystems work from the point of view of a kernel developer. 
Much of the information here is taken directly from the kernel source, with supplemental material added as needed (or at least as we managed to add it — probably *not* all that is -needed). +needed). .. toctree:: :maxdepth: 2 @@ -68,6 +68,14 @@ Korean translations translations/ko_KR/index +Chinese translations +-------------------- + +.. toctree:: + :maxdepth: 1 + + translations/zh_CN/index + Indices and tables ================== diff --git a/Documentation/input/input.txt b/Documentation/input/input.txt index 0acfddbe2028..7ebce100fe90 100644 --- a/Documentation/input/input.txt +++ b/Documentation/input/input.txt @@ -279,10 +279,10 @@ struct input_event { 'time' is the timestamp, it returns the time at which the event happened. Type is for example EV_REL for relative moment, EV_KEY for a keypress or -release. More types are defined in include/linux/input.h. +release. More types are defined in include/uapi/linux/input-event-codes.h. 'code' is event code, for example REL_X or KEY_BACKSPACE, again a complete -list is in include/linux/input.h. +list is in include/uapi/linux/input-event-codes.h. 'value' is the value the event carries. Either a relative change for EV_REL, absolute new value for EV_ABS (joysticks ...), or 0 for EV_KEY for diff --git a/Documentation/ioctl/botching-up-ioctls.txt b/Documentation/ioctl/botching-up-ioctls.txt index 36138c632f7a..d02cfb48901c 100644 --- a/Documentation/ioctl/botching-up-ioctls.txt +++ b/Documentation/ioctl/botching-up-ioctls.txt @@ -24,7 +24,7 @@ Prerequisites ------------- First the prerequisites. Without these you have already failed, because you -will need to add a a 32-bit compat layer: +will need to add a 32-bit compat layer: * Only use fixed sized integers. To avoid conflicts with typedefs in userspace the kernel has special types like __u32, __s64. Use them. 
diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt index 7f04e13ec53d..9d2096c7160d 100644 --- a/Documentation/livepatch/livepatch.txt +++ b/Documentation/livepatch/livepatch.txt @@ -358,7 +358,7 @@ The current Livepatch implementation has several limitations: Each function has to handle TOC and save LR before it could call the ftrace handler. This operation has to be reverted on return. Fortunately, the generic ftrace code has the same problem and all - this is is handled on the ftrace level. + this is handled on the ftrace level. + Kretprobes using the ftrace framework conflict with the patched diff --git a/Documentation/media/Makefile b/Documentation/media/Makefile index 32663602ff25..9b3e70b2cab2 100644 --- a/Documentation/media/Makefile +++ b/Documentation/media/Makefile @@ -36,7 +36,7 @@ quiet_cmd_genpdf = GENPDF $2 cmd_genpdf = convert $2 $3 quiet_cmd_gendot = DOT $2 - cmd_gendot = dot -Tsvg $2 > $3 + cmd_gendot = dot -Tsvg $2 > $3 || { rm -f $3; exit 1; } %.pdf: %.svg @$(call cmd,genpdf,$<,$@) @@ -103,6 +103,7 @@ html: all epub: all xml: all latex: $(IMGPDF) all +linkcheck: clean: -rm -f $(DOTTGT) $(IMGTGT) ${TARGETS} 2>/dev/null diff --git a/Documentation/networking/kcm.txt b/Documentation/networking/kcm.txt index 3476ede5bc2c..9a513295b07c 100644 --- a/Documentation/networking/kcm.txt +++ b/Documentation/networking/kcm.txt @@ -272,7 +272,7 @@ on the socket thus waking up the application thread. When the application sees the error (which may just be a disconnect) it should unattach the socket from KCM and then close it. It is assumed that once an error is posted on the TCP socket the data stream is unrecoverable (i.e. an error -may have occurred in in the middle of receiving a messssge). +may have occurred in the middle of receiving a messssge). 
TCP connection monitoring ------------------------- diff --git a/Documentation/power/00-INDEX b/Documentation/power/00-INDEX index 7cb6085839f3..7f3c2def2cac 100644 --- a/Documentation/power/00-INDEX +++ b/Documentation/power/00-INDEX @@ -14,8 +14,6 @@ freezing-of-tasks.txt - How processes and controlled during suspend interface.txt - Power management user interface in /sys/power -notifiers.txt - - Registering suspend notifiers in device drivers opp.txt - Operating Performance Point library pci.txt diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt deleted file mode 100644 index 73ddea39a9ce..000000000000 --- a/Documentation/power/devices.txt +++ /dev/null @@ -1,716 +0,0 @@ -Device Power Management - -Copyright (c) 2010-2011 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. -Copyright (c) 2010 Alan Stern <stern@rowland.harvard.edu> -Copyright (c) 2014 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com> - - -Most of the code in Linux is device drivers, so most of the Linux power -management (PM) code is also driver-specific. Most drivers will do very -little; others, especially for platforms with small batteries (like cell -phones), will do a lot. - -This writeup gives an overview of how drivers interact with system-wide -power management goals, emphasizing the models and interfaces that are -shared by everything that hooks up to the driver model core. Read it as -background for the domain-specific work you'd do with any specific driver. - - -Two Models for Device Power Management -====================================== -Drivers will use one or both of these models to put devices into low-power -states: - - System Sleep model: - Drivers can enter low-power states as part of entering system-wide - low-power states like "suspend" (also known as "suspend-to-RAM"), or - (mostly for systems with disks) "hibernation" (also known as - "suspend-to-disk"). 
- - This is something that device, bus, and class drivers collaborate on - by implementing various role-specific suspend and resume methods to - cleanly power down hardware and software subsystems, then reactivate - them without loss of data. - - Some drivers can manage hardware wakeup events, which make the system - leave the low-power state. This feature may be enabled or disabled - using the relevant /sys/devices/.../power/wakeup file (for Ethernet - drivers the ioctl interface used by ethtool may also be used for this - purpose); enabling it may cost some power usage, but let the whole - system enter low-power states more often. - - Runtime Power Management model: - Devices may also be put into low-power states while the system is - running, independently of other power management activity in principle. - However, devices are not generally independent of each other (for - example, a parent device cannot be suspended unless all of its child - devices have been suspended). Moreover, depending on the bus type the - device is on, it may be necessary to carry out some bus-specific - operations on the device for this purpose. Devices put into low power - states at run time may require special handling during system-wide power - transitions (suspend or hibernation). - - For these reasons not only the device driver itself, but also the - appropriate subsystem (bus type, device type or device class) driver and - the PM core are involved in runtime power management. As in the system - sleep power management case, they need to collaborate by implementing - various role-specific suspend and resume methods, so that the hardware - is cleanly powered down and reactivated without data or service loss. - -There's not a lot to be said about those low-power states except that they are -very system-specific, and often device-specific. 
Also, that if enough devices -have been put into low-power states (at runtime), the effect may be very similar -to entering some system-wide low-power state (system sleep) ... and that -synergies exist, so that several drivers using runtime PM might put the system -into a state where even deeper power saving options are available. - -Most suspended devices will have quiesced all I/O: no more DMA or IRQs (except -for wakeup events), no more data read or written, and requests from upstream -drivers are no longer accepted. A given bus or platform may have different -requirements though. - -Examples of hardware wakeup events include an alarm from a real time clock, -network wake-on-LAN packets, keyboard or mouse activity, and media insertion -or removal (for PCMCIA, MMC/SD, USB, and so on). - - -Interfaces for Entering System Sleep States -=========================================== -There are programming interfaces provided for subsystems (bus type, device type, -device class) and device drivers to allow them to participate in the power -management of devices they are concerned with. These interfaces cover both -system sleep and runtime power management. 
- - -Device Power Management Operations ----------------------------------- -Device power management operations, at the subsystem level as well as at the -device driver level, are implemented by defining and populating objects of type -struct dev_pm_ops: - -struct dev_pm_ops { - int (*prepare)(struct device *dev); - void (*complete)(struct device *dev); - int (*suspend)(struct device *dev); - int (*resume)(struct device *dev); - int (*freeze)(struct device *dev); - int (*thaw)(struct device *dev); - int (*poweroff)(struct device *dev); - int (*restore)(struct device *dev); - int (*suspend_late)(struct device *dev); - int (*resume_early)(struct device *dev); - int (*freeze_late)(struct device *dev); - int (*thaw_early)(struct device *dev); - int (*poweroff_late)(struct device *dev); - int (*restore_early)(struct device *dev); - int (*suspend_noirq)(struct device *dev); - int (*resume_noirq)(struct device *dev); - int (*freeze_noirq)(struct device *dev); - int (*thaw_noirq)(struct device *dev); - int (*poweroff_noirq)(struct device *dev); - int (*restore_noirq)(struct device *dev); - int (*runtime_suspend)(struct device *dev); - int (*runtime_resume)(struct device *dev); - int (*runtime_idle)(struct device *dev); -}; - -This structure is defined in include/linux/pm.h and the methods included in it -are also described in that file. Their roles will be explained in what follows. -For now, it should be sufficient to remember that the last three methods are -specific to runtime power management while the remaining ones are used during -system-wide power transitions. - -There also is a deprecated "old" or "legacy" interface for power management -operations available at least for some subsystems. This approach does not use -struct dev_pm_ops objects and it is suitable only for implementing system sleep -power management methods. Therefore it is not described in this document, so -please refer directly to the source code for more information about it. 
- - -Subsystem-Level Methods ------------------------ -The core methods to suspend and resume devices reside in struct dev_pm_ops -pointed to by the ops member of struct dev_pm_domain, or by the pm member of -struct bus_type, struct device_type and struct class. They are mostly of -interest to the people writing infrastructure for platforms and buses, like PCI -or USB, or device type and device class drivers. They also are relevant to the -writers of device drivers whose subsystems (PM domains, device types, device -classes and bus types) don't provide all power management methods. - -Bus drivers implement these methods as appropriate for the hardware and the -drivers using it; PCI works differently from USB, and so on. Not many people -write subsystem-level drivers; most driver code is a "device driver" that builds -on top of bus-specific framework code. - -For more information on these driver calls, see the description later; -they are called in phases for every device, respecting the parent-child -sequencing in the driver model tree. - - -/sys/devices/.../power/wakeup files ------------------------------------ -All device objects in the driver model contain fields that control the handling -of system wakeup events (hardware signals that can force the system out of a -sleep state). These fields are initialized by bus or device driver code using -device_set_wakeup_capable() and device_set_wakeup_enable(), defined in -include/linux/pm_wakeup.h. - -The "power.can_wakeup" flag just records whether the device (and its driver) can -physically support wakeup events. The device_set_wakeup_capable() routine -affects this flag. The "power.wakeup" field is a pointer to an object of type -struct wakeup_source used for controlling whether or not the device should use -its system wakeup mechanism and for notifying the PM core of system wakeup -events signaled by the device. This object is only present for wakeup-capable -devices (i.e. 
devices whose "can_wakeup" flags are set) and is created (or -removed) by device_set_wakeup_capable(). - -Whether or not a device is capable of issuing wakeup events is a hardware -matter, and the kernel is responsible for keeping track of it. By contrast, -whether or not a wakeup-capable device should issue wakeup events is a policy -decision, and it is managed by user space through a sysfs attribute: the -"power/wakeup" file. User space can write the strings "enabled" or "disabled" -to it to indicate whether or not, respectively, the device is supposed to signal -system wakeup. This file is only present if the "power.wakeup" object exists -for the given device and is created (or removed) along with that object, by -device_set_wakeup_capable(). Reads from the file will return the corresponding -string. - -The "power/wakeup" file is supposed to contain the "disabled" string initially -for the majority of devices; the major exceptions are power buttons, keyboards, -and Ethernet adapters whose WoL (wake-on-LAN) feature has been set up with -ethtool. It should also default to "enabled" for devices that don't generate -wakeup requests on their own but merely forward wakeup requests from one bus to -another (like PCI Express ports). - -The device_may_wakeup() routine returns true only if the "power.wakeup" object -exists and the corresponding "power/wakeup" file contains the string "enabled". -This information is used by subsystems, like the PCI bus type code, to see -whether or not to enable the devices' wakeup mechanisms. If device wakeup -mechanisms are enabled or disabled directly by drivers, they also should use -device_may_wakeup() to decide what to do during a system sleep transition. -Device drivers, however, are not supposed to call device_set_wakeup_enable() -directly in any case. 
- -It ought to be noted that system wakeup is conceptually different from "remote -wakeup" used by runtime power management, although it may be supported by the -same physical mechanism. Remote wakeup is a feature allowing devices in -low-power states to trigger specific interrupts to signal conditions in which -they should be put into the full-power state. Those interrupts may or may not -be used to signal system wakeup events, depending on the hardware design. On -some systems it is impossible to trigger them from system sleep states. In any -case, remote wakeup should always be enabled for runtime power management for -all devices and drivers that support it. - -/sys/devices/.../power/control files ------------------------------------- -Each device in the driver model has a flag to control whether it is subject to -runtime power management. This flag, called runtime_auto, is initialized by the -bus type (or generally subsystem) code using pm_runtime_allow() or -pm_runtime_forbid(); the default is to allow runtime power management. - -The setting can be adjusted by user space by writing either "on" or "auto" to -the device's power/control sysfs file. Writing "auto" calls pm_runtime_allow(), -setting the flag and allowing the device to be runtime power-managed by its -driver. Writing "on" calls pm_runtime_forbid(), clearing the flag, returning -the device to full power if it was in a low-power state, and preventing the -device from being runtime power-managed. User space can check the current value -of the runtime_auto flag by reading the file. - -The device's runtime_auto flag has no effect on the handling of system-wide -power transitions. In particular, the device can (and in the majority of cases -should and will) be put into a low-power state during a system-wide transition -to a sleep state even though its runtime_auto flag is clear. - -For more information about the runtime power management framework, refer to -Documentation/power/runtime_pm.txt. 
- - -Calling Drivers to Enter and Leave System Sleep States -====================================================== -When the system goes into a sleep state, each device's driver is asked to -suspend the device by putting it into a state compatible with the target -system state. That's usually some version of "off", but the details are -system-specific. Also, wakeup-enabled devices will usually stay partly -functional in order to wake the system. - -When the system leaves that low-power state, the device's driver is asked to -resume it by returning it to full power. The suspend and resume operations -always go together, and both are multi-phase operations. - -For simple drivers, suspend might quiesce the device using class code -and then turn its hardware as "off" as possible during suspend_noirq. The -matching resume calls would then completely reinitialize the hardware -before reactivating its class I/O queues. - -More power-aware drivers might prepare the devices for triggering system wakeup -events. - - -Call Sequence Guarantees ------------------------- -To ensure that bridges and similar links needing to talk to a device are -available when the device is suspended or resumed, the device tree is -walked in a bottom-up order to suspend devices. A top-down order is -used to resume those devices. - -The ordering of the device tree is defined by the order in which devices -get registered: a child can never be registered, probed or resumed before -its parent; and can't be removed or suspended after that parent. - -The policy is that the device tree should match hardware bus topology. -(Or at least the control bus, for devices which use multiple busses.) -In particular, this means that a device registration may fail if the parent of -the device is suspending (i.e. has been chosen by the PM core as the next -device to suspend) or has already suspended, as well as after all of the other -devices have been suspended. 
Device drivers must be prepared to cope with such -situations. - - -System Power Management Phases ------------------------------- -Suspending or resuming the system is done in several phases. Different phases -are used for freeze, standby, and memory sleep states ("suspend-to-RAM") and the -hibernation state ("suspend-to-disk"). Each phase involves executing callbacks -for every device before the next phase begins. Not all busses or classes -support all these callbacks and not all drivers use all the callbacks. The -various phases always run after tasks have been frozen and before they are -unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have -been disabled (except for those marked with the IRQF_NO_SUSPEND flag). - -All phases use PM domain, bus, type, class or driver callbacks (that is, methods -defined in dev->pm_domain->ops, dev->bus->pm, dev->type->pm, dev->class->pm or -dev->driver->pm). These callbacks are regarded by the PM core as mutually -exclusive. Moreover, PM domain callbacks always take precedence over all of the -other callbacks and, for example, type callbacks take precedence over bus, class -and driver callbacks. To be precise, the following rules are used to determine -which callback to execute in the given phase: - - 1. If dev->pm_domain is present, the PM core will choose the callback - included in dev->pm_domain->ops for execution - - 2. Otherwise, if both dev->type and dev->type->pm are present, the callback - included in dev->type->pm will be chosen for execution. - - 3. Otherwise, if both dev->class and dev->class->pm are present, the - callback included in dev->class->pm will be chosen for execution. - - 4. Otherwise, if both dev->bus and dev->bus->pm are present, the callback - included in dev->bus->pm will be chosen for execution. - -This allows PM domains and device types to override callbacks provided by bus -types or device classes if necessary. 
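The four selection rules above, together with the driver fallback described in the surrounding text, can be modelled in a few lines. This is a user-space sketch under stated assumptions, not the PM core's code: the structures are cut-down stand-ins that carry only a ``suspend`` callback, and ``pick_suspend_cb()`` and the three sample callbacks are hypothetical names.

```c
#include <stddef.h>

struct device;
typedef int (*pm_cb)(struct device *dev);

/* Stand-ins: the real struct dev_pm_ops has one member per phase. */
struct dev_pm_ops    { pm_cb suspend; };
struct dev_pm_domain { struct dev_pm_ops ops; };
struct device_type   { const struct dev_pm_ops *pm; };
struct class         { const struct dev_pm_ops *pm; };
struct bus_type      { const struct dev_pm_ops *pm; };
struct device_driver { const struct dev_pm_ops *pm; };

struct device {
    struct dev_pm_domain *pm_domain;
    const struct device_type *type;
    const struct class *class;
    const struct bus_type *bus;
    const struct device_driver *driver;
};

/* Hypothetical callbacks, used only to tell the sources apart. */
int type_suspend(struct device *dev) { (void)dev; return 0; }
int bus_suspend(struct device *dev)  { (void)dev; return 0; }
int drv_suspend(struct device *dev)  { (void)dev; return 0; }

/* Pick the callback for one phase (suspend) following rules 1 to 4. */
pm_cb pick_suspend_cb(const struct device *dev)
{
    pm_cb cb = NULL;

    if (dev->pm_domain)
        cb = dev->pm_domain->ops.suspend;   /* rule 1: PM domain */
    else if (dev->type && dev->type->pm)
        cb = dev->type->pm->suspend;        /* rule 2: device type */
    else if (dev->class && dev->class->pm)
        cb = dev->class->pm->suspend;       /* rule 3: device class */
    else if (dev->bus && dev->bus->pm)
        cb = dev->bus->pm->suspend;         /* rule 4: bus type */

    /* Fall back to the driver if the chosen subsystem has no callback. */
    if (!cb && dev->driver && dev->driver->pm)
        cb = dev->driver->pm->suspend;

    return cb;
}
```

The ``else if`` chain is what makes the subsystem callbacks mutually exclusive: once a higher-precedence source is present, lower ones are not consulted, even if the chosen source turns out to have no callback for the phase.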
- -The PM domain, type, class and bus callbacks may in turn invoke device- or -driver-specific methods stored in dev->driver->pm, but they don't have to do -that. - -If the subsystem callback chosen for execution is not present, the PM core will -execute the corresponding method from dev->driver->pm instead if there is one. - - -Entering System Suspend ------------------------ -When the system goes into the freeze, standby or memory sleep state, -the phases are: - - prepare, suspend, suspend_late, suspend_noirq. - - 1. The prepare phase is meant to prevent races by preventing new devices - from being registered; the PM core would never know that all the - children of a device had been suspended if new children could be - registered at will. (By contrast, devices may be unregistered at any - time.) Unlike the other suspend-related phases, during the prepare - phase the device tree is traversed top-down. - - After the prepare callback method returns, no new children may be - registered below the device. The method may also prepare the device or - driver in some way for the upcoming system power transition, but it - should not put the device into a low-power state. - - For devices supporting runtime power management, the return value of the - prepare callback can be used to indicate to the PM core that it may - safely leave the device in runtime suspend (if runtime-suspended - already), provided that all of the device's descendants are also left in - runtime suspend. Namely, if the prepare callback returns a positive - number and that happens for all of the descendants of the device too, - and all of them (including the device itself) are runtime-suspended, the - PM core will skip the suspend, suspend_late and suspend_noirq suspend - phases as well as the resume_noirq, resume_early and resume phases of - the following system resume for all of these devices. 
In that case, - the complete callback will be called directly after the prepare callback - and is entirely responsible for bringing the device back to the - functional state as appropriate. - - Note that this direct-complete procedure applies even if the device is - disabled for runtime PM; only the runtime-PM status matters. It follows - that if a device has system-sleep callbacks but does not support runtime - PM, then its prepare callback must never return a positive value. This - is because all devices are initially set to runtime-suspended with - runtime PM disabled. - - 2. The suspend methods should quiesce the device to stop it from performing - I/O. They also may save the device registers and put it into the - appropriate low-power state, depending on the bus type the device is on, - and they may enable wakeup events. - - 3 For a number of devices it is convenient to split suspend into the - "quiesce device" and "save device state" phases, in which cases - suspend_late is meant to do the latter. It is always executed after - runtime power management has been disabled for all devices. - - 4. The suspend_noirq phase occurs after IRQ handlers have been disabled, - which means that the driver's interrupt handler will not be called while - the callback method is running. The methods should save the values of - the device's registers that weren't saved previously and finally put the - device into the appropriate low-power state. - - The majority of subsystems and device drivers need not implement this - callback. However, bus types allowing devices to share interrupt - vectors, like PCI, generally need it; otherwise a driver might encounter - an error during the suspend phase by fielding a shared interrupt - generated by some other device after its own device had been set to low - power. 
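The direct-complete condition from step 1 above can be written out as a small recursive check. This is a simplified user-space model of the rule as the text states it, not the PM core's implementation: ``struct model_dev`` and ``may_direct_complete()`` are hypothetical, and the model ignores everything except the prepare return value and the runtime-PM status.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device node; not the kernel's struct device. */
struct model_dev {
    int prepare_ret;             /* value returned by the prepare callback */
    bool runtime_suspended;      /* runtime-PM status when suspend starts */
    struct model_dev **children; /* direct children in the device tree */
    size_t nr_children;
};

/*
 * A device may be left in runtime suspend only if its prepare callback
 * returned a positive number, it is runtime-suspended, and the same
 * holds for every descendant.
 */
bool may_direct_complete(const struct model_dev *dev)
{
    if (dev->prepare_ret <= 0 || !dev->runtime_suspended)
        return false;

    for (size_t i = 0; i < dev->nr_children; i++)
        if (!may_direct_complete(dev->children[i]))
            return false;

    return true;
}
```

Note that a single descendant that is active, or whose prepare callback returned zero, is enough to force the whole chain of ancestors through the regular suspend and resume phases.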
- -At the end of these phases, drivers should have stopped all I/O transactions -(DMA, IRQs), saved enough state that they can re-initialize or restore previous -state (as needed by the hardware), and placed the device into a low-power state. -On many platforms they will gate off one or more clock sources; sometimes they -will also switch off power supplies or reduce voltages. (Drivers supporting -runtime PM may already have performed some or all of these steps.) - -If device_may_wakeup(dev) returns true, the device should be prepared for -generating hardware wakeup signals to trigger a system wakeup event when the -system is in the sleep state. For example, enable_irq_wake() might identify -GPIO signals hooked up to a switch or other external hardware, and -pci_enable_wake() does something similar for the PCI PME signal. - -If any of these callbacks returns an error, the system won't enter the desired -low-power state. Instead the PM core will unwind its actions by resuming all -the devices that were suspended. - - -Leaving System Suspend ----------------------- -When resuming from freeze, standby or memory sleep, the phases are: - - resume_noirq, resume_early, resume, complete. - - 1. The resume_noirq callback methods should perform any actions needed - before the driver's interrupt handlers are invoked. This generally - means undoing the actions of the suspend_noirq phase. If the bus type - permits devices to share interrupt vectors, like PCI, the method should - bring the device and its driver into a state in which the driver can - recognize if the device is the source of incoming interrupts, if any, - and handle them correctly. - - For example, the PCI bus type's ->pm.resume_noirq() puts the device into - the full-power state (D0 in the PCI terminology) and restores the - standard configuration registers of the device. Then it calls the - device driver's ->pm.resume_noirq() method to perform device-specific - actions. - - 2. 
The resume_early methods should prepare devices for the execution of - the resume methods. This generally involves undoing the actions of the - preceding suspend_late phase. - - 3 The resume methods should bring the device back to its operating - state, so that it can perform normal I/O. This generally involves - undoing the actions of the suspend phase. - - 4. The complete phase should undo the actions of the prepare phase. Note, - however, that new children may be registered below the device as soon as - the resume callbacks occur; it's not necessary to wait until the - complete phase. - - Moreover, if the preceding prepare callback returned a positive number, - the device may have been left in runtime suspend throughout the whole - system suspend and resume (the suspend, suspend_late, suspend_noirq - phases of system suspend and the resume_noirq, resume_early, resume - phases of system resume may have been skipped for it). In that case, - the complete callback is entirely responsible for bringing the device - back to the functional state after system suspend if necessary. [For - example, it may need to queue up a runtime resume request for the device - for this purpose.] To check if that is the case, the complete callback - can consult the device's power.direct_complete flag. Namely, if that - flag is set when the complete callback is being run, it has been called - directly after the preceding prepare and special action may be required - to make the device work correctly afterward. - -At the end of these phases, drivers should be as functional as they were before -suspending: I/O can be performed using DMA and IRQs, and the relevant clocks are -gated on. - -However, the details here may again be platform-specific. For example, -some systems support multiple "run" states, and the mode in effect at -the end of resume might not be the one which preceded suspension. 
-That means availability of certain clocks or power supplies changed, -which could easily affect how a driver works. - -Drivers need to be able to handle hardware which has been reset since the -suspend methods were called, for example by complete reinitialization. -This may be the hardest part, and the one most protected by NDA'd documents -and chip errata. It's simplest if the hardware state hasn't changed since -the suspend was carried out, but that can't be guaranteed (in fact, it usually -is not the case). - -Drivers must also be prepared to notice that the device has been removed -while the system was powered down, whenever that's physically possible. -PCMCIA, MMC, USB, Firewire, SCSI, and even IDE are common examples of busses -where common Linux platforms will see such removal. Details of how drivers -will notice and handle such removals are currently bus-specific, and often -involve a separate thread. - -These callbacks may return an error value, but the PM core will ignore such -errors since there's nothing it can do about them other than printing them in -the system log. - - -Entering Hibernation --------------------- -Hibernating the system is more complicated than putting it into the other -sleep states, because it involves creating and saving a system image. -Therefore there are more phases for hibernation, with a different set of -callbacks. These phases always run after tasks have been frozen and memory has -been freed. - -The general procedure for hibernation is to quiesce all devices (freeze), create -an image of the system memory while everything is stable, reactivate all -devices (thaw), write the image to permanent storage, and finally shut down the -system (poweroff). The phases used to accomplish this are: - - prepare, freeze, freeze_late, freeze_noirq, thaw_noirq, thaw_early, - thaw, complete, prepare, poweroff, poweroff_late, poweroff_noirq - - 1. The prepare phase is discussed in the "Entering System Suspend" section - above. - - 2. 
The freeze methods should quiesce the device so that it doesn't generate - IRQs or DMA, and they may need to save the values of device registers. - However the device does not have to be put in a low-power state, and to - save time it's best not to do so. Also, the device should not be - prepared to generate wakeup events. - - 3. The freeze_late phase is analogous to the suspend_late phase described - above, except that the device should not be put in a low-power state and - should not be allowed to generate wakeup events by it. - - 4. The freeze_noirq phase is analogous to the suspend_noirq phase discussed - above, except again that the device should not be put in a low-power - state and should not be allowed to generate wakeup events. - -At this point the system image is created. All devices should be inactive and -the contents of memory should remain undisturbed while this happens, so that the -image forms an atomic snapshot of the system state. - - 5. The thaw_noirq phase is analogous to the resume_noirq phase discussed - above. The main difference is that its methods can assume the device is - in the same state as at the end of the freeze_noirq phase. - - 6. The thaw_early phase is analogous to the resume_early phase described - above. Its methods should undo the actions of the preceding - freeze_late, if necessary. - - 7. The thaw phase is analogous to the resume phase discussed above. Its - methods should bring the device back to an operating state, so that it - can be used for saving the image if necessary. - - 8. The complete phase is discussed in the "Leaving System Suspend" section - above. - -At this point the system image is saved, and the devices then need to be -prepared for the upcoming system shutdown. This is much like suspending them -before putting the system into the freeze, standby or memory sleep state, -and the phases are similar. - - 9. The prepare phase is discussed above. - - 10. The poweroff phase is analogous to the suspend phase. 
- - 11. The poweroff_late phase is analogous to the suspend_late phase. - - 12. The poweroff_noirq phase is analogous to the suspend_noirq phase. - -The poweroff, poweroff_late and poweroff_noirq callbacks should do essentially -the same things as the suspend, suspend_late and suspend_noirq callbacks, -respectively. The only notable difference is that they need not store the -device register values, because the registers should already have been stored -during the freeze, freeze_late or freeze_noirq phases. - - -Leaving Hibernation -------------------- -Resuming from hibernation is, again, more complicated than resuming from a sleep -state in which the contents of main memory are preserved, because it requires -a system image to be loaded into memory and the pre-hibernation memory contents -to be restored before control can be passed back to the image kernel. - -Although in principle, the image might be loaded into memory and the -pre-hibernation memory contents restored by the boot loader, in practice this -can't be done because boot loaders aren't smart enough and there is no -established protocol for passing the necessary information. So instead, the -boot loader loads a fresh instance of the kernel, called the boot kernel, into -memory and passes control to it in the usual way. Then the boot kernel reads -the system image, restores the pre-hibernation memory contents, and passes -control to the image kernel. Thus two different kernels are involved in -resuming from hibernation. In fact, the boot kernel may be completely different -from the image kernel: a different configuration and even a different version. -This has important consequences for device drivers and their subsystems. - -To be able to load the system image into memory, the boot kernel needs to -include at least a subset of device drivers allowing it to access the storage -medium containing the image, although it doesn't need to include all of the -drivers present in the image kernel. 
After the image has been loaded, the -devices managed by the boot kernel need to be prepared for passing control back -to the image kernel. This is very similar to the initial steps involved in -creating a system image, and it is accomplished in the same way, using prepare, -freeze, and freeze_noirq phases. However the devices affected by these phases -are only those having drivers in the boot kernel; other devices will still be in -whatever state the boot loader left them. - -Should the restoration of the pre-hibernation memory contents fail, the boot -kernel would go through the "thawing" procedure described above, using the -thaw_noirq, thaw, and complete phases, and then continue running normally. This -happens only rarely. Most often the pre-hibernation memory contents are -restored successfully and control is passed to the image kernel, which then -becomes responsible for bringing the system back to the working state. - -To achieve this, the image kernel must restore the devices' pre-hibernation -functionality. The operation is much like waking up from the memory sleep -state, although it involves different phases: - - restore_noirq, restore_early, restore, complete - - 1. The restore_noirq phase is analogous to the resume_noirq phase. - - 2. The restore_early phase is analogous to the resume_early phase. - - 3. The restore phase is analogous to the resume phase. - - 4. The complete phase is discussed above. - -The main difference from resume[_early|_noirq] is that restore[_early|_noirq] -must assume the device has been accessed and reconfigured by the boot loader or -the boot kernel. Consequently the state of the device may be different from the -state remembered from the freeze, freeze_late and freeze_noirq phases. The -device may even need to be reset and completely re-initialized. In many cases -this difference doesn't matter, so the resume[_early|_noirq] and -restore[_early|_norq] method pointers can be set to the same routines. 
-Nevertheless, different callback pointers are used in case there is a situation -where it actually does matter. - - -Device Power Management Domains -------------------------------- -Sometimes devices share reference clocks or other power resources. In those -cases it generally is not possible to put devices into low-power states -individually. Instead, a set of devices sharing a power resource can be put -into a low-power state together at the same time by turning off the shared -power resource. Of course, they also need to be put into the full-power state -together, by turning the shared power resource on. A set of devices with this -property is often referred to as a power domain. A power domain may also be -nested inside another power domain. The nested domain is referred to as the -sub-domain of the parent domain. - -Support for power domains is provided through the pm_domain field of struct -device. This field is a pointer to an object of type struct dev_pm_domain, -defined in include/linux/pm.h, providing a set of power management callbacks -analogous to the subsystem-level and device driver callbacks that are executed -for the given device during all power transitions, instead of the respective -subsystem-level callbacks. Specifically, if a device's pm_domain pointer is -not NULL, the ->suspend() callback from the object pointed to by it will be -executed instead of its subsystem's (e.g. bus type's) ->suspend() callback and -analogously for all of the remaining callbacks. In other words, power -management domain callbacks, if defined for the given device, always take -precedence over the callbacks provided by the device's subsystem (e.g. bus -type). 
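The precedence rule above can be illustrated with a minimal userspace sketch. It assumes a much simplified picture: struct toy_pm_ops and struct toy_pm_dev are hypothetical stand-ins for struct dev_pm_ops and struct device, only the ->suspend() callback is modelled, and the subsystem level is collapsed into a single pointer. The fallback to the driver's method follows the earlier statement that the PM core uses dev->driver->pm when the chosen callback is not present.

```c
#include <stddef.h>

typedef int (*toy_pm_cb)(void);

/* Hypothetical stand-in for struct dev_pm_ops, reduced to one callback. */
struct toy_pm_ops {
	toy_pm_cb suspend;
};

/* Hypothetical stand-in for struct device. */
struct toy_pm_dev {
	const struct toy_pm_ops *pm_domain; /* NULL if there is no PM domain */
	const struct toy_pm_ops *subsys;    /* subsystem (e.g. bus type) callbacks */
	const struct toy_pm_ops *driver;    /* driver callbacks */
};

/* Pick the ->suspend() callback to run for dev, or NULL if there is none. */
toy_pm_cb toy_pick_suspend(const struct toy_pm_dev *dev)
{
	toy_pm_cb cb = NULL;

	if (dev->pm_domain)
		cb = dev->pm_domain->suspend; /* domain replaces subsystem */
	else if (dev->subsys)
		cb = dev->subsys->suspend;

	/* Fall back to the driver's method if nothing was chosen. */
	if (!cb && dev->driver)
		cb = dev->driver->suspend;

	return cb;
}

/* Sample callbacks that identify themselves only by return value. */
int toy_domain_suspend(void) { return 1; }
int toy_bus_suspend(void)    { return 2; }
int toy_driver_suspend(void) { return 3; }
```

With all three levels populated the domain callback is picked; without a domain the bus callback is picked; and if the chosen level has no ->suspend(), the driver's one is used.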
- -The support for device power management domains is only relevant to platforms -needing to use the same device driver power management callbacks in many -different power domain configurations and wanting to avoid incorporating the -support for power domains into subsystem-level callbacks, for example by -modifying the platform bus type. Other platforms need not implement it or take -it into account in any way. - -Devices may be defined as IRQ-safe which indicates to the PM core that their -runtime PM callbacks may be invoked with disabled interrupts (see -Documentation/power/runtime_pm.txt for more information). If an IRQ-safe -device belongs to a PM domain, the runtime PM of the domain will be -disallowed, unless the domain itself is defined as IRQ-safe. However, it -makes sense to define a PM domain as IRQ-safe only if all the devices in it -are IRQ-safe. Moreover, if an IRQ-safe domain has a parent domain, the runtime -PM of the parent is only allowed if the parent itself is IRQ-safe too with the -additional restriction that all child domains of an IRQ-safe parent must also -be IRQ-safe. - -Device Low Power (suspend) States ---------------------------------- -Device low-power states aren't standard. One device might only handle -"on" and "off", while another might support a dozen different versions of -"on" (how many engines are active?), plus a state that gets back to "on" -faster than from a full "off". - -Some busses define rules about what different suspend states mean. PCI -gives one example: after the suspend sequence completes, a non-legacy -PCI device may not perform DMA or issue IRQs, and any wakeup events it -issues would be issued through the PME# bus signal. Plus, there are -several PCI-standard device states, some of which are optional. 
- -In contrast, integrated system-on-chip processors often use IRQs as the -wakeup event sources (so drivers would call enable_irq_wake) and might -be able to treat DMA completion as a wakeup event (sometimes DMA can stay -active too, it'd only be the CPU and some peripherals that sleep). - -Some details here may be platform-specific. Systems may have devices that -can be fully active in certain sleep states, such as an LCD display that's -refreshed using DMA while most of the system is sleeping lightly ... and -its frame buffer might even be updated by a DSP or other non-Linux CPU while -the Linux control processor stays idle. - -Moreover, the specific actions taken may depend on the target system state. -One target system state might allow a given device to be very operational; -another might require a hard shut down with re-initialization on resume. -And two different target systems might use the same device in different -ways; the aforementioned LCD might be active in one product's "standby", -but a different product using the same SOC might work differently. - - -Power Management Notifiers --------------------------- -There are some operations that cannot be carried out by the power management -callbacks discussed above, because the callbacks occur too late or too early. -To handle these cases, subsystems and device drivers may register power -management notifiers that are called before tasks are frozen and after they have -been thawed. Generally speaking, the PM notifiers are suitable for performing -actions that either require user space to be available, or at least won't -interfere with user space. - -For details refer to Documentation/power/notifiers.txt. - - -Runtime Power Management -======================== -Many devices are able to dynamically power down while the system is still -running. This feature is useful for devices that are not being used, and -can offer significant power savings on a running system. 
These devices -often support a range of runtime power states, which might use names such -as "off", "sleep", "idle", "active", and so on. Those states will in some -cases (like PCI) be partially constrained by the bus the device uses, and will -usually include hardware states that are also used in system sleep states. - -A system-wide power transition can be started while some devices are in low -power states due to runtime power management. The system sleep PM callbacks -should recognize such situations and react to them appropriately, but the -necessary actions are subsystem-specific. - -In some cases the decision may be made at the subsystem level while in other -cases the device driver may be left to decide. In some cases it may be -desirable to leave a suspended device in that state during a system-wide power -transition, but in other cases the device must be put back into the full-power -state temporarily, for example so that its system wakeup capability can be -disabled. This all depends on the hardware and the design of the subsystem and -device driver in question. - -During system-wide resume from a sleep state it's easiest to put devices into -the full-power state, as explained in Documentation/power/runtime_pm.txt. Refer -to that document for more information regarding this particular issue as well as -for information on the device runtime power management framework in general. diff --git a/Documentation/power/freezing-of-tasks.txt b/Documentation/power/freezing-of-tasks.txt index 85894d83b352..af005770e767 100644 --- a/Documentation/power/freezing-of-tasks.txt +++ b/Documentation/power/freezing-of-tasks.txt @@ -197,7 +197,8 @@ tasks, since it generally exists anyway. A driver must have all firmwares it may need in RAM before suspend() is called. If keeping them is not practical, for example due to their size, they must be -requested early enough using the suspend notifier API described in notifiers.txt. 
+requested early enough using the suspend notifier API described in +Documentation/driver-api/pm/notifiers.rst. VI. Are there any precautions to be taken to prevent freezing failures? diff --git a/Documentation/power/notifiers.txt b/Documentation/power/notifiers.txt deleted file mode 100644 index a81fa254303d..000000000000 --- a/Documentation/power/notifiers.txt +++ /dev/null @@ -1,55 +0,0 @@ -Suspend notifiers - (C) 2007-2011 Rafael J. Wysocki <rjw@sisk.pl>, GPL - -There are some operations that subsystems or drivers may want to carry out -before hibernation/suspend or after restore/resume, but they require the system -to be fully functional, so the drivers' and subsystems' .suspend() and .resume() -or even .prepare() and .complete() callbacks are not suitable for this purpose. -For example, device drivers may want to upload firmware to their devices after -resume/restore, but they cannot do it by calling request_firmware() from their -.resume() or .complete() routines (user land processes are frozen at these -points). The solution may be to load the firmware into memory before processes -are frozen and upload it from there in the .resume() routine. -A suspend/hibernation notifier may be used for this purpose. - -The subsystems or drivers having such needs can register suspend notifiers that -will be called upon the following events by the PM core: - -PM_HIBERNATION_PREPARE The system is going to hibernate, tasks will be frozen - immediately. This is different from PM_SUSPEND_PREPARE - below because here we do additional work between notifiers - and drivers freezing. - -PM_POST_HIBERNATION The system memory state has been restored from a - hibernation image or an error occurred during - hibernation. Device drivers' restore callbacks have - been executed and tasks have been thawed. - -PM_RESTORE_PREPARE The system is going to restore a hibernation image. - If all goes well, the restored kernel will issue a - PM_POST_HIBERNATION notification. 
- -PM_POST_RESTORE An error occurred during restore from hibernation. - Device drivers' restore callbacks have been executed - and tasks have been thawed. - -PM_SUSPEND_PREPARE The system is preparing for suspend. - -PM_POST_SUSPEND The system has just resumed or an error occurred during - suspend. Device drivers' resume callbacks have been - executed and tasks have been thawed. - -It is generally assumed that whatever the notifiers do for -PM_HIBERNATION_PREPARE, should be undone for PM_POST_HIBERNATION. Analogously, -operations performed for PM_SUSPEND_PREPARE should be reversed for -PM_POST_SUSPEND. Additionally, all of the notifiers are called for -PM_POST_HIBERNATION if one of them fails for PM_HIBERNATION_PREPARE, and -all of the notifiers are called for PM_POST_SUSPEND if one of them fails for -PM_SUSPEND_PREPARE. - -The hibernation and suspend notifiers are called with pm_mutex held. They are -defined in the usual way, but their last argument is meaningless (it is always -NULL). To register and/or unregister a suspend notifier use the functions -register_pm_notifier() and unregister_pm_notifier(), respectively, defined in -include/linux/suspend.h . If you don't need to unregister the notifier, you can -also use the pm_notifier() macro defined in include/linux/suspend.h . diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt index 85c746cbab2c..a1b7f7158930 100644 --- a/Documentation/power/pci.txt +++ b/Documentation/power/pci.txt @@ -713,7 +713,7 @@ In addition to that the prepare() callback may carry out some operations preparing the device to be suspended, although it should not allocate memory (if additional memory is required to suspend the device, it has to be preallocated earlier, for example in a suspend/hibernate notifier as described -in Documentation/power/notifiers.txt). +in Documentation/driver-api/pm/notifiers.rst). 3.1.2. 
suspend() diff --git a/Documentation/pps/pps.txt b/Documentation/pps/pps.txt index 50022b3c8ebf..1fdbd5447216 100644 --- a/Documentation/pps/pps.txt +++ b/Documentation/pps/pps.txt @@ -63,7 +63,7 @@ for instance) is a PPS source too, and if not they should provide the possibility to open another device as PPS source. In LinuxPPS the PPS sources are simply char devices usually mapped -into files /dev/pps0, /dev/pps1, etc.. +into files /dev/pps0, /dev/pps1, etc. PPS with USB to serial devices @@ -71,9 +71,12 @@ PPS with USB to serial devices It is possible to grab the PPS from an USB to serial device. However, you should take into account the latencies and jitter introduced by -the USB stack. Users has reported clock instability around +-1ms when -synchronized with PPS through USB. This isn't suited for time server -synchronization. +the USB stack. Users have reported clock instability around +-1ms when +synchronized with PPS through USB. With USB 2.0, jitter may decrease +down to the order of 125 microseconds. + +This may be suitable for time server synchronization with NTP because +of its undersampling and algorithms. If your device doesn't report PPS, you can check that the feature is supported by its driver. Most of the time, you only need to add a call @@ -166,7 +169,8 @@ Testing the PPS support In order to test the PPS support even without specific hardware you can use the ktimer driver (see the client subsection in the PPS configuration menu) -and the userland tools provided in the Documentation/pps/ directory. +and the userland tools available in your distribution's pps-tools package, +http://linuxpps.org , or https://github.com/ago/pps-tools . 
Once you have enabled the compilation of ktimer just modprobe it (if not statically compiled): @@ -183,8 +187,8 @@ and the run ppstest as follow: source 0 - assert 1186592700.388931295, sequence: 365 - clear 0.000000000, sequence: 0 source 0 - assert 1186592701.389032765, sequence: 366 - clear 0.000000000, sequence: 0 -Please, note that to compile userland programs you need the file timepps.h -(see Documentation/pps/). +Please, note that to compile userland programs you need the file timepps.h . +This is available in the pps-tools repository mentioned above. Generators diff --git a/Documentation/thermal/nouveau_thermal b/Documentation/thermal/nouveau_thermal index 60bc29357ac3..6e17a11efcb0 100644 --- a/Documentation/thermal/nouveau_thermal +++ b/Documentation/thermal/nouveau_thermal @@ -42,7 +42,7 @@ thresholds can be configured thanks to the following HWMON attributes: * Critical: temp1_crit and temp1_crit_hyst; * Shutdown: temp1_emergency and temp1_emergency_hyst. -NOTE: Remember that the values are stored as milli degrees Celcius. Don't forget +NOTE: Remember that the values are stored as milli degrees Celsius. Don't forget to multiply! 
Fan management diff --git a/Documentation/translations/ja_JP/HOWTO b/Documentation/translations/ja_JP/HOWTO index b03fc8047f03..4ebd20750ef1 100644 --- a/Documentation/translations/ja_JP/HOWTO +++ b/Documentation/translations/ja_JP/HOWTO @@ -111,7 +111,7 @@ Linux カーネルソースツリーは幅広い範囲のドキュメントを含 カーネルの変更が、カーネルがユーザ空間に公開しているインターフェイスの 変更を引き起こす場合、その変更を説明するマニュアルページのパッチや情報 をマニュアルページのメンテナ mtk.manpages@gmail.com に送り、CC を -linux-api@ver.kernel.org に送ることを勧めます。 +linux-api@vger.kernel.org に送ることを勧めます。 以下はカーネルソースツリーに含まれている読んでおくべきファイルの一覧で す- diff --git a/Documentation/translations/ko_KR/howto.rst b/Documentation/translations/ko_KR/howto.rst index 3b0c15b277e0..2333697251dd 100644 --- a/Documentation/translations/ko_KR/howto.rst +++ b/Documentation/translations/ko_KR/howto.rst @@ -289,8 +289,8 @@ pub/linux/kernel/v4.x/ 디렉토리에서 참조될 수 있다.개발 프로세스 Andrew Morton의 글이 있다. *"커널이 언제 배포될지는 아무도 모른다. 왜냐하면 배포는 알려진 - 버그의 상황에 따라 배포되는 것이지 미리정해 놓은 시간에 따라 - 배포되는 것은 아니기 때문이다."* + 버그의 상황에 따라 배포되는 것이지 미리정해 놓은 시간에 따라 + 배포되는 것은 아니기 때문이다."* 4.x.y - 안정 커널 트리 ~~~~~~~~~~~~~~~~~~~~~~ diff --git a/Documentation/translations/zh_CN/CodingStyle b/Documentation/translations/zh_CN/CodingStyle deleted file mode 100644 index dc101f48e713..000000000000 --- a/Documentation/translations/zh_CN/CodingStyle +++ /dev/null @@ -1,813 +0,0 @@ -Chinese translated version of Documentation/process/coding-style.rst - -If you have any comment or update to the content, please post to LKML directly. -However, if you have problem communicating in English you can also ask the -Chinese maintainer for help. 
Contact the Chinese maintainer, if this -translation is outdated or there is problem with translation. - -Chinese maintainer: Zhang Le <r0bertz@gentoo.org> ---------------------------------------------------------------------- -Documentation/process/coding-style.rst的中文翻译 - -如果想评论或更新本文的内容,请直接发信到LKML。如果你使用英文交流有困难的话,也可 -以向中文版维护者求助。如果本翻译更新不及时或者翻译存在问题,请联系中文版维护者。 - -中文版维护者: 张乐 Zhang Le <r0bertz@gentoo.org> -中文版翻译者: 张乐 Zhang Le <r0bertz@gentoo.org> -中文版校译者: 王聪 Wang Cong <xiyou.wangcong@gmail.com> - wheelz <kernel.zeng@gmail.com> - 管旭东 Xudong Guan <xudong.guan@gmail.com> - Li Zefan <lizf@cn.fujitsu.com> - Wang Chen <wangchen@cn.fujitsu.com> -以下为正文 ---------------------------------------------------------------------- - - Linux内核代码风格 - -这是一个简短的文档,描述了 linux 内核的首选代码风格。代码风格是因人而异的,而且我 -不愿意把自己的观点强加给任何人,但这就像我去做任何事情都必须遵循的原则那样,我也 -希望在绝大多数事上保持这种的态度。请(在写代码时)至少考虑一下这里的代码风格。 - -首先,我建议你打印一份 GNU 代码规范,然后不要读。烧了它,这是一个具有重大象征性意义 -的动作。 - -不管怎样,现在我们开始: - - - 第一章:缩进 - -制表符是 8 个字符,所以缩进也是 8 个字符。有些异端运动试图将缩进变为 4(甚至 2!) -个字符深,这几乎相当于尝试将圆周率的值定义为 3。 - -理由:缩进的全部意义就在于清楚的定义一个控制块起止于何处。尤其是当你盯着你的屏幕 -连续看了 20 小时之后,你将会发现大一点的缩进会使你更容易分辨缩进。 - -现在,有些人会抱怨 8 个字符的缩进会使代码向右边移动的太远,在 80 个字符的终端屏幕上 -就很难读这样的代码。这个问题的答案是,如果你需要 3 级以上的缩进,不管用何种方式你 -的代码已经有问题了,应该修正你的程序。 - -简而言之,8 个字符的缩进可以让代码更容易阅读,还有一个好处是当你的函数嵌套太深的 -时候可以给你警告。留心这个警告。 - -在 switch 语句中消除多级缩进的首选的方式是让 
“switch” 和从属于它的 “case” 标签 -对齐于同一列,而不要 “两次缩进” “case” 标签。比如: - - switch (suffix) { - case 'G': - case 'g': - mem <<= 30; - break; - case 'M': - case 'm': - mem <<= 20; - break; - case 'K': - case 'k': - mem <<= 10; - /* fall through */ - default: - break; - } - -不要把多个语句放在一行里,除非你有什么东西要隐藏: - - if (condition) do_this; - do_something_everytime; - -也不要在一行里放多个赋值语句。内核代码风格超级简单。就是避免可能导致别人误读的表 -达式。 - -除了注释、文档和 Kconfig 之外,不要使用空格来缩进,前面的例子是例外,是有意为之。 - -选用一个好的编辑器,不要在行尾留空格。 - - - 第二章:把长的行和字符串打散 - -代码风格的意义就在于使用平常使用的工具来维持代码的可读性和可维护性。 - -每一行的长度的限制是 80 列,我们强烈建议您遵守这个惯例。 - -长于 80 列的语句要打散成有意义的片段。除非超过 80 列能显著增加可读性,并且不会隐藏 -信息。子片段要明显短于母片段,并明显靠右。这同样适用于有着很长参数列表的函数头。 -然而,绝对不要打散对用户可见的字符串,例如 printk 信息,因为这将导致无法 grep 这些 -信息。 - - 第三章:大括号和空格的放置 - -C语言风格中另外一个常见问题是大括号的放置。和缩进大小不同,选择或弃用某种放置策 -略并没有多少技术上的原因,不过首选的方式,就像 Kernighan 和 Ritchie 展示给我们的, -是把起始大括号放在行尾,而把结束大括号放在行首,所以: - - if (x is true) { - we do y - } - -这适用于所有的非函数语句块(if、switch、for、while、do)。比如: - - switch (action) { - case KOBJ_ADD: - return "add"; - case KOBJ_REMOVE: - return "remove"; - case KOBJ_CHANGE: - return "change"; - default: - return NULL; - } - -不过,有一个例外,那就是函数:函数的起始大括号放置于下一行的开头,所以: - - int function(int x) - { - body of function - } - -全世界的异端可能会抱怨这个不一致性是……呃……不一致的,不过所有思维健全的人都知道 -(a) K&R 是 _正确的_,并且 (b) K&R 是正确的。此外,不管怎样函数都是特殊的(C -函数是不能嵌套的)。 - 
-注意结束大括号独自占据一行,除非它后面跟着同一个语句的剩余部分,也就是 do 语句中的 -“while” 或者 if 语句中的 “else”,像这样: - - do { - body of do-loop - } while (condition); - -和 - - if (x == y) { - .. - } else if (x > y) { - ... - } else { - .... - } - -理由:K&R。 - -也请注意这种大括号的放置方式也能使空(或者差不多空的)行的数量最小化,同时不失可 -读性。因此,由于你的屏幕上的新行是不可再生资源(想想 25 行的终端屏幕),你将会有更 -多的空行来放置注释。 - -当只有一个单独的语句的时候,不用加不必要的大括号。 - - if (condition) - action(); - -和 - - if (condition) - do_this(); - else - do_that(); - -这并不适用于只有一个条件分支是单语句的情况;这时所有分支都要使用大括号: - - if (condition) { - do_this(); - do_that(); - } else { - otherwise(); - } - - 3.1:空格 - -Linux 内核的空格使用方式(主要)取决于它是用于函数还是关键字。(大多数)关键字后 -要加一个空格。值得注意的例外是 sizeof、typeof、alignof 和 __attribute__,这些 -关键字某些程度上看起来更像函数(它们在 Linux 里也常常伴随小括号而使用,尽管在 C 里 -这样的小括号不是必需的,就像 “struct fileinfo info” 声明过后的 “sizeof info”)。 - -所以在这些关键字之后放一个空格: - - if, switch, case, for, do, while - -但是不要在 sizeof、typeof、alignof 或者 __attribute__ 这些关键字之后放空格。例如, - - s = sizeof(struct file); - -不要在小括号里的表达式两侧加空格。这是一个反例: - - s = sizeof( struct file ); - -当声明指针类型或者返回指针类型的函数时,“*” 的首选使用方式是使之靠近变量名或者函 -数名,而不是靠近类型名。例子: - - char *linux_banner; - unsigned long long memparse(char *ptr, char **retptr); - char *match_strdup(substring_t *s); - -在大多数二元和三元操作符两侧使用一个空格,例如下面所有这些操作符: - - = + - < > * / % | & ^ <= >= == != ? : - -但是一元操作符后不要加空格: - - & * + - ~ ! 
sizeof typeof alignof __attribute__ defined - -åŽç¼€è‡ªåŠ 和自å‡ä¸€å…ƒæ“作符å‰ä¸åŠ ç©ºæ ¼ï¼š - - ++ -- - -å‰ç¼€è‡ªåŠ 和自å‡ä¸€å…ƒæ“作符åŽä¸åŠ ç©ºæ ¼ï¼š - - ++ -- - -‘.’ å’Œ “->†结构体æˆå‘˜æ“作符å‰åŽä¸åŠ ç©ºæ ¼ã€‚ - -ä¸è¦åœ¨è¡Œå°¾ç•™ç©ºç™½ã€‚有些å¯ä»¥è‡ªåŠ¨ç¼©è¿›çš„ç¼–è¾‘å™¨ä¼šåœ¨æ–°è¡Œçš„è¡Œé¦–åŠ å…¥é€‚é‡çš„空白,然åŽä½ -å°±å¯ä»¥ç›´æŽ¥åœ¨é‚£ä¸€è¡Œè¾“入代ç 。ä¸è¿‡å‡å¦‚ä½ æœ€åŽæ²¡æœ‰åœ¨é‚£ä¸€è¡Œè¾“入代ç ï¼Œæœ‰äº›ç¼–è¾‘å™¨å°±ä¸ -会移除已ç»åŠ 入的空白,就åƒä½ æ•…æ„留下一个åªæœ‰ç©ºç™½çš„行。包å«è¡Œå°¾ç©ºç™½çš„è¡Œå°±è¿™æ ·äº§ -生了。 - -当gitå‘现补ä¸åŒ…å«äº†è¡Œå°¾ç©ºç™½çš„时候会è¦å‘Šä½ ,并且å¯ä»¥åº”ä½ çš„è¦æ±‚去掉行尾空白;ä¸è¿‡ -å¦‚æžœä½ æ˜¯æ£åœ¨æ‰“一系列补ä¸ï¼Œè¿™æ ·åšä¼šå¯¼è‡´åŽé¢çš„è¡¥ä¸å¤±è´¥ï¼Œå› ä¸ºä½ æ”¹å˜äº†è¡¥ä¸çš„上下文。 - - - ç¬¬å››ç« ï¼šå‘½å - -C是一个简朴的è¯è¨€ï¼Œä½ 的命åä¹Ÿåº”è¯¥è¿™æ ·ã€‚å’Œ Modula-2 å’Œ Pascal 程åºå‘˜ä¸åŒï¼ŒC 程åºå‘˜ -ä¸ä½¿ç”¨ç±»ä¼¼ ThisVariableIsATemporaryCounter è¿™æ ·åŽä¸½çš„åå—。C 程åºå‘˜ä¼šç§°é‚£ä¸ªå˜é‡ -为 “tmpâ€ï¼Œè¿™æ ·å†™èµ·æ¥ä¼šæ›´å®¹æ˜“,而且至少ä¸ä¼šä»¤å…¶éš¾äºŽç†è§£ã€‚ - -ä¸è¿‡ï¼Œè™½ç„¶æ··ç”¨å¤§å°å†™çš„åå—是ä¸æ倡使用的,但是全局å˜é‡è¿˜æ˜¯éœ€è¦ä¸€ä¸ªå…·æ述性的åå— -。称一个全局函数为 “foo†是一个难以饶æ•çš„错误。 - -全局å˜é‡ï¼ˆåªæœ‰å½“ä½ çœŸæ£éœ€è¦å®ƒä»¬çš„时候å†ç”¨å®ƒï¼‰éœ€è¦æœ‰ä¸€ä¸ªå…·æ述性的åå—,就åƒå…¨å±€å‡½ -æ•°ã€‚å¦‚æžœä½ æœ‰ä¸€ä¸ªå¯ä»¥è®¡ç®—活动用户数é‡çš„å‡½æ•°ï¼Œä½ åº”è¯¥å«å®ƒ “count_active_users()†-或者类似的åå—ï¼Œä½ ä¸åº”该å«å®ƒ “cntuser()â€ã€‚ - -在函数åä¸åŒ…å«å‡½æ•°ç±»åž‹ï¼ˆæ‰€è°“的匈牙利命å法)是脑å出了问题——编译器知é“那些类型而 -ä¸”èƒ½å¤Ÿæ£€æŸ¥é‚£äº›ç±»åž‹ï¼Œè¿™æ ·åšåªèƒ½æŠŠç¨‹åºå‘˜å¼„ç³Šæ¶‚äº†ã€‚éš¾æ€ªå¾®è½¯æ€»æ˜¯åˆ¶é€ å‡ºæœ‰é—®é¢˜çš„ç¨‹åºã€‚ - -本地å˜é‡å应该简çŸï¼Œè€Œä¸”能够表达相关的å«ä¹‰ã€‚å¦‚æžœä½ æœ‰ä¸€äº›éšæœºçš„整数型的循环计数器 -,它应该被称为 “iâ€ã€‚å«å®ƒ “loop_counterâ€ å¹¶æ— ç›Šå¤„ï¼Œå¦‚æžœå®ƒæ²¡æœ‰è¢«è¯¯è§£çš„å¯èƒ½çš„è¯ã€‚ -类似的,“tmp†å¯ä»¥ç”¨æ¥ç§°å‘¼ä»»æ„类型的临时å˜é‡ã€‚ - -å¦‚æžœä½ æ€•æ··æ·†äº†ä½ çš„æœ¬åœ°å˜é‡åï¼Œä½ å°±é‡åˆ°å¦ä¸€ä¸ªé—®é¢˜äº†ï¼Œå«åšå‡½æ•°å¢žé•¿è·å°”蒙失衡综åˆç—‡ -。请看第å…ç« ï¼ˆå‡½æ•°ï¼‰ã€‚ - - - ç¬¬äº”ç« ï¼šTypedef - -ä¸è¦ä½¿ç”¨ç±»ä¼¼ “vps_t†之类的东西。 - -对结构体和指针使用 typedef æ˜¯ä¸€ä¸ªé”™è¯¯ã€‚å½“ä½ åœ¨ä»£ç 里看到: - - vps_t a; - -这代表什么æ„æ€å‘¢ï¼Ÿ - -相åï¼Œå¦‚æžœæ˜¯è¿™æ · - - struct 
virtual_container *a; - -ä½ å°±çŸ¥é“ â€œa†是什么了。 - -很多人认为 typedef “能æ高å¯è¯»æ€§â€ã€‚实际ä¸æ˜¯è¿™æ ·çš„。它们åªåœ¨ä¸‹åˆ—情况下有用: - - (a) 完全ä¸é€æ˜Žçš„对象(这ç§æƒ…况下è¦ä¸»åŠ¨ä½¿ç”¨ typedef æ¥éšè—这个对象实际上是什么)。 - - 例如:“pte_t†ç‰ä¸é€æ˜Žå¯¹è±¡ï¼Œä½ åªèƒ½ç”¨åˆé€‚的访问函数æ¥è®¿é—®å®ƒä»¬ã€‚ - - 注æ„ï¼ä¸é€æ˜Žæ€§å’Œâ€œè®¿é—®å‡½æ•°â€æœ¬èº«æ˜¯ä¸å¥½çš„。我们使用 pte_t ç‰ç±»åž‹çš„åŽŸå› åœ¨äºŽçœŸçš„æ˜¯ - 完全没有任何共用的å¯è®¿é—®ä¿¡æ¯ã€‚ - - (b) 清楚的整数类型,如æ¤ï¼Œè¿™å±‚抽象就å¯ä»¥å¸®åŠ©æ¶ˆé™¤åˆ°åº•æ˜¯ “int†还是 “long†的混淆。 - - u8/u16/u32 是完全没有问题的 typedef,ä¸è¿‡å®ƒä»¬æ›´ç¬¦åˆç±»åˆ« (d) 而ä¸æ˜¯è¿™é‡Œã€‚ - - å†æ¬¡æ³¨æ„ï¼è¦è¿™æ ·åšï¼Œå¿…é¡»äº‹å‡ºæœ‰å› ã€‚å¦‚æžœæŸä¸ªå˜é‡æ˜¯ “unsigned longâ€œï¼Œé‚£ä¹ˆæ²¡æœ‰å¿…è¦ - - typedef unsigned long myflags_t; - - ä¸è¿‡å¦‚æžœæœ‰ä¸€ä¸ªæ˜Žç¡®çš„åŽŸå› ï¼Œæ¯”å¦‚å®ƒåœ¨æŸç§æƒ…况下å¯èƒ½ä¼šæ˜¯ä¸€ä¸ª “unsigned int†而在 - 其他情况下å¯èƒ½ä¸º “unsigned longâ€ï¼Œé‚£ä¹ˆå°±ä¸è¦çŠ¹è±«ï¼Œè¯·åŠ¡å¿…使用 typedef。 - - (c) å½“ä½ ä½¿ç”¨sparse按å—é¢çš„创建一个新类型æ¥åšç±»åž‹æ£€æŸ¥çš„时候。 - - (d) å’Œæ ‡å‡†C99类型相åŒçš„类型,在æŸäº›ä¾‹å¤–的情况下。 - - 虽然让眼ç›å’Œè„‘ç‹æ¥é€‚åº”æ–°çš„æ ‡å‡†ç±»åž‹æ¯”å¦‚ “uint32_t†ä¸éœ€è¦èŠ±å¾ˆå¤šæ—¶é—´ï¼Œå¯æ˜¯æœ‰äº› - 人ä»ç„¶æ‹’ç»ä½¿ç”¨å®ƒä»¬ã€‚ - - å› æ¤ï¼ŒLinux 特有的ç‰åŒäºŽæ ‡å‡†ç±»åž‹çš„ “u8/u16/u32/u64†类型和它们的有符å·ç±»åž‹æ˜¯è¢« - å…è®¸çš„â€”â€”å°½ç®¡åœ¨ä½ è‡ªå·±çš„æ–°ä»£ç ä¸ï¼Œå®ƒä»¬ä¸æ˜¯å¼ºåˆ¶è¦æ±‚è¦ä½¿ç”¨çš„。 - - 当编辑已ç»ä½¿ç”¨äº†æŸä¸ªç±»åž‹é›†çš„已有代ç æ—¶ï¼Œä½ åº”è¯¥éµå¾ªé‚£äº›ä»£ç ä¸å·²ç»åšå‡ºçš„选择。 - - (e) å¯ä»¥åœ¨ç”¨æˆ·ç©ºé—´å®‰å…¨ä½¿ç”¨çš„类型。 - - 在æŸäº›ç”¨æˆ·ç©ºé—´å¯è§çš„结构体里,我们ä¸èƒ½è¦æ±‚C99类型而且ä¸èƒ½ç”¨ä¸Šé¢æ到的 “u32†- ç±»åž‹ã€‚å› æ¤ï¼Œæˆ‘们在与用户空间共享的所有结构体ä¸ä½¿ç”¨ __u32 和类似的类型。 - -å¯èƒ½è¿˜æœ‰å…¶ä»–的情况,ä¸è¿‡åŸºæœ¬çš„规则是永远ä¸è¦ä½¿ç”¨ typedef,除éžä½ å¯ä»¥æ˜Žç¡®çš„应用上 -è¿°æŸä¸ªè§„则ä¸çš„一个。 - -总的æ¥è¯´ï¼Œå¦‚æžœä¸€ä¸ªæŒ‡é’ˆæˆ–è€…ä¸€ä¸ªç»“æž„ä½“é‡Œçš„å…ƒç´ å¯ä»¥åˆç†çš„è¢«ç›´æŽ¥è®¿é—®åˆ°ï¼Œé‚£ä¹ˆå®ƒä»¬å°±ä¸ -应该是一个 typedef。 - - - 第å…ç« ï¼šå‡½æ•° - -函数应该简çŸè€Œæ¼‚亮,并且åªå®Œæˆä¸€ä»¶äº‹æƒ…。函数应该å¯ä»¥ä¸€å±æˆ–者两å±æ˜¾ç¤ºå®Œï¼ˆæˆ‘们都知 -é“ ISO/ANSI å±å¹•å¤§å°æ˜¯ 80x24),åªåšä¸€ä»¶äº‹æƒ…,而且把它åšå¥½ã€‚ - 
-The maximum length of a function is inversely proportional to the complexity
-and indentation level of that function.  So, if you have a conceptually
-simple function that is just one long (but simple) case-statement, where you
-have to do lots of small things for a lot of different cases, it's OK to have
-a longer function.
-
-However, if you have a complex function, and you suspect that a
-less-than-gifted first-year high-school student might not even understand
-what the function is all about, you should adhere to the maximum limits all
-the more closely.  Use helper functions with descriptive names (you can ask
-the compiler to in-line them if you think it's performance-critical, and it
-will probably do a better job of it than you would have done).
-
-Another measure of the function is the number of local variables.  They
-shouldn't exceed 5-10, or you're doing something wrong.  Re-think the
-function, and split it into smaller pieces.  A human brain can generally
-easily keep track of about 7 different things; anything more and it gets
-confused.  Even if you are brilliant, you may well not remember what you did
-2 weeks ago.
-
-In source files, separate functions with one blank line.  If the function is
-exported, the EXPORT* macro for it should follow immediately after the
-closing function brace line.  E.g.:
-
-	int system_is_up(void)
-	{
-		return system_state == SYSTEM_RUNNING;
-	}
-	EXPORT_SYMBOL(system_is_up);
-
-In function prototypes, include parameter names with their data types.
-Although this is not required by the C language, it is preferred in Linux
-because it is a simple way to add valuable information for the reader.
-
-
-		Chapter 7: Centralized exiting of functions
-
-Albeit deprecated by some people, the equivalent of the goto statement is
-used frequently by compilers in form of the unconditional jump instruction.
-
-The goto statement comes in handy when a function exits from multiple
-locations and some common work such as cleanup has to be done.  If there is
-no cleanup needed then just return directly.
-
-The rationale is:
-
-- unconditional statements are easier to understand and follow
-- nesting is reduced
-- errors by not updating individual exit points when making
-    modifications are prevented
-- saves the compiler work to optimize redundant code away ;)
-
-	int fun(int a)
-	{
-		int result = 0;
-		char *buffer;
-
-		buffer = kmalloc(SIZE, GFP_KERNEL);
-		if (!buffer)
-			return -ENOMEM;
-
-		if (condition1) {
-			while (loop1) {
-				...
-			}
-			result = 1;
-			goto out_buffer;
-		}
-		...
-	out_buffer:
-		kfree(buffer);
-		return result;
-	}
-
-A common type of bug to be aware of is "one err bugs" which look like this:
-
-	err:
-		kfree(foo->bar);
-		kfree(foo);
-		return ret;
-
-The bug in this code is that on some exit paths "foo" is NULL.  Normally the
-fix for this is to split it up into two error labels "err_bar:" and
-"err_foo:".
-
-		Chapter 8: Commenting
-
-Comments are good, but there is also a danger of over-commenting.  NEVER try
-to explain HOW your code works in a comment: it's much better to write the
-code so that the working is obvious, and it's a waste of time to explain
-badly written code.
-
-Generally, you want your comments to tell WHAT your code does, not HOW.
-Also, try to avoid putting comments inside a function body: if the function
-is so complex that you need to separately comment parts of it, you should
-probably go back to chapter 6 for a while.  You can make small comments to
-note or warn about something particularly clever (or ugly), but try to avoid
-excess.  Instead, put the comments at the head of the function, telling
-people what it does, and possibly WHY it does it.
-
-When commenting the kernel API functions, please use the kernel-doc format.
-See Documentation/doc-guide/ and scripts/kernel-doc for details.
-
-Linux style for comments is the C89 "/* ... */" style.
-Don't use C99-style "// ..." comments.
-
-The preferred style for long (multi-line) comments is:
-
-	/*
-	 * This is the preferred style for multi-line
-	 * comments in the Linux kernel source code.
-	 * Please use it consistently.
-	 *
-	 * Description:  A column of asterisks on the left side,
-	 * with beginning and ending almost-blank lines.
-	 */
-
-For files in net/ and drivers/net/ the preferred style for long (multi-line)
-comments is a little different.
-
-	/* The preferred comment style for files in net/ and drivers/net
-	 * looks like this.
-	 *
-	 * It is nearly the same as the generally preferred comment style,
-	 * but there is no initial almost-blank line.
-	 */
-
-It's also important to comment data, whether they are basic types or derived
-types.  To this end, use just one data declaration per line (no commas for
-multiple data declarations).  This leaves you room for a small comment on
-each item, explaining its use.
-
-
-		Chapter 9: You've made a mess of it
-
-That's OK, we all do.  You've probably been told by your long-time Unix
-user friend that "GNU emacs" automatically formats the C sources for
-you, and you've noticed that yes, it does do that, but the defaults it
-uses are far from what we want (in fact, they are worse than random
-typing: an infinite number of monkeys typing into GNU emacs would never
-make a good program).  (Translator's note: see the Infinite Monkey Theorem.)
-
-So, you can either get rid of GNU emacs, or change it to use saner
-values.  To do the latter, you can stick the following in your .emacs file:
-
-(defun c-lineup-arglist-tabs-only (ignored)
-  "Line up argument lists by tabs, not spaces"
-  (let* ((anchor (c-langelem-pos c-syntactic-element))
-         (column (c-langelem-2nd-pos c-syntactic-element))
-         (offset (- (1+ column) anchor))
-         (steps (floor offset c-basic-offset)))
-    (* (max steps 1)
-       c-basic-offset)))
-
-(add-hook 'c-mode-common-hook
-          (lambda ()
-            ;; Add kernel style
-            (c-add-style
-             "linux-tabs-only"
-             '("linux" (c-offsets-alist
-                        (arglist-cont-nonempty
-                         c-lineup-gcc-asm-reg
-                         c-lineup-arglist-tabs-only))))))
-
-(add-hook 'c-mode-hook
-          (lambda ()
-            (let ((filename (buffer-file-name)))
-              ;; Enable kernel mode for the appropriate files
-              (when (and filename
-                         (string-match (expand-file-name "~/src/linux-trees")
-                                       filename))
-                (setq indent-tabs-mode t)
-                (setq show-trailing-whitespace t)
-                (c-set-style "linux-tabs-only")))))
-
-This will make emacs go better with the kernel coding style for C
-files below ~/src/linux-trees.
-
-But even if you fail in getting emacs to do sane formatting, not
-everything is lost: use "indent".
-
-Now, again, GNU indent has the same problematic settings that GNU emacs
-has, which is why you need to give it a few command line options.
-However, that's not too bad, because even the makers of GNU indent
-recognize the authority of K&R (the GNU people aren't evil, they are
-just severely misguided in this matter), so you just give indent the
-options "-kr -i8" (stands for "K&R, 8 character indents"), or use
-"scripts/Lindent", which indents in the latest style.
-
-"indent" has a lot of options, and especially when it comes to comment
-re-formatting you may want to take a look at the man page.  But
-remember: "indent" is not a fix for bad programming habits.
-
-
-		Chapter 10: Kconfig configuration files
-
-For all of the Kconfig* configuration files throughout the source tree,
-the indentation is somewhat different from C code.  Lines under a "config"
-definition are indented with one tab, while help text is indented an
-additional two spaces.  Example:
-
-config AUDIT
-	bool "Auditing support"
-	depends on NET
-	help
-	  Enable auditing infrastructure that can be used with another
-	  kernel subsystem, such as SELinux (which requires this for
-	  logging of avc messages output).  Does not do system-call
-	  auditing without CONFIG_AUDITSYSCALL.
-
-Dangerous features (such as write support for certain filesystems) should
-advertise this prominently in their prompt string:
-
-config ADFS_FS_RW
-	bool "ADFS write support (DANGEROUS)"
-	depends on ADFS_FS
-	...
-
-For full documentation on the configuration files, see the file
-Documentation/kbuild/kconfig-language.txt.
-
-
-		Chapter 11: Data structures
-
-Data structures that have visibility outside the single-threaded
-environment they are created and destroyed in should always have
-reference counts.  In the kernel, garbage collection doesn't exist (and
-outside the kernel garbage collection is slow and inefficient), which
-means that you absolutely have to reference count all your uses.
-
-Reference counting means that you can avoid locking, and allows multiple
-users to have access to the data structure in parallel, without having to
-worry about the structure suddenly going away from under them just
-because they slept or did something else for a while.
-
-Note that locking is not a replacement for reference counting.
-Locking is used to keep data structures coherent, while reference
-counting is a memory management technique.  Usually both are needed, and
-they are not to be confused with each other.
-
-Many data structures can indeed have two levels of reference counting,
-when there are users of different "classes".  The subclass count counts
-the number of subclass users, and decrements the global count just once
-when the subclass count goes to zero.
-
-Examples of this kind of "multi-level-reference-counting" can be found in
-memory management ("struct mm_struct": mm_users and mm_count), and in
-filesystem code ("struct super_block": s_count and s_active).
-
-Remember: if another thread can find your data structure, and you don't
-have a reference count on it, you almost certainly have a bug.
-
-
-		Chapter 12: Macros, Enums and RTL
-
-Names of macros defining constants and labels in enums are capitalized.
-
-#define CONSTANT 0x12345
-
-Enums are preferred when defining several related constants.
-
-CAPITALIZED macro names are appreciated but macros resembling functions
-may be named in lower case.
-
-Generally, inline functions are preferable to macros resembling functions.
-
-Macros with multiple statements should be enclosed in a do-while block:
-
-	#define macrofun(a, b, c)			\
-		do {					\
-			if (a == 5)			\
-				do_this(b, c);		\
-		} while (0)
-
-Things to avoid when using macros:
-
-1) macros that affect control flow:
-
-	#define FOO(x)					\
-		do {					\
-			if (blah(x) < 0)		\
-				return -EBUGGERED;	\
-		} while (0)
-
-is a very bad idea.  It looks like a function call but exits the "calling"
-function; don't break the internal parsers of those who will read the code.
-
-2) macros that depend on having a local variable with a fixed name:
-
-	#define FOO(val) bar(index, val)
-
-might look like a good thing, but it easily confuses people reading the code
-and it's prone to breakage from seemingly unrelated changes.
-
-3) macros with arguments that are used as l-values: FOO(x) = y; will
-bite you if somebody e.g. turns FOO into an inline function.
-
-4) forgetting about precedence: macros defining constants using expressions
-must enclose the expression in parentheses.  Beware of similar issues with
-macros using parameters.
-
-	#define CONSTANT 0x4000
-	#define CONSTEXP (CONSTANT | 3)
-
-5) namespace collisions when defining local variables in macros resembling
-functions:
-
-	#define FOO(x)				\
-	({					\
-		typeof(x) ret;			\
-		ret = calc_ret(x);		\
-		(ret);				\
-	})
-
-ret is a common name for a local variable - __foo_ret is less likely
-to collide with an existing variable.
-
-The cpp manual deals with macros exhaustively.  The gcc internals manual also
-covers RTL (translator's note: register transfer language) which is used
-frequently with assembly language in the kernel.
-
-
-		Chapter 13: Printing kernel messages
-
-Kernel developers like to be seen as literate.  Do mind the spelling
-of kernel messages to make a good impression.  Do not use crippled
-words like "dont"; use "do not" or "don't" instead.  Make the messages
-concise, clear, and unambiguous.
-
-Kernel messages do not have to be terminated with a period (translator's
-note: the English full stop, i.e. a dot).
-
-Printing numbers in parentheses (%d) adds no value and should be avoided.
-
-There are a number of driver model diagnostic macros in <linux/device.h>
-which you should use to make sure messages are matched to the right device
-and driver, and are tagged with the right level: dev_err(), dev_warn(),
-dev_info(), and so forth.  For messages that aren't associated with a
-particular device, <linux/printk.h> defines pr_notice(), pr_info(),
-pr_warn(), pr_err(), etc.
-
-Coming up with good debugging messages can be quite a challenge; and once
-you have them, they can be a huge help for remote troubleshooting.  However
-debug message printing is handled differently than printing other non-debug
-messages.  While the other pr_XXX() functions print unconditionally,
-pr_debug() does not; it is compiled out by default, unless either DEBUG is
-defined or CONFIG_DYNAMIC_DEBUG is set.  That is true for dev_dbg() also,
-and a related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to
-the ones already enabled by DEBUG.
-
-Many subsystems have Kconfig debug options to turn on -DDEBUG in the
-corresponding Makefile; in other cases specific files #define DEBUG.  And
-when a debug message should be unconditionally printed, such as if it is
-already inside a debug-related #ifdef section, printk(KERN_DEBUG ...) can be
-used.
-
-
-		Chapter 14: Allocating memory
-
-The kernel provides the following general purpose memory allocators:
-kmalloc(), kzalloc(), kmalloc_array(), kcalloc(), vmalloc(), and
-vzalloc().  Please refer to the API documentation for further information
-about them.
-
-The preferred form for passing a size of a struct is the following:
-
-	p = kmalloc(sizeof(*p), ...);
-
-The alternative form where struct name is spelled out hurts readability and
-introduces an opportunity for a bug when the pointer variable type is changed
-but the corresponding sizeof that is passed to a memory allocator is not.
-
-Casting the return value which is a void pointer is redundant.  The conversion
-from void pointer to any other pointer type is guaranteed by the C programming
-language.
-
-The preferred form for allocating an array is the following:
-
-	p = kmalloc_array(n, sizeof(...), ...);
-
-The preferred form for allocating a zeroed array is the following:
-
-	p = kcalloc(n, sizeof(...), ...);
-
-Both forms check for overflow on the allocation size n * sizeof(...),
-and return NULL if that occurred.
-
-
-		Chapter 15: The inline disease
-
-There appears to be a common misperception that inline functions are an
-option gcc provides to make code run faster.  While the use of inlines can
-be appropriate (for example as a means of replacing macros, see Chapter 12),
-it very often is not.  Abundant use of the inline keyword leads to a much
-bigger kernel, which in turn slows the system as a whole down, because a
-bigger kernel takes up more of the CPU's instruction cache (translator's
-note: the L1 cache is usually split into an instruction cache and a data
-cache) and leaves less memory available for the pagecache.  Just think about
-it; a pagecache miss causes a disk seek, which easily takes 5 milliseconds.
-There are a LOT of cpu cycles that can go into these 5 milliseconds.
-
-A reasonable rule of thumb is to not put inline at functions that have more
-than 3 lines of code in them.  An exception to this rule are the cases where
-a parameter is known to be a compile-time constant, and as a result of this
-constantness you know the compiler will be able to optimize most of your
-function away at compile time.  For a good example of this latter case, see
-the kmalloc() inline function.
-
-Often people argue that adding inline to functions that are static and used
-only once is always a win since there is no space tradeoff.  While this is
-technically correct, gcc is capable of inlining these automatically without
-help, and the maintenance issue of removing the inline when a second user
-appears outweighs the potential value of the hint itself.
-
-
-		Chapter 16: Function return values and names
-
-Functions can return values of many different kinds, and one of the
-most common is a value indicating whether the function succeeded or
-failed.  Such a value can be represented as an error-code integer
-(-Exxx = failure, 0 = success) or a "succeeded" boolean (0 = failure,
-non-zero = success).
-
-Mixing up these two sorts of representations is a fertile source of
-difficult-to-find bugs.  If the C language included a strong distinction
-between integers and booleans then the compiler would find these mistakes
-for us... but it doesn't.  To help prevent such bugs, always follow this
-convention:
-
-	If the name of a function is an action or an imperative command,
-	the function should return an error-code integer.  If the name
-	is a predicate, the function should return a "succeeded" boolean.
-
-For example, "add work" is a command, and the add_work() function returns 0
-for success or -EBUSY for failure.  In the same way, "PCI device present" is
-a predicate, and the pci_dev_present() function returns 1 if it succeeds in
-finding a matching device or 0 if it doesn't.
-
-All exported (translator's note: EXPORT) functions must respect this
-convention, and so should all public functions.  Private (static) functions
-need not, but it is recommended that they do.
-
-Functions whose return value is the actual result of a computation, rather
-than an indication of whether the computation succeeded, are not subject to
-this rule.  Generally they indicate failure by returning some out-of-range
-result.  Typical examples would be functions that return pointers; they use
-NULL or the ERR_PTR mechanism to report failure.
-
-
-		Chapter 17: Don't re-invent the kernel macros
-
-The header file include/linux/kernel.h contains a number of macros that
-you should use, rather than explicitly coding some variant of them yourself.
-For example, if you need to calculate the length of an array, take advantage
-of the macro
-
-	#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
-
-Similarly, if you need to calculate the size of some structure member, use
-
-	#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))
-
-There are also min() and max() macros that do strict type checking if you
-need them.  Feel free to peruse that header file to see what else is already
-defined that you shouldn't reproduce in your code.
-
-
-		Chapter 18: Editor modelines and other cruft
-
-Some editors can interpret configuration information embedded in source
-files, indicated with special markers.  For example, emacs interprets lines
-marked like this:
-
-	-*- mode: c -*-
-
-Or like this:
-
-	/*
-	Local Variables:
-	compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
-	End:
-	*/
-
-Vim interprets markers that look like this:
-
-	/* vim:set sw=8 noet */
-
-Do not include any of these in source files.  People have their own personal
-editor configurations, and your source files should not override them.  This
-includes markers for indentation and mode configuration.  People may use
-their own custom mode, or may have some other clever method for making
-indentation work correctly.
-
-
-		Chapter 19: Inline assembly
-
-In architecture-specific code, you may need to use inline assembly to
-interface with CPU or platform functionality.  Don't hesitate to do so when
-necessary.  However, don't use inline assembly gratuitously when C can do
-the job.  You can and should poke hardware from C when possible.
-
-Consider writing simple helper functions that wrap common bits of inline
-assembly, rather than repeatedly writing them with slight variations.
-Remember that inline assembly can use C parameters.
-
-Large, non-trivial assembly functions should go in .S files, with
-corresponding C prototypes defined in C header files.  The C prototypes for
-assembly functions should use "asmlinkage".
-
-You may need to mark your asm statement as volatile, to prevent GCC from
-removing it if GCC doesn't notice any side effects.  You don't always need
-to do so, though, and doing so unnecessarily can limit optimization.
-
-When writing a single inline assembly statement containing multiple
-instructions, put each instruction on a separate line in a separate quoted
-string, and end each string except the last with \n\t to properly indent the
-next instruction in the assembly output:
-
-	asm ("magic %reg1, #42\n\t"
-	     "more_magic %reg2, %reg3"
-	     : /* outputs */ : /* inputs */ : /* clobbers */);
-
-
-		Chapter 20: Conditional Compilation
-
-Wherever possible, don't use preprocessor conditionals in .c files; doing so
-makes code harder to read and logic harder to follow.  Instead, use such
-conditionals in a header file defining functions for use in those .c files,
-providing no-op stub versions (translator's note: a stub is a piece of code
-that stands in for some functionality) in the #else case, and then call those
-functions unconditionally from .c files.  The compiler will avoid generating
-any code for the stub calls, producing identical results, but the logic will
-remain easy to follow.
-
-Prefer to compile out entire functions, rather than portions of functions or
-portions of expressions.  Rather than putting an ifdef in an expression,
-factor out part or all of the expression into a separate helper function and
-apply the conditional to that function.
-
-If you have a function or variable which may potentially go unused in a
-particular configuration, and the compiler would warn about its definition
-going unused, mark the definition as __maybe_unused rather than wrapping it
-in a preprocessor conditional.  (However, if a function or variable always
-goes unused, delete it.)
-
-Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
-symbol into a C boolean expression, and use it in a normal C conditional:
-
-	if (IS_ENABLED(CONFIG_SOMETHING)) {
-		...
-	}
-
-The compiler will constant-fold the conditional away, and include or exclude
-the block of code just as with an #ifdef, so this will not add any runtime
-overhead.  However, this approach still allows the C compiler to see the code
-inside the block, and check it for correctness (syntax, types, symbol
-references, etc).  Thus, you still have to use an #ifdef if the code inside
-the block references symbols that will not exist if the condition is not met.
-
-At the end of any non-trivial #if or #ifdef block (more than a few lines),
-place a comment after the #endif on the same line, noting the conditional
-expression used.  For instance:
-
-	#ifdef CONFIG_SOMETHING
-	...
-	#endif /* CONFIG_SOMETHING */
-
-
-		Appendix I: References
-
-The C Programming Language, Second Edition
-by Brian W. Kernighan and Dennis M. Ritchie.
-Prentice Hall, Inc., 1988.
-ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
-
-The Practice of Programming
-by Brian W. Kernighan and Rob Pike.
-Addison-Wesley, Inc., 1999.
-ISBN 0-201-61586-X.
-
-GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
-gcc internals and indent, all available from http://www.gnu.org/manual/
-
-WG14 is the international standardization working group for the programming
-language C, URL: http://www.open-std.org/JTC1/SC22/WG14/
-
-Kernel process/coding-style.rst, by greg@kroah.com at OLS 2002:
-http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/
diff --git a/Documentation/translations/zh_CN/coding-style.rst b/Documentation/translations/zh_CN/coding-style.rst
new file mode 100644
index 000000000000..1466aa64b8b4
--- /dev/null
+++ b/Documentation/translations/zh_CN/coding-style.rst
@@ -0,0 +1,950 @@
+Chinese translated version of Documentation/process/coding-style.rst
+
+If you have any comment or update to the content, please post to LKML directly.
+However, if you have problem communicating in English you can also ask the
+Chinese maintainer for help.  Contact the Chinese maintainer, if this
+translation is outdated or there is problem with translation.
+
+Chinese maintainer: Zhang Le <r0bertz@gentoo.org>
+
+---------------------------------------------------------------------
+
+Chinese translation of Documentation/process/coding-style.rst
+
+If you want to comment on or update the content of this document, please
+write to LKML directly.  If you have difficulty communicating in English, you
+can also ask the Chinese maintainer for help.  If this translation is out of
+date or has translation problems, please contact the Chinese maintainer::
+
+        Chinese maintainer:   张乐 Zhang Le <r0bertz@gentoo.org>
+        Chinese translator:   张乐 Zhang Le <r0bertz@gentoo.org>
+        Chinese proofreaders: 王聪 Wang Cong <xiyou.wangcong@gmail.com>
+                              wheelz <kernel.zeng@gmail.com>
+                              管旭东 Xudong Guan <xudong.guan@gmail.com>
+                              Li Zefan <lizf@cn.fujitsu.com>
+                              Wang Chen <wangchen@cn.fujitsu.com>
+
+The main text follows
+
+---------------------------------------------------------------------
+
+Linux kernel coding style
+=========================
+
+This is a short document describing the preferred coding style for the
+linux kernel.  Coding style is very personal, and I won't force my views on
+anybody, but this is what goes for anything that I have to be able to
+maintain, and I'd prefer it for most other things too.  Please at least
+consider the points made here (when writing code).
+
+First off, I'd suggest printing out a copy of the GNU coding standards,
+and NOT reading it.  Burn it; it's a great symbolic gesture.
+
+Anyway, here goes:
+
+
+1) Indentation
+--------------
+
+Tabs are 8 characters, and thus indentations are also 8 characters.
+There are heretic movements that try to make indentations 4 (or even 2!)
+characters deep, and that is akin to trying to define the value of PI to
+be 3.
+
+Rationale: The whole idea behind indentation is to clearly define where
+a block of control starts and ends.  Especially when you've been looking
+at your screen for 20 straight hours, you'll find it a lot easier to see
+how the indentation works if you have large indentations.
+
+Now, some people will claim that having 8-character indentations makes
+the code move too far to the right, and makes it hard to read on a
+80-character terminal screen.  The answer to that is that if you need
+more than 3 levels of indentation, your code is broken anyway, and you
+should fix your program.
+
+In short, 8-char indents make things easier to read, and have the added
+benefit of warning you when you're nesting your functions too deep.
+Heed that warning.
+
+The preferred way to ease multiple indentation levels in a switch statement
+is to align the ``switch`` and its subordinate ``case`` labels in the same
+column instead of ``double-indenting`` the ``case`` labels.  E.g.:
+
+.. code-block:: c
+
+	switch (suffix) {
+	case 'G':
+	case 'g':
+		mem <<= 30;
+		break;
+	case 'M':
+	case 'm':
+		mem <<= 20;
+		break;
+	case 'K':
+	case 'k':
+		mem <<= 10;
+		/* fall through */
+	default:
+		break;
+	}
+
+Don't put multiple statements on a single line unless you have
+something to hide:
+
+.. code-block:: c
+
+	if (condition) do_this;
+	  do_something_everytime;
+
+Don't put multiple assignments on a single line either.  Kernel coding style
+is super simple.  Avoid expressions that other people might misread.
+
+Outside of comments, documentation and Kconfig, never use spaces for
+indentation; the example above is a deliberate exception.
+
+Get a decent editor and don't leave whitespace at the end of lines.
+
+
+2) Breaking long lines and strings
+----------------------------------
+
+Coding style is all about readability and maintainability using commonly
+available tools.
+
+The limit on the length of lines is 80 columns, and we strongly recommend
+that you follow this convention.
+
+Statements longer than 80 columns should be broken into sensible chunks,
+unless exceeding 80 columns significantly increases readability and does not
+hide information.  Descendants are always substantially shorter than the
+parent and are placed substantially to the right.  The same applies to
+function headers with a long argument list.  However, never break
+user-visible strings such as printk messages, because that makes them hard
+to grep for.
+
+
+3) Placing Braces and Spaces
+------------------------------
+
+The other issue that always comes up in C styling is the placement of
+braces.  Unlike the indent size, there are few technical reasons to
+choose one placement strategy over the other, but the preferred way, as
+shown to us by Kernighan and Ritchie, is to put the opening brace last on
+the line, and put the closing brace first, thus:
+
+.. code-block:: c
+
+	if (x is true) {
+		we do y
+	}
+
+This applies to all non-function statement blocks (if, switch, for,
+while, do).  E.g.:
+
+.. code-block:: c
+
+	switch (action) {
+	case KOBJ_ADD:
+		return "add";
+	case KOBJ_REMOVE:
+		return "remove";
+	case KOBJ_CHANGE:
+		return "change";
+	default:
+		return NULL;
+	}
+
+However, there is one special case, namely functions: they have the
+opening brace at the beginning of the next line, thus:
+
+.. code-block:: c
+
+	int function(int x)
+	{
+		body of function
+	}
+
+Heretic people all over the world may complain that this inconsistency
+is ... well ... inconsistent, but all right-thinking people know that
+(a) K&R are **right** and (b) K&R are right.  Besides, functions are
+special anyway (you can't nest them in C).
+
+Note that the closing brace is on a line of its own, except in the cases
+where it is followed by a continuation of the same statement, i.e. a
+"while" in a do-statement or an "else" in an if-statement, like this:
+
+.. code-block:: c
+
+	do {
+		body of do-loop
+	} while (condition);
+
+and
+
+.. code-block:: c
+
+	if (x == y) {
+		..
+	} else if (x > y) {
+		...
+	} else {
+		....
+	}
+
+Rationale: K&R.
+
+Also, note that this brace placement minimizes the number of empty (or
+almost empty) lines, without any loss of readability.  Thus, as the supply
+of new-lines on your screen is not a renewable resource (think 25-line
+terminal screens here), you have more empty lines to put comments on.
+
+Do not unnecessarily use braces where a single statement will do.
+
+.. code-block:: c
+
+	if (condition)
+		action();
+
+and
+
+.. code-block:: c
+
+	if (condition)
+		do_this();
+	else
+		do_that();
+
+This does not apply if only one branch of a conditional statement is a single
+statement; in that case use braces in all branches:
+
+.. code-block:: c
+
+	if (condition) {
+		do_this();
+		do_that();
+	} else {
+		otherwise();
+	}
+
+3.1) Spaces
+********************
+
+Linux kernel style for use of spaces depends (mostly) on
+function-versus-keyword usage.  Use a space after (most) keywords.  The
+notable exceptions are sizeof, typeof, alignof and __attribute__, which look
+somewhat like functions (and are usually used with parentheses in Linux,
+although they are not required in the language, as in ``sizeof info`` after
+``struct fileinfo info;`` is declared).
+
+So use a space after these keywords::
+
+	if, switch, case, for, do, while
+
+but not after sizeof, typeof, alignof or __attribute__.  E.g.,
+
+.. code-block:: c
+
+	s = sizeof(struct file);
+
+Do not add spaces around (inside) parenthesized expressions.  This example is
+**bad**:
+
+.. code-block:: c
+
+	s = sizeof( struct file );
+
+When declaring pointer data or a function that returns a pointer type, the
+preferred use of ``*`` is adjacent to the data name or function name and not
+adjacent to the type name.  Examples:
+
+.. code-block:: c
+
+	char *linux_banner;
+	unsigned long long memparse(char *ptr, char **retptr);
+	char *match_strdup(substring_t *s);
+
+Use one space around (on each side of) most binary and ternary operators,
+such as any of these::
+
+	=  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :
+
+but no space after unary operators::
+
+	&  *  +  -  ~  !  sizeof  typeof  alignof  __attribute__  defined
+
+no space before the postfix increment and decrement unary operators::
+
+	++  --
+
+no space after the prefix increment and decrement unary operators::
+
+	++  --
+
+and no space around the ``.`` and ``->`` structure member operators.
+
+Do not leave trailing whitespace at the ends of lines.  Some editors with
+automatic indentation insert whitespace at the beginning of new lines as
+appropriate, so you can start typing the next line of code right away.
+However, some such editors do not remove the whitespace if you end up not
+putting a line of code there, as if you had deliberately left a
+whitespace-only line.  That is how lines with trailing whitespace come about.
+
+Git will warn you about patches that introduce trailing whitespace, and can
+optionally strip the trailing whitespace for you; however, if you are
+applying a series of patches, this may make later patches in the series fail
+by changing their context lines.
+
+
+4) Naming
+------------------------------
+
+C is a Spartan language, and so should your naming be.  Unlike Modula-2
+and Pascal programmers, C programmers do not use cute names like
+ThisVariableIsATemporaryCounter.  A C programmer would call that
+variable ``tmp``, which is much easier to write, and not the least more
+difficult to understand.
+
+HOWEVER, while mixed-case names are frowned upon, descriptive names for
+global variables are a must.  To call a global function ``foo`` is an
+unforgivable mistake.
+
+GLOBAL variables (to be used only if you **really** need them) need to
+have descriptive names, as do global functions.  If you have a function
+that counts the number of active users, you should call that
+``count_active_users()`` or similar, you should not call it ``cntuser()``.
+
+Encoding the type of a function into the name (so-called Hungarian
+notation) is brain damaged: the compiler knows the types anyway and can
+check those, and it only confuses the programmer.  No wonder Microsoft
+makes buggy programs.
+
+LOCAL variable names should be short, and to the point.  If you have
+some random integer loop counter, it should be called ``i``.
+Calling it ``loop_counter`` is non-productive, if there is no chance of it
+being misunderstood.  Similarly, ``tmp`` can be just about any type of
+variable that is used to hold a temporary value.
+
+If you are afraid to mix up your local variable names, you have another
+problem, which is called the function-growth-hormone-imbalance syndrome.
+See chapter 6 (Functions).
+
+
+5) Typedefs
+-----------
+
+Please don't use things like ``vps_t``.
+
+It's a **mistake** to use typedef for structures and pointers.  When you see
+a
+
+.. code-block:: c
+
+	vps_t a;
+
+in the source, what does it mean?
+
+In contrast, if it says
+
+.. code-block:: c
+
+	struct virtual_container *a;
+
+you can actually tell what ``a`` is.
+
+Lots of people think that typedefs ``help readability``.  Not so.  They are
+useful only for:
+
+ (a) totally opaque objects (where the typedef is actively used to **hide**
+     what the object is).
+
+     Example: ``pte_t`` etc. opaque objects that you can only access using
+     the proper accessor functions.
+
+     .. note::
+
+       Opaqueness and "accessor functions" are not good in themselves.
+       The reason we have them for things like pte_t etc. is that there
+       really is absolutely zero portably accessible information there.
+
+ (b) Clear integer types, where the abstraction **helps** avoid confusion
+     whether it is ``int`` or ``long``.
+
+     u8/u16/u32 are perfectly fine typedefs, although they fit into
+     category (d) better than here.
+
+     .. note::
+
+       There needs to be a reason for this.  If something is
+       ``unsigned long``, then there's no reason to do
+
+	typedef unsigned long myflags_t;
+
+     but if there is a clear reason for why it under certain circumstances
+     might be an ``unsigned int`` and under other configurations might be
+     ``unsigned long``, then by all means go ahead and use a typedef.
+
+ (c) when you use sparse to literally create a **new** type for
+     type-checking.
+
+ (d) New types which are identical to standard C99 types, in certain
+     exceptional circumstances.
+
+     Although it would only take a short amount of time for the eyes and
+     brain to become accustomed to the standard types like ``uint32_t``,
+     some people object to their use anyway.
+
+     Therefore, the Linux-specific ``u8/u16/u32/u64`` types and their
+     signed equivalents which are identical to standard types are
+     permitted, although they are not mandatory in new code of your own.
+
+     When editing existing code which already uses one or the other set
+     of types, you should conform to the existing choices in that code.
+
+ (e) Types safe for use in userspace.
+
+     In certain structures which are visible to userspace, we cannot
+     require C99 types and cannot use the ``u32`` form above.  Thus, we
+     use __u32 and similar types in all structures which are shared
+     with userspace.
+
+Maybe there are other cases too, but the rule should basically be to **NEVER
+EVER** use a typedef unless you can clearly match one of those rules.
+
+In general, a pointer, or a struct that has elements that can reasonably
+be directly accessed, should never be a typedef.
+
+
+6) Functions
+------------------------------
+
+函数应该简çŸè€Œæ¼‚亮,并且åªå®Œæˆä¸€ä»¶äº‹æƒ…。函数应该å¯ä»¥ä¸€å±æˆ–者两å±æ˜¾ç¤ºå®Œ (我们 +éƒ½çŸ¥é“ ISO/ANSI å±å¹•å¤§å°æ˜¯ 80x24),åªåšä¸€ä»¶äº‹æƒ…,而且把它åšå¥½ã€‚ + +一个函数的最大长度是和该函数的å¤æ‚度和缩进级数æˆåæ¯”çš„ã€‚æ‰€ä»¥ï¼Œå¦‚æžœä½ æœ‰ä¸€ä¸ªç† +论上很简å•çš„åªæœ‰ä¸€ä¸ªå¾ˆé•¿ (但是简å•) çš„ case è¯å¥çš„å‡½æ•°ï¼Œè€Œä¸”ä½ éœ€è¦åœ¨æ¯ä¸ª case +里åšå¾ˆå¤šå¾ˆå°çš„äº‹æƒ…ï¼Œè¿™æ ·çš„å‡½æ•°å°½ç®¡å¾ˆé•¿ï¼Œä½†ä¹Ÿæ˜¯å¯ä»¥çš„。 + +ä¸è¿‡ï¼Œå¦‚æžœä½ æœ‰ä¸€ä¸ªå¤æ‚çš„å‡½æ•°ï¼Œè€Œä¸”ä½ æ€€ç–‘ä¸€ä¸ªå¤©åˆ†ä¸æ˜¯å¾ˆé«˜çš„高ä¸ä¸€å¹´çº§å¦ç”Ÿå¯èƒ½ +甚至æžä¸æ¸…æ¥šè¿™ä¸ªå‡½æ•°çš„ç›®çš„ï¼Œä½ åº”è¯¥ä¸¥æ ¼éµå®ˆå‰é¢æ到的长度é™åˆ¶ã€‚使用辅助函数, +并为之å–个具æ述性的åå— (å¦‚æžœä½ è§‰å¾—å®ƒä»¬çš„æ€§èƒ½å¾ˆé‡è¦çš„è¯ï¼Œå¯ä»¥è®©ç¼–译器内è”它 +ä»¬ï¼Œè¿™æ ·çš„æ•ˆæžœå¾€å¾€ä¼šæ¯”ä½ å†™ä¸€ä¸ªå¤æ‚函数的效果è¦å¥½ã€‚) + +函数的å¦å¤–一个衡é‡æ ‡å‡†æ˜¯æœ¬åœ°å˜é‡çš„æ•°é‡ã€‚æ¤æ•°é‡ä¸åº”超过 5ï¼10 个,å¦åˆ™ä½ 的函数 +就有问题了。é‡æ–°è€ƒè™‘ä¸€ä¸‹ä½ çš„å‡½æ•°ï¼ŒæŠŠå®ƒåˆ†æ‹†æˆæ›´å°çš„函数。人的大脑一般å¯ä»¥è½»æ¾ +çš„åŒæ—¶è·Ÿè¸ª 7 个ä¸åŒçš„事物,如果å†å¢žå¤šçš„è¯ï¼Œå°±ä¼šç³Šæ¶‚了。å³ä¾¿ä½ èªé¢–è¿‡äººï¼Œä½ ä¹Ÿå¯ +能会记ä¸æ¸…ä½ 2 个星期å‰åšè¿‡çš„事情。 + +在æºæ–‡ä»¶é‡Œï¼Œä½¿ç”¨ç©ºè¡Œéš”å¼€ä¸åŒçš„函数。如果该函数需è¦è¢«å¯¼å‡ºï¼Œå®ƒçš„ **EXPORT** å® +应该紧贴在它的结æŸå¤§æ‹¬å·ä¹‹ä¸‹ã€‚比如: + +.. 
code-block:: c
+
+	int system_is_up(void)
+	{
+		return system_state == SYSTEM_RUNNING;
+	}
+	EXPORT_SYMBOL(system_is_up);
+
+In function prototypes, include parameter names with their data types.
+Although this is not required by the C language, it is preferred in Linux
+because it is a simple way to add valuable information for the reader.
+
+
+7) Centralized exiting of functions
+----------------------------------------
+
+Albeit deprecated by some people, the equivalent of the goto statement is
+used frequently by compilers in the form of the unconditional jump
+instruction.
+
+The goto statement comes in handy when a function exits from multiple
+locations and some common work such as cleanup has to be done. If there
+is no cleanup needed then just return directly.
+
+Choose label names which say what the goto does or why the goto exists.
+An example of a good name could be ``out_free_buffer:`` if the goto frees
+``buffer``. Avoid using GW-BASIC names like ``err1:`` and ``err2:``, as
+you would have to renumber them if you ever add or remove exit paths,
+and they make correctness difficult to verify anyway.
+
+The rationale for using gotos is:
+
+- unconditional statements are easier to understand and follow
+- nesting is reduced
+- errors by not updating individual exit points when making
+  modifications are prevented
+- saves the compiler work to optimize redundant code away ;)
+
+.. code-block:: c
+
+	int fun(int a)
+	{
+		int result = 0;
+		char *buffer;
+
+		buffer = kmalloc(SIZE, GFP_KERNEL);
+		if (!buffer)
+			return -ENOMEM;
+
+		if (condition1) {
+			while (loop1) {
+				...
+			}
+			result = 1;
+			goto out_free_buffer;
+		}
+		...
+	out_free_buffer:
+		kfree(buffer);
+		return result;
+	}
+
+A common type of bug to be aware of is ``one err bugs`` which look like
+this:
+
+.. code-block:: c
+
+	err:
+		kfree(foo->bar);
+		kfree(foo);
+		return ret;
+
+The bug in this code is that on some exit paths ``foo`` is NULL. Normally
+the fix for this is to split it up into two error labels
+``err_free_bar:`` and ``err_free_foo:``:
+
+.. 
code-block:: c
+
+	 err_free_bar:
+		kfree(foo->bar);
+	 err_free_foo:
+		kfree(foo);
+		return ret;
+
+Ideally you should simulate errors to test all exit paths.
+
+
+8) Commenting
+------------------------------
+
+Comments are good, but there is also a danger of over-commenting. NEVER
+try to explain HOW your code works in a comment: it's much better to
+write the code so that the working is obvious, and it's a waste of time
+to explain badly written code.
+
+Generally, you want your comments to tell WHAT your code does, not HOW.
+Also, try to avoid putting comments inside a function body: if the
+function is so complex that you need to separately comment parts of it,
+you should probably go back to chapter 6 for a while. You can make small
+comments to note or warn about something particularly clever (or ugly),
+but try to avoid excess. Instead, put the comments at the head of the
+function, telling people what it does, and possibly WHY it does it.
+
+When commenting the kernel API functions, please use the kernel-doc
+format. See Documentation/doc-guide/ and scripts/kernel-doc for details.
+
+The preferred style for long (multi-line) comments is:
+
+.. code-block:: c
+
+	/*
+	 * This is the preferred style for multi-line
+	 * comments in the Linux kernel source code.
+	 * Please use it consistently.
+	 *
+	 * Description:  A column of asterisks on the left side,
+	 * with beginning and ending almost-blank lines.
+	 */
+
+For files in net/ and drivers/net/ the preferred style for long
+(multi-line) comments is a little different.
+
+.. code-block:: c
+
+	/* The preferred comment style for files in net/ and drivers/net
+	 * looks like this.
+	 *
+	 * It is nearly the same as the generally preferred comment style,
+	 * but there is no initial almost-blank line.
+	 */
+
+It's also important to comment data, whether they are basic types or
+derived types. To this end, use just one data declaration per line (no
+commas for multiple data declarations). This leaves you room for a small
+comment on each item, explaining its use.
+
+
+9) You've made a mess of it
+------------------------------
+
+That's OK, we all do. You've probably been told by your long-time Unix
+user helper that ``GNU emacs`` automatically formats the C sources for
+you, and you've noticed that yes, it does do that, but the defaults it
+uses are less than desirable (in fact, they are worse than random
+typing - an infinite number of monkeys typing into GNU emacs would never
+make a good program). (Translator's note: see the Infinite Monkey
+Theorem.)
+
+So, you can either get rid of GNU emacs, or change it to use saner
+values. To do the latter, you can stick the following in your .emacs
+file:
+
+.. code-block:: none
+
+	(defun c-lineup-arglist-tabs-only (ignored)
+	  "Line up argument lists by tabs, not spaces"
+	  (let* ((anchor (c-langelem-pos c-syntactic-element))
+	         (column (c-langelem-2nd-pos c-syntactic-element))
+	         (offset (- (1+ column) anchor))
+	         (steps (floor offset c-basic-offset)))
+	    (* (max steps 1)
+	       c-basic-offset)))
+
+	(add-hook 'c-mode-common-hook
+	          (lambda ()
+	            ;; Add kernel style
+	            (c-add-style
+	             "linux-tabs-only"
+	             '("linux" (c-offsets-alist
+	                        (arglist-cont-nonempty
+	                         c-lineup-gcc-asm-reg
+	                         c-lineup-arglist-tabs-only))))))
+
+	(add-hook 'c-mode-hook
+	          (lambda ()
+	            (let ((filename (buffer-file-name)))
+	              ;; Enable kernel mode for the appropriate files
+	              (when (and filename
+	                         (string-match (expand-file-name "~/src/linux-trees")
+	                                       filename))
+	                (setq indent-tabs-mode t)
+	                (setq show-trailing-whitespace t)
+	                (c-set-style "linux-tabs-only")))))
+
+This will make emacs go better with the kernel coding style for C files
+below ``~/src/linux-trees``.
+
+But even if you fail in getting emacs to do sane formatting, not
+everything is lost: use ``indent``.
+
+Now, again, GNU indent has the same brain-dead settings that GNU emacs
+has, which is why you need to give it a few command line options.
+However, that's not too bad, because even the makers of GNU indent
recognize the authority of K&R
+(the GNU people aren't evil, they are just severely misguided in this
+matter), so you just give indent the options ``-kr -i8`` (stands for
+``K&R, 8 character indents``), or use ``scripts/Lindent``, which indents
+in the latest style.
+
+``indent`` has a lot of options, and especially when it comes to comment
+re-formatting you may want to take a look at the man page. But remember:
+``indent`` is not a fix for bad programming.
+
+
+10) Kconfig configuration files
+----------------------------------------
+
+For all of the Kconfig* configuration files throughout the source tree,
+the indentation is somewhat different. Lines under a ``config``
+definition are indented with one tab, while help text is indented an
+additional two spaces. Example::
+
+  config AUDIT
+	bool "Auditing support"
+	depends on NET
+	help
+	  Enable auditing infrastructure that can be used with another
+	  kernel subsystem, such as SELinux (which requires this for
+	  logging of avc messages output).  Does not do system-call
+	  auditing without CONFIG_AUDITSYSCALL.
+
+Seriously dangerous features (such as write support for certain
+filesystems) should advertise this prominently in their prompt string::
+
+  config ADFS_FS_RW
+	bool "ADFS write support (DANGEROUS)"
+	depends on ADFS_FS
+	...
+
+For full documentation on the configuration files, see the file
+Documentation/kbuild/kconfig-language.txt.
+
+
+11) Data structures
+------------------------------
+
+Data structures that have visibility outside the single-threaded
+environment they are created and destroyed in should always have
+reference counts. In the kernel, garbage collection doesn't exist (and
+outside the kernel garbage collection is slow and inefficient), which
+means that you absolutely have to reference count all your uses.
+
+Reference counting means that you can avoid locking, and allows multiple
+users to have access to the data structure in parallel, without having
+to worry about the structure suddenly going away from under them just
+because they slept or did something else for a while.
+
+Note that locking is **not** a replacement for reference counting.
+Locking is used to keep data structures coherent, while reference
+counting is a memory management technique. Usually both are needed, and
+they are not to be confused with each other.
+
+Many data structures can indeed have two levels of reference counting,
+when there are users of different ``classes``. The subclass count counts
+the number of subclass users, and decrements the global count just once
+when the subclass count goes to zero.
+
+Examples of this kind of ``multi-level-reference-counting`` can be found
+in memory management (``struct mm_struct``: mm_users and mm_count), and
+in filesystem code (``struct super_block``: s_count and s_active).
+
+Remember: if another thread can find your data structure, and you don't
+have a reference count on it, you almost certainly have a bug.
+
+
+12) Macros, Enums and RTL
+------------------------------
+
+Names of macros defining constants and labels in enums are capitalized.
+
+.. code-block:: c
+
+	#define CONSTANT 0x12345
+
+Enums are preferred when defining several related constants.
+
+CAPITALIZED macro names are appreciated but macros resembling functions
+may be named in lower case.
+
+Generally, inline functions are preferable to macros resembling
+functions.
+
+Macros with multiple statements should be enclosed in a do - while block:
+
+.. code-block:: c
+
+	#define macrofun(a, b, c)			\
+		do {					\
+			if (a == 5)			\
+				do_this(b, c);		\
+		} while (0)
+
+Things to avoid when using macros:
+
+1) macros that affect control flow:
+
+.. code-block:: c
+
+	#define FOO(x)					\
+		do {					\
+			if (blah(x) < 0)		\
+				return -EBUGGERED;	\
+		} while (0)
+
+is a **very** bad idea. It looks like a function call but exits the
+``calling`` function; don't break the internal parsers of those who will
+read the code.
+
+2) macros that depend on having a local variable with a magic name:
+
+.. 
code-block:: c
+
+	#define FOO(val) bar(index, val)
+
+might look like a good thing, but it's confusing as hell when one reads
+the code and it's prone to breakage from seemingly innocent changes.
+
+3) macros with arguments that are used as l-values: FOO(x) = y; will
+   bite you if somebody e.g. turns FOO into an inline function.
+
+4) forgetting about precedence: macros defining constants using
+   expressions must enclose the expression in parentheses. Beware of
+   similar issues with macros using parameters.
+
+.. code-block:: c
+
+	#define CONSTANT 0x4000
+	#define CONSTEXP (CONSTANT | 3)
+
+5) namespace collisions when defining local variables in macros
+   resembling functions:
+
+.. code-block:: c
+
+	#define FOO(x)				\
+	({					\
+		typeof(x) ret;			\
+		ret = calc_ret(x);		\
+		(ret);				\
+	})
+
+ret is a common name for a local variable - __foo_ret is less likely to
+collide with an existing variable.
+
+The cpp manual deals with macros exhaustively. The gcc internals manual
+also covers RTL, which is used frequently with assembly language in the
+kernel.
+
+
+13) Printing kernel messages
+------------------------------
+
+Kernel developers like to be seen as literate. Do mind the spelling of
+kernel messages to make a good impression. Do not use crippled words
+like ``dont``; use ``do not`` or ``don't`` instead. Make the messages
+concise, clear, and unambiguous.
+
+Kernel messages do not have to be terminated with a period.
+
+Printing numbers in parentheses (%d) adds no value and should be avoided.
+
+There are a number of driver model diagnostic macros in <linux/device.h>
+which you should use to make sure messages are matched to the right
+device and driver, and are tagged with the right level: dev_err(),
+dev_warn(), dev_info(), and so forth. For messages that aren't associated
+with a particular device, <linux/printk.h> defines pr_notice(),
+pr_info(), pr_warn(), pr_err(), etc.
+
+Coming up with good debugging messages can be quite a challenge; and once
+you have them, they can be a huge help for remote troubleshooting.
+However, debug message printing is handled differently than printing
+other non-debug messages. While the other pr_XXX() functions print
+unconditionally, pr_debug() does not; it is compiled out by default,
+unless either DEBUG is defined or CONFIG_DYNAMIC_DEBUG is set. That is
+true for dev_dbg() also, and a related convention uses VERBOSE_DEBUG to
+add dev_vdbg() messages to the ones already enabled by DEBUG.
+
+Many subsystems have Kconfig debug options to turn on -DDEBUG in the
+corresponding Makefile; in other cases specific files #define DEBUG. And
+when a debug message should be unconditionally printed, such as if it is
+already inside a debug-related #ifdef section, printk(KERN_DEBUG
...) can be used.
+
+
+14) Allocating memory
+------------------------------
+
+The kernel provides the following general purpose memory allocators:
+kmalloc(), kzalloc(), kmalloc_array(), kcalloc(), vmalloc(), and
+vzalloc(). Please refer to the API documentation for further information
+about them.
+
+The preferred form for passing a size of a struct is the following:
+
+.. code-block:: c
+
+	p = kmalloc(sizeof(*p), ...);
+
+The alternative form where the struct name is spelled out hurts
+readability and introduces an opportunity for a bug when the pointer
+variable type is changed but the corresponding sizeof that is passed to
+a memory allocator is not.
+
+Casting the return value which is a void pointer is redundant. The
+conversion from void pointer to any other pointer type is guaranteed by
+the C programming language.
+
+The preferred form for allocating an array is the following:
+
+.. code-block:: c
+
+	p = kmalloc_array(n, sizeof(...), ...);
+
+The preferred form for allocating a zeroed array is the following:
+
+.. code-block:: c
+
+	p = kcalloc(n, sizeof(...), ...);
+
+Both forms check for overflow on the allocation size n * sizeof(...),
+and return NULL if that occurred.
+
+
+15) The inline disease
+------------------------------
+
+There appears to be a common misperception that gcc has a magic "make me
+faster" speedup option called ``inline``. While the use of inlines can be
+appropriate (for example as a means of replacing macros, see Chapter 12),
+it very often is not. Abundant use of the inline keyword leads to a much
+bigger kernel, which in turn slows the system as a whole down, due to a
+bigger icache footprint for the CPU and simply because there is less
+memory available for the pagecache. Just think about it; a pagecache miss
+causes a disk seek, which easily takes 5 milliseconds. There are a LOT of
+cpu cycles that can go into these 5 milliseconds.
+
+A reasonable rule of thumb is to not put inline at functions that have
+more than 3 lines of code in them. An exception to this rule are the
+cases where a parameter is known to be a compile-time constant, and as a
+result of this constantness you know the compiler will be able to
+optimize most of your function away at compile time. For a good example
+of this latter case, see the kmalloc() inline function.
+
+Often people argue that adding inline to functions that are static and
+used only once is always a win since there is no space tradeoff. While
+this is technically correct, gcc is capable of inlining these
+automatically without help, and the maintenance issue of removing the
+inline when a second user appears outweighs the potential value of the
+hint that tells gcc to do something it would have done anyway.
+
+
+16) Function return values and names
+----------------------------------------
+
+Functions can return values of many different kinds, and one of the most
+common is a value indicating whether the function succeeded or failed.
+Such
+a value can be represented as an error-code integer (-Exxx = failure,
+0 = success) or a ``succeeded`` boolean (0 = failure, non-zero = success).
+
+Mixing up these two sorts of representations is a fertile source of
+difficult-to-find bugs. If the C language included a strong distinction
+between integers and booleans then the compiler would find these mistakes
+for us... but it doesn't. To help prevent such bugs, always follow this
+convention::
+
+	If the name of a function is an action or an imperative command,
+	the function should return an error-code integer.  If the name
+	is a predicate, the function should return a "succeeded" boolean.
+
+For example, ``add work`` is a command, and the add_work() function
+returns 0 for success or -EBUSY for failure. In the same way,
+``PCI device present`` is a predicate, and the pci_dev_present() function
+returns 1 if it succeeds in finding a matching device or 0 if it doesn't.
+
+All EXPORTed functions must respect this convention, and so should all
+public functions. Private (static) functions need not, but it is
+recommended that they do.
+
+Functions whose return value is the actual result of a computation,
+rather than an indication of whether the computation succeeded, are not
+subject to this rule. Generally they indicate failure by returning some
+out-of-range result. Typical examples would be functions that return
+pointers; they use NULL or the ERR_PTR mechanism to report failure.
+
+
+17) Don't re-invent the kernel macros
+----------------------------------------
+
+The header file include/linux/kernel.h contains a number of macros that
+you should use, rather than explicitly coding some variant of them
+yourself. For example, if you need to calculate the length of an array,
+take advantage of the macro
+
+.. code-block:: c
+
+	#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+Similarly, if you need to calculate the size of some structure member,
+use
+
+.. code-block:: c
+
+	#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))
+
+There are also min() and max() macros that do strict type checking if
+you need them. Feel free to peruse that header file to see what else is
+already defined that you shouldn't reproduce in your code.
+
+
+18) Editor modelines and other cruft
+--------------------------------------------------
+
+Some editors can interpret configuration information embedded in source
+files, indicated with special markers. For example, emacs interprets
+lines marked like this:
+
+.. code-block:: c
+
+	-*- mode: c -*-
+
+Or like this:
+
+.. code-block:: c
+
+	/*
+	Local Variables:
+	compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
+	End:
+	*/
+
+Vim interprets markers that look like this:
+
+.. 
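code-block:: c
+
+	/* vim:set sw=8 noet */
+
+Do not include any of these in source files. People have their own
+personal editor configurations, and your source files should not
+override them. This includes markers for indentation and mode
+configuration. People may use their own custom mode, or may have some
+other magic method for making indentation work correctly.
+
+
+19) Inline assembly
+------------------------------
+
+In architecture-specific code, you may need to use inline assembly to
+interface with CPU or platform functionality. Don't hesitate to do so
+when necessary. However, don't use inline assembly gratuitously when C
+can do the job. You can and should poke hardware from C when possible.
+
+Consider writing simple helper functions that wrap common bits of inline
+assembly, rather than repeatedly writing them with slight variations.
+Remember that inline assembly can use C parameters.
+
+Large, non-trivial assembly functions should go in .S files, with
+corresponding C prototypes defined in C header files. The C prototypes
+for assembly functions should use ``asmlinkage``.
+
+You may need to mark your asm statement as volatile, to prevent GCC from
+removing it if GCC doesn't notice any side effects. You don't always need
+to do so, though, and doing so unnecessarily can limit optimization.
+
+When writing a single inline assembly statement containing multiple
+instructions, put each instruction on a separate line in a separate
+quoted string, and end each string except the last with \n\t to properly
+indent the next instruction in the assembly output:
+
+.. 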
code-block:: c
+
+	asm ("magic %reg1, #42\n\t"
+	     "more_magic %reg2, %reg3"
+	     : /* outputs */ : /* inputs */ : /* clobbers */);
+
+
+20) Conditional Compilation
+------------------------------
+
+Wherever possible, don't use preprocessor conditionals (#if, #ifdef) in
+.c files; doing so makes code harder to read and logic harder to follow.
+Instead, use such conditionals in a header file defining functions for
+use in those .c files, providing no-op stub versions in the #else case,
+and then call those functions unconditionally from .c files. The compiler
+will avoid generating any code for the stub calls, producing identical
+results, but the logic will remain easy to follow.
+
+Prefer to compile out entire functions, rather than portions of functions
+or portions of expressions. Rather than putting an ifdef in an
+expression, factor out part or all of the expression into a separate
+helper function and apply the conditional to that function.
+
+If you have a function or variable which may potentially go unused in a
+particular configuration, and the compiler would warn about its
+definition going unused, mark the definition as __maybe_unused rather
+than wrapping it in a preprocessor conditional. (However, if a function
+or variable always goes unused, delete it.)
+
+Within code, where possible, use the IS_ENABLED macro to convert a
+Kconfig symbol into a C boolean expression, and use it in a normal C
+conditional:
+
+.. code-block:: c
+
+	if (IS_ENABLED(CONFIG_SOMETHING)) {
+		...
+	}
+
+The compiler will constant-fold the conditional away, and include or
+exclude the block of code just as with an #ifdef, so this will not add
+any runtime overhead. However, this approach still allows the C compiler
+to see the code inside the block, and check it for correctness (syntax,
+types, symbol references, etc). Thus, you still have to use an #ifdef if
+the code inside the block references symbols that will not exist if the
+condition is not met.
+
+At the end of any non-trivial #if or #ifdef block (more than a few
+lines), place a comment after the #endif on the same line, noting the
+conditional expression used. For instance:
+
+.. code-block:: c
+
+	#ifdef CONFIG_SOMETHING
+	...
+	#endif /* CONFIG_SOMETHING */
+
+
+Appendix I) References
+------------------------------
+
+The C Programming Language, Second Edition
+by Brian W. Kernighan and Dennis M. Ritchie.
+Prentice Hall, Inc., 1988.
+ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
+
+The Practice of Programming
+by Brian W. Kernighan and Rob Pike.
+Addison-Wesley, Inc., 1999.
+ISBN 0-201-61586-X.
+
+GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
+gcc internals and indent, all available from http://www.gnu.org/manual/
+
+WG14 is the international standardization working group for the
+programming language C, URL: http://www.open-std.org/JTC1/SC22/WG14/
+
+Kernel process/coding-style.rst, by greg@kroah.com at OLS 2002:
+http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/
diff --git a/Documentation/translations/zh_CN/index.rst b/Documentation/translations/zh_CN/index.rst
new file mode 100644
index 000000000000..75956d669962
--- /dev/null
+++ b/Documentation/translations/zh_CN/index.rst
@@ -0,0 +1,12 @@
+.. raw:: latex
+
+	\renewcommand\thesection*
+	\renewcommand\thesubsection*
+
+Chinese translations
+====================
+
+.. toctree::
+   :maxdepth: 1
+
+   coding-style
diff --git a/Documentation/usb/power-management.txt b/Documentation/usb/power-management.txt
index 0a94ffe17ab6..00e706997130 100644
--- a/Documentation/usb/power-management.txt
+++ b/Documentation/usb/power-management.txt
@@ -543,7 +543,7 @@ relevant attribute files are usb2_hardware_lpm and usb3_hardware_lpm.
 		When a USB 3.0 lpm-capable device is plugged in to a
 		xHCI host which supports link PM, it will check if U1
 		and U2 exit latencies have been set in the BOS
-		descriptor; if the check is is passed and the host
+		descriptor; if the check is passed and the host
 		supports USB3 hardware LPM, USB3 hardware LPM will be
 		enabled for the device and these files will be created.
 		The files hold a string value (enable or disable)
diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt
index c4171e4519c2..f2e739545e74 100644
--- a/Documentation/vm/transhuge.txt
+++ b/Documentation/vm/transhuge.txt
@@ -296,7 +296,7 @@ thp_split_page is incremented every time a huge page is split into base
 	reason is that a huge page is old and is being reclaimed.
 	This action implies splitting all PMD the page mapped with.
 
-thp_split_page_failed is is incremented if kernel fails to split huge
+thp_split_page_failed is incremented if kernel fails to split huge
 	page. This can happen if the page was pinned by somebody.
 
 thp_deferred_split_page is incremented when a huge page is put onto split