diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2017-07-05 12:31:59 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2017-07-05 12:31:59 -0700 |
commit | 5518b69b76680a4f2df96b1deca260059db0c2de (patch) | |
tree | f33cd1519c8efb4590500f2f9617400be233238c /Documentation | |
parent | 8ad06e56dcbc1984ef0ff8f6e3c19982c5809f73 (diff) | |
parent | 0e72582270c07850b92cac351c8b97d4f9c123b9 (diff) | |
download | linux-5518b69b76680a4f2df96b1deca260059db0c2de.tar.bz2 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
"Reasonably busy this cycle, but perhaps not as busy as in the 4.12
merge window:
1) Several optimizations for UDP processing under high load from
Paolo Abeni.
2) Support pacing internally in TCP when using the sch_fq packet
scheduler for this is not practical. From Eric Dumazet.
3) Support mutliple filter chains per qdisc, from Jiri Pirko.
4) Move to 1ms TCP timestamp clock, from Eric Dumazet.
5) Add batch dequeueing to vhost_net, from Jason Wang.
6) Flesh out more completely SCTP checksum offload support, from
Davide Caratti.
7) More plumbing of extended netlink ACKs, from David Ahern, Pablo
Neira Ayuso, and Matthias Schiffer.
8) Add devlink support to nfp driver, from Simon Horman.
9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa
Prabhu.
10) Add stack depth tracking to BPF verifier and use this information
in the various eBPF JITs. From Alexei Starovoitov.
11) Support XDP on qed device VFs, from Yuval Mintz.
12) Introduce BPF PROG ID for better introspection of installed BPF
programs. From Martin KaFai Lau.
13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann.
14) For loads, allow narrower accesses in bpf verifier checking, from
Yonghong Song.
15) Support MIPS in the BPF selftests and samples infrastructure, the
MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David
Daney.
16) Support kernel based TLS, from Dave Watson and others.
17) Remove completely DST garbage collection, from Wei Wang.
18) Allow installing TCP MD5 rules using prefixes, from Ivan
Delalande.
19) Add XDP support to Intel i40e driver, from Björn Töpel
20) Add support for TC flower offload in nfp driver, from Simon
Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub
Kicinski, and Bert van Leeuwen.
21) IPSEC offloading support in mlx5, from Ilan Tayari.
22) Add HW PTP support to macb driver, from Rafal Ozieblo.
23) Networking refcount_t conversions, From Elena Reshetova.
24) Add sock_ops support to BPF, from Lawrence Brako. This is useful
for tuning the TCP sockopt settings of a group of applications,
currently via CGROUPs"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits)
net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap
dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap
cxgb4: Support for get_ts_info ethtool method
cxgb4: Add PTP Hardware Clock (PHC) support
cxgb4: time stamping interface for PTP
nfp: default to chained metadata prepend format
nfp: remove legacy MAC address lookup
nfp: improve order of interfaces in breakout mode
net: macb: remove extraneous return when MACB_EXT_DESC is defined
bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case
bpf: fix return in load_bpf_file
mpls: fix rtm policy in mpls_getroute
net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t
net, ax25: convert ax25_route.refcount from atomic_t to refcount_t
net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t
net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t
...
Diffstat (limited to 'Documentation')
26 files changed, 673 insertions, 233 deletions
diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net index 668604fc8e06..6856da99b6f7 100644 --- a/Documentation/ABI/testing/sysfs-class-net +++ b/Documentation/ABI/testing/sysfs-class-net @@ -251,3 +251,11 @@ Contact: netdev@vger.kernel.org Description: Indicates the unique physical switch identifier of a switch this port belongs to, as a string. + +What: /sys/class/net/<iface>/phydev +Date: May 2017 +KernelVersion: 4.13 +Contact: netdev@vger.kernel.org +Description: + Symbolic link to the PHY device this network device is attached + to. diff --git a/Documentation/ABI/testing/sysfs-class-net-phydev b/Documentation/ABI/testing/sysfs-class-net-phydev new file mode 100644 index 000000000000..6ebabfb27912 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-class-net-phydev @@ -0,0 +1,36 @@ +What: /sys/class/mdio_bus/<bus>/<device>/attached_dev +Date: May 2017 +KernelVersion: 4.13 +Contact: netdev@vger.kernel.org +Description: + Symbolic link to the network device this PHY device is + attached to. + +What: /sys/class/mdio_bus/<bus>/<device>/phy_has_fixups +Date: February 2014 +KernelVersion: 3.15 +Contact: netdev@vger.kernel.org +Description: + Boolean value indicating whether the PHY device has + any fixups registered against it (phy_register_fixup) + +What: /sys/class/mdio_bus/<bus>/<device>/phy_id +Date: November 2012 +KernelVersion: 3.8 +Contact: netdev@vger.kernel.org +Description: + 32-bit hexadecimal value corresponding to the PHY device's OUI, + model and revision number. + +What: /sys/class/mdio_bus/<bus>/<device>/phy_interface +Date: February 2014 +KernelVersion: 3.15 +Contact: netdev@vger.kernel.org +Description: + String value indicating the PHY interface, possible + values are:. + <empty> (not available), mii, gmii, sgmii, tbi, rev-mii, + rmii, rgmii, rgmii-id, rgmii-rxid, rgmii-txid, rtbi, smii + xgmii, moca, qsgmii, trgmii, 1000base-x, 2500base-x, rxaui, + xaui, 10gbase-kr, unknown + diff --git a/Documentation/devicetree/bindings/misc/allwinner,syscon.txt b/Documentation/devicetree/bindings/misc/allwinner,syscon.txt new file mode 100644 index 000000000000..31494a24fe69 --- /dev/null +++ b/Documentation/devicetree/bindings/misc/allwinner,syscon.txt @@ -0,0 +1,20 @@ +* Allwinner sun8i system controller + +This file describes the bindings for the system controller present in +Allwinner SoC H3, A83T and A64. +The principal function of this syscon is to control EMAC PHY choice and +config. + +Required properties for the system controller: +- reg: address and length of the register for the device. +- compatible: should be "syscon" and one of the following string: + "allwinner,sun8i-h3-system-controller" + "allwinner,sun8i-v3s-system-controller" + "allwinner,sun50i-a64-system-controller" + "allwinner,sun8i-a83t-system-controller" + +Example: +syscon: syscon@1c00000 { + compatible = "allwinner,sun8i-h3-system-controller", "syscon"; + reg = <0x01c00000 0x1000>; +}; diff --git a/Documentation/devicetree/bindings/net/cortina.txt b/Documentation/devicetree/bindings/net/cortina.txt new file mode 100644 index 000000000000..40d0bd984113 --- /dev/null +++ b/Documentation/devicetree/bindings/net/cortina.txt @@ -0,0 +1,21 @@ +Cortina Phy Driver Device Tree Bindings +--------------------------------------- + +CORTINA is a registered trademark of Cortina Systems, Inc. + +The driver supports the Cortina Electronic Dispersion Compensation (EDC) +devices, equipped with clock and data recovery (CDR) circuits. These +devices make use of registers that are not compatible with Clause 45 or +Clause 22, therefore they need to be described using the +"ethernet-phy-id" compatible. + +Since the driver only implements polling mode support, interrupts info +can be skipped. + +Example (CS4340 phy): + mdio { + cs4340_phy@10 { + compatible = "ethernet-phy-id13e5.1002"; + reg = <0x10>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/dsa/b53.txt b/Documentation/devicetree/bindings/net/dsa/b53.txt index 8ec2ca21adeb..8acf51a4dfa8 100644 --- a/Documentation/devicetree/bindings/net/dsa/b53.txt +++ b/Documentation/devicetree/bindings/net/dsa/b53.txt @@ -13,6 +13,9 @@ Required properties: "brcm,bcm5397" "brcm,bcm5398" + For the BCM11360 SoC, must be: + "brcm,bcm11360-srab" and the mandatory "brcm,cygnus-srab" string + For the BCM5310x SoCs with an integrated switch, must be one of: "brcm,bcm53010-srab" "brcm,bcm53011-srab" diff --git a/Documentation/devicetree/bindings/net/dsa/ksz.txt b/Documentation/devicetree/bindings/net/dsa/ksz.txt new file mode 100644 index 000000000000..0ab8b39d0b30 --- /dev/null +++ b/Documentation/devicetree/bindings/net/dsa/ksz.txt @@ -0,0 +1,72 @@ +Microchip KSZ Series Ethernet switches +================================== + +Required properties: + +- compatible: For external switch chips, compatible string must be exactly one + of: "microchip,ksz9477" + +See Documentation/devicetree/bindings/dsa/dsa.txt for a list of additional +required and optional properties. + +Examples: + +Ethernet switch connected via SPI to the host, CPU port wired to eth0: + + eth0: ethernet@10001000 { + fixed-link { + speed = <1000>; + full-duplex; + }; + }; + + spi1: spi@f8008000 { + pinctrl-0 = <&pinctrl_spi_ksz>; + cs-gpios = <&pioC 25 0>; + id = <1>; + status = "okay"; + + ksz9477: ksz9477@0 { + compatible = "microchip,ksz9477"; + reg = <0>; + + spi-max-frequency = <44000000>; + spi-cpha; + spi-cpol; + + status = "okay"; + ports { + #address-cells = <1>; + #size-cells = <0>; + port@0 { + reg = <0>; + label = "lan1"; + }; + port@1 { + reg = <1>; + label = "lan2"; + }; + port@2 { + reg = <2>; + label = "lan3"; + }; + port@3 { + reg = <3>; + label = "lan4"; + }; + port@4 { + reg = <4>; + label = "lan5"; + }; + port@5 { + reg = <5>; + label = "cpu"; + ethernet = <ð0>; + fixed-link { + speed = <1000>; + full-duplex; + }; + }; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/net/dwmac-sun8i.txt b/Documentation/devicetree/bindings/net/dwmac-sun8i.txt new file mode 100644 index 000000000000..725f3b187886 --- /dev/null +++ b/Documentation/devicetree/bindings/net/dwmac-sun8i.txt @@ -0,0 +1,84 @@ +* Allwinner sun8i GMAC ethernet controller + +This device is a platform glue layer for stmmac. +Please see stmmac.txt for the other unchanged properties. + +Required properties: +- compatible: should be one of the following string: + "allwinner,sun8i-a83t-emac" + "allwinner,sun8i-h3-emac" + "allwinner,sun8i-v3s-emac" + "allwinner,sun50i-a64-emac" +- reg: address and length of the register for the device. +- interrupts: interrupt for the device +- interrupt-names: should be "macirq" +- clocks: A phandle to the reference clock for this device +- clock-names: should be "stmmaceth" +- resets: A phandle to the reset control for this device +- reset-names: should be "stmmaceth" +- phy-mode: See ethernet.txt +- phy-handle: See ethernet.txt +- #address-cells: shall be 1 +- #size-cells: shall be 0 +- syscon: A phandle to the syscon of the SoC with one of the following + compatible string: + - allwinner,sun8i-h3-system-controller + - allwinner,sun8i-v3s-system-controller + - allwinner,sun50i-a64-system-controller + - allwinner,sun8i-a83t-system-controller + +Optional properties: +- allwinner,tx-delay-ps: TX clock delay chain value in ps. Range value is 0-700. Default is 0) +- allwinner,rx-delay-ps: RX clock delay chain value in ps. Range value is 0-3100. Default is 0) +Both delay properties need to be a multiple of 100. They control the delay for +external PHY. + +Optional properties for the following compatibles: + - "allwinner,sun8i-h3-emac", + - "allwinner,sun8i-v3s-emac": +- allwinner,leds-active-low: EPHY LEDs are active low + +Required child node of emac: +- mdio bus node: should be named mdio + +Required properties of the mdio node: +- #address-cells: shall be 1 +- #size-cells: shall be 0 + +The device node referenced by "phy" or "phy-handle" should be a child node +of the mdio node. See phy.txt for the generic PHY bindings. + +Required properties of the phy node with the following compatibles: + - "allwinner,sun8i-h3-emac", + - "allwinner,sun8i-v3s-emac": +- clocks: a phandle to the reference clock for the EPHY +- resets: a phandle to the reset control for the EPHY + +Example: + +emac: ethernet@1c0b000 { + compatible = "allwinner,sun8i-h3-emac"; + syscon = <&syscon>; + reg = <0x01c0b000 0x104>; + interrupts = <GIC_SPI 82 IRQ_TYPE_LEVEL_HIGH>; + interrupt-names = "macirq"; + resets = <&ccu RST_BUS_EMAC>; + reset-names = "stmmaceth"; + clocks = <&ccu CLK_BUS_EMAC>; + clock-names = "stmmaceth"; + #address-cells = <1>; + #size-cells = <0>; + + phy-handle = <&int_mii_phy>; + phy-mode = "mii"; + allwinner,leds-active-low; + mdio: mdio { + #address-cells = <1>; + #size-cells = <0>; + int_mii_phy: ethernet-phy@1 { + reg = <1>; + clocks = <&ccu CLK_BUS_EPHY>; + resets = <&ccu RST_BUS_EPHY>; + }; + }; +}; diff --git a/Documentation/devicetree/bindings/net/ethernet.txt b/Documentation/devicetree/bindings/net/ethernet.txt index 3a6916909d90..edd7fd2bbbf9 100644 --- a/Documentation/devicetree/bindings/net/ethernet.txt +++ b/Documentation/devicetree/bindings/net/ethernet.txt @@ -11,6 +11,7 @@ The following properties are common to the Ethernet controllers: the maximum frame size (there's contradiction in ePAPR). - phy-mode: string, operation mode of the PHY interface. This is now a de-facto standard property; supported values are: + * "internal" * "mii" * "gmii" * "sgmii" @@ -32,6 +33,8 @@ The following properties are common to the Ethernet controllers: * "2000base-x", * "2500base-x", * "rxaui" + * "xaui" + * "10gbase-kr" (10GBASE-KR, XFI, SFI) - phy-connection-type: the same as "phy-mode" property but described in ePAPR; - phy-handle: phandle, specifies a reference to a node representing a PHY device; this property is described in ePAPR and so preferred; diff --git a/Documentation/devicetree/bindings/net/macb.txt b/Documentation/devicetree/bindings/net/macb.txt index 1506e948610c..27966ae741e0 100644 --- a/Documentation/devicetree/bindings/net/macb.txt +++ b/Documentation/devicetree/bindings/net/macb.txt @@ -22,6 +22,7 @@ Required properties: Required elements: 'pclk', 'hclk' Optional elements: 'tx_clk' Optional elements: 'rx_clk' applies to cdns,zynqmp-gem + Optional elements: 'tsu_clk' - clocks: Phandles to input clocks. Optional properties for PHY child node: diff --git a/Documentation/devicetree/bindings/net/marvell-orion-mdio.txt b/Documentation/devicetree/bindings/net/marvell-orion-mdio.txt index ccdabdcc8618..42cd81090a2c 100644 --- a/Documentation/devicetree/bindings/net/marvell-orion-mdio.txt +++ b/Documentation/devicetree/bindings/net/marvell-orion-mdio.txt @@ -1,12 +1,14 @@ * Marvell MDIO Ethernet Controller interface The Ethernet controllers of the Marvel Kirkwood, Dove, Orion5x, -MV78xx0, Armada 370 and Armada XP have an identical unit that provides -an interface with the MDIO bus. This driver handles this MDIO -interface. +MV78xx0, Armada 370, Armada XP, Armada 7k and Armada 8k have an +identical unit that provides an interface with the MDIO bus. +Additionally, Armada 7k and Armada 8k has a second unit which +provides an interface with the xMDIO bus. This driver handles +these interfaces. Required properties: -- compatible: "marvell,orion-mdio" +- compatible: "marvell,orion-mdio" or "marvell,xmdio" - reg: address and length of the MDIO registers. When an interrupt is not present, the length is the size of the SMI register (4 bytes) otherwise it must be 0x84 bytes to cover the interrupt control diff --git a/Documentation/devicetree/bindings/net/nfc/trf7970a.txt b/Documentation/devicetree/bindings/net/nfc/trf7970a.txt index c627bbb3009e..60c833d62181 100644 --- a/Documentation/devicetree/bindings/net/nfc/trf7970a.txt +++ b/Documentation/devicetree/bindings/net/nfc/trf7970a.txt @@ -13,14 +13,10 @@ Optional SoC Specific Properties: - pinctrl-names: Contains only one value - "default". - pintctrl-0: Specifies the pin control groups used for this controller. - autosuspend-delay: Specify autosuspend delay in milliseconds. -- vin-voltage-override: Specify voltage of VIN pin in microvolts. - irq-status-read-quirk: Specify that the trf7970a being used has the "IRQ Status Read" erratum. - en2-rf-quirk: Specify that the trf7970a being used has the "EN2 RF" erratum. -- t5t-rmb-extra-byte-quirk: Specify that the trf7970a has the erratum - where an extra byte is returned by Read Multiple Block commands issued - to Type 5 tags. - vdd-io-supply: Regulator specifying voltage for vdd-io - clock-frequency: Set to specify that the input frequency to the trf7970a is 13560000Hz or 27120000Hz @@ -37,15 +33,13 @@ Example (for ARM-based BeagleBone with TRF7970A on SPI1): spi-max-frequency = <2000000>; interrupt-parent = <&gpio2>; interrupts = <14 0>; - ti,enable-gpios = <&gpio2 2 GPIO_ACTIVE_LOW>, - <&gpio2 5 GPIO_ACTIVE_LOW>; + ti,enable-gpios = <&gpio2 2 GPIO_ACTIVE_HIGH>, + <&gpio2 5 GPIO_ACTIVE_HIGH>; vin-supply = <&ldo3_reg>; - vin-voltage-override = <5000000>; vdd-io-supply = <&ldo2_reg>; autosuspend-delay = <30000>; irq-status-read-quirk; en2-rf-quirk; - t5t-rmb-extra-byte-quirk; clock-frequency = <27120000>; status = "okay"; }; diff --git a/Documentation/devicetree/bindings/net/qca,qca7000.txt b/Documentation/devicetree/bindings/net/qca,qca7000.txt new file mode 100644 index 000000000000..6d9efb2eb9a5 --- /dev/null +++ b/Documentation/devicetree/bindings/net/qca,qca7000.txt @@ -0,0 +1,88 @@ +* Qualcomm QCA7000 + +The QCA7000 is a serial-to-powerline bridge with a host interface which could +be configured either as SPI or UART slave. This configuration is done by +the QCA7000 firmware. + +(a) Ethernet over SPI + +In order to use the QCA7000 as SPI device it must be defined as a child of a +SPI master in the device tree. + +Required properties: +- compatible : Should be "qca,qca7000" +- reg : Should specify the SPI chip select +- interrupts : The first cell should specify the index of the source + interrupt and the second cell should specify the trigger + type as rising edge +- spi-cpha : Must be set +- spi-cpol : Must be set + +Optional properties: +- interrupt-parent : Specify the pHandle of the source interrupt +- spi-max-frequency : Maximum frequency of the SPI bus the chip can operate at. + Numbers smaller than 1000000 or greater than 16000000 + are invalid. Missing the property will set the SPI + frequency to 8000000 Hertz. +- local-mac-address : see ./ethernet.txt +- qca,legacy-mode : Set the SPI data transfer of the QCA7000 to legacy mode. + In this mode the SPI master must toggle the chip select + between each data word. In burst mode these gaps aren't + necessary, which is faster. This setting depends on how + the QCA7000 is setup via GPIO pin strapping. If the + property is missing the driver defaults to burst mode. + +SPI Example: + +/* Freescale i.MX28 SPI master*/ +ssp2: spi@80014000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,imx28-spi"; + pinctrl-names = "default"; + pinctrl-0 = <&spi2_pins_a>; + status = "okay"; + + qca7000: ethernet@0 { + compatible = "qca,qca7000"; + reg = <0x0>; + interrupt-parent = <&gpio3>; /* GPIO Bank 3 */ + interrupts = <25 0x1>; /* Index: 25, rising edge */ + spi-cpha; /* SPI mode: CPHA=1 */ + spi-cpol; /* SPI mode: CPOL=1 */ + spi-max-frequency = <8000000>; /* freq: 8 MHz */ + local-mac-address = [ A0 B0 C0 D0 E0 F0 ]; + }; +}; + +(b) Ethernet over UART + +In order to use the QCA7000 as UART slave it must be defined as a child of a +UART master in the device tree. It is possible to preconfigure the UART +settings of the QCA7000 firmware, but it's not possible to change them during +runtime. + +Required properties: +- compatible : Should be "qca,qca7000" + +Optional properties: +- local-mac-address : see ./ethernet.txt +- current-speed : current baud rate of QCA7000 which defaults to 115200 + if absent, see also ../serial/slave-device.txt + +UART Example: + +/* Freescale i.MX28 UART */ +auart0: serial@8006a000 { + compatible = "fsl,imx28-auart", "fsl,imx23-auart"; + reg = <0x8006a000 0x2000>; + pinctrl-names = "default"; + pinctrl-0 = <&auart0_2pins_a>; + status = "okay"; + + qca7000: ethernet { + compatible = "qca,qca7000"; + local-mac-address = [ A0 B0 C0 D0 E0 F0 ]; + current-speed = <38400>; + }; +}; diff --git a/Documentation/devicetree/bindings/net/qca-qca7000-spi.txt b/Documentation/devicetree/bindings/net/qca-qca7000-spi.txt deleted file mode 100644 index c74989c0d8ac..000000000000 --- a/Documentation/devicetree/bindings/net/qca-qca7000-spi.txt +++ /dev/null @@ -1,47 +0,0 @@ -* Qualcomm QCA7000 (Ethernet over SPI protocol) - -Note: The QCA7000 is useable as a SPI device. In this case it must be defined -as a child of a SPI master in the device tree. - -Required properties: -- compatible : Should be "qca,qca7000" -- reg : Should specify the SPI chip select -- interrupts : The first cell should specify the index of the source interrupt - and the second cell should specify the trigger type as rising edge -- spi-cpha : Must be set -- spi-cpol: Must be set - -Optional properties: -- interrupt-parent : Specify the pHandle of the source interrupt -- spi-max-frequency : Maximum frequency of the SPI bus the chip can operate at. - Numbers smaller than 1000000 or greater than 16000000 are invalid. Missing - the property will set the SPI frequency to 8000000 Hertz. -- local-mac-address: 6 bytes, MAC address -- qca,legacy-mode : Set the SPI data transfer of the QCA7000 to legacy mode. - In this mode the SPI master must toggle the chip select between each data - word. In burst mode these gaps aren't necessary, which is faster. - This setting depends on how the QCA7000 is setup via GPIO pin strapping. - If the property is missing the driver defaults to burst mode. - -Example: - -/* Freescale i.MX28 SPI master*/ -ssp2: spi@80014000 { - #address-cells = <1>; - #size-cells = <0>; - compatible = "fsl,imx28-spi"; - pinctrl-names = "default"; - pinctrl-0 = <&spi2_pins_a>; - status = "okay"; - - qca7000: ethernet@0 { - compatible = "qca,qca7000"; - reg = <0x0>; - interrupt-parent = <&gpio3>; /* GPIO Bank 3 */ - interrupts = <25 0x1>; /* Index: 25, rising edge */ - spi-cpha; /* SPI mode: CPHA=1 */ - spi-cpol; /* SPI mode: CPOL=1 */ - spi-max-frequency = <8000000>; /* freq: 8 MHz */ - local-mac-address = [ A0 B0 C0 D0 E0 F0 ]; - }; -}; diff --git a/Documentation/devicetree/bindings/net/ti,dp83867.txt b/Documentation/devicetree/bindings/net/ti,dp83867.txt index afe9630a5e7d..02c4353b5cf2 100644 --- a/Documentation/devicetree/bindings/net/ti,dp83867.txt +++ b/Documentation/devicetree/bindings/net/ti,dp83867.txt @@ -18,6 +18,13 @@ Optional property: - ti,max-output-impedance - MAC Interface Impedance control to set the programmable output impedance to maximum value (70 ohms). + - ti,dp83867-rxctrl-strap-quirk - This denotes the fact that the + board has RX_DV/RX_CTRL pin strapped in + mode 1 or 2. To ensure PHY operation, + there are specific actions that + software needs to take when this pin is + strapped in these modes. See data manual + for details. Note: ti,min-output-impedance and ti,max-output-impedance are mutually exclusive. When both properties are present ti,max-output-impedance diff --git a/Documentation/devicetree/bindings/net/ti,wilink-st.txt b/Documentation/devicetree/bindings/net/ti,wilink-st.txt index cbad73a84ac4..1649c1f66b07 100644 --- a/Documentation/devicetree/bindings/net/ti,wilink-st.txt +++ b/Documentation/devicetree/bindings/net/ti,wilink-st.txt @@ -14,6 +14,12 @@ Required properties: - compatible: should be one of the following: "ti,wl1271-st" "ti,wl1273-st" + "ti,wl1281-st" + "ti,wl1283-st" + "ti,wl1285-st" + "ti,wl1801-st" + "ti,wl1805-st" + "ti,wl1807-st" "ti,wl1831-st" "ti,wl1835-st" "ti,wl1837-st" @@ -22,6 +28,10 @@ Optional properties: - enable-gpios : GPIO signal controlling enabling of BT. Active high. - vio-supply : Vio input supply (1.8V) - vbat-supply : Vbat input supply (2.9-4.8V) + - clocks : Must contain an entry, for each entry in clock-names. + See ../clocks/clock-bindings.txt for details. + - clock-names : Must include the following entry: + "ext_clock" (External clock provided to the TI combo chip). Example: @@ -31,5 +41,7 @@ Example: bluetooth { compatible = "ti,wl1835-st"; enable-gpios = <&gpio1 7 GPIO_ACTIVE_HIGH>; + clocks = <&clk32k_wl18xx>; + clock-names = "ext_clock"; }; }; diff --git a/Documentation/devicetree/bindings/net/wireless/ti,wlcore.txt b/Documentation/devicetree/bindings/net/wireless/ti,wlcore.txt index 2a3d90de18ee..7b2cbb14113e 100644 --- a/Documentation/devicetree/bindings/net/wireless/ti,wlcore.txt +++ b/Documentation/devicetree/bindings/net/wireless/ti,wlcore.txt @@ -10,6 +10,7 @@ Required properties: * "ti,wl1273" * "ti,wl1281" * "ti,wl1283" + * "ti,wl1285" * "ti,wl1801" * "ti,wl1805" * "ti,wl1807" diff --git a/Documentation/devicetree/bindings/ptp/brcm,ptp-dte.txt b/Documentation/devicetree/bindings/ptp/brcm,ptp-dte.txt new file mode 100644 index 000000000000..07590bcdad15 --- /dev/null +++ b/Documentation/devicetree/bindings/ptp/brcm,ptp-dte.txt @@ -0,0 +1,13 @@ +* Broadcom Digital Timing Engine(DTE) based PTP clock driver + +Required properties: +- compatible: should be "brcm,ptp-dte" +- reg: address and length of the DTE block's NCO registers + +Example: + +ptp_dte: ptp_dte@180af650 { + compatible = "brcm,ptp-dte"; + reg = <0x180af650 0x10>; + status = "okay"; +}; diff --git a/Documentation/devicetree/bindings/serial/slave-device.txt b/Documentation/devicetree/bindings/serial/slave-device.txt index f66037928f5f..40110e019620 100644 --- a/Documentation/devicetree/bindings/serial/slave-device.txt +++ b/Documentation/devicetree/bindings/serial/slave-device.txt @@ -21,6 +21,15 @@ Optional Properties: can support. For example, a particular board has some signal quality issue or the host processor can't support higher baud rates. +- current-speed : The current baud rate the device operates at. This should + only be present in case a driver has no chance to know + the baud rate of the slave device. + Examples: + * device supports auto-baud + * the rate is setup by a bootloader and there is no + way to reset the device + * device baud rate is configured by its firmware but + there is no way to request the actual settings Example: diff --git a/Documentation/networking/checksum-offloads.txt b/Documentation/networking/checksum-offloads.txt index 56e36861245f..d52d191bbb0c 100644 --- a/Documentation/networking/checksum-offloads.txt +++ b/Documentation/networking/checksum-offloads.txt @@ -35,6 +35,9 @@ This interface only allows a single checksum to be offloaded. Where encapsulation is used, the packet may have multiple checksum fields in different header layers, and the rest will have to be handled by another mechanism such as LCO or RCO. +CRC32c can also be offloaded using this interface, by means of filling + skb->csum_start and skb->csum_offset as described above, and setting + skb->csum_not_inet: see skbuff.h comment (section 'D') for more details. No offloading of the IP header checksum is performed; it is always done in software. This is OK because when we build the IP header, we obviously have it in cache, so summing it isn't expensive. It's also rather short. @@ -49,9 +52,9 @@ A driver declares its offload capabilities in netdev->hw_features; see and csum_offset given in the SKB; if it tries to deduce these itself in hardware (as some NICs do) the driver should check that the values in the SKB match those which the hardware will deduce, and if not, fall back to - checksumming in software instead (with skb_checksum_help or one of the - skb_csum_off_chk* functions as mentioned in include/linux/skbuff.h). This - is a pain, but that's what you get when hardware tries to be clever. + checksumming in software instead (with skb_csum_hwoffload_help() or one of + the skb_checksum_help() / skb_crc32c_csum_help functions, as mentioned in + include/linux/skbuff.h). The stack should, for the most part, assume that checksum offload is supported by the underlying device. The only place that should check is @@ -60,7 +63,7 @@ The stack should, for the most part, assume that checksum offload is may include other offloads besides TX Checksum Offload) and, if they are not supported or enabled on the device (determined by netdev->features), performs the corresponding offload in software. In the case of TX - Checksum Offload, that means calling skb_checksum_help(skb). + Checksum Offload, that means calling skb_csum_hwoffload_help(skb, features). LCO: Local Checksum Offload diff --git a/Documentation/networking/i40evf.txt b/Documentation/networking/i40evf.txt index 21e41271af79..e9b3035b95d0 100644 --- a/Documentation/networking/i40evf.txt +++ b/Documentation/networking/i40evf.txt @@ -1,8 +1,8 @@ Linux* Base Driver for Intel(R) Network Connection ================================================== -Intel XL710 X710 Virtual Function Linux driver. -Copyright(c) 2013 Intel Corporation. +Intel Ethernet Adaptive Virtual Function Linux driver. +Copyright(c) 2013-2017 Intel Corporation. Contents ======== @@ -11,19 +11,26 @@ Contents - Known Issues/Troubleshooting - Support -This file describes the i40evf Linux* Base Driver for the Intel(R) XL710 -X710 Virtual Function. +This file describes the i40evf Linux* Base Driver. -The i40evf driver supports XL710 and X710 virtual function devices that -can only be activated on kernels with CONFIG_PCI_IOV enabled. +The i40evf driver supports the below mentioned virtual function +devices and can only be activated on kernels running the i40e or +newer Physical Function (PF) driver compiled with CONFIG_PCI_IOV. +The i40evf driver requires CONFIG_PCI_MSI to be enabled. The guest OS loading the i40evf driver must support MSI-X interrupts. +Supported Hardware +================== +Intel XL710 X710 Virtual Function +Intel Ethernet Adaptive Virtual Function +Intel X722 Virtual Function + Identifying Your Adapter ======================== -For more information on how to identify your adapter, go to the Adapter & -Driver ID Guide at: +For more information on how to identify your adapter, go to the +Adapter & Driver ID Guide at: http://support.intel.com/support/go/network/adapter/idguide.htm diff --git a/Documentation/networking/ipvlan.txt b/Documentation/networking/ipvlan.txt index 24196cef7c91..1fe42a874aae 100644 --- a/Documentation/networking/ipvlan.txt +++ b/Documentation/networking/ipvlan.txt @@ -22,9 +22,9 @@ The driver can be built into the kernel (CONFIG_IPVLAN=y) or as a module There are no module parameters for this driver and it can be configured using IProute2/ip utility. - ip link add link <master-dev> <slave-dev> type ipvlan mode { l2 | l3 | l3s } + ip link add link <master-dev> name <slave-dev> type ipvlan mode { l2 | l3 | l3s } - e.g. ip link add link ipvl0 eth0 type ipvlan mode l2 + e.g. ip link add link eth0 name ipvl0 type ipvlan mode l2 4. Operating modes: diff --git a/Documentation/networking/phy.txt b/Documentation/networking/phy.txt index 16f90d817224..bdec0f700bc1 100644 --- a/Documentation/networking/phy.txt +++ b/Documentation/networking/phy.txt @@ -295,7 +295,6 @@ Doing it all yourself settings in the PHY. int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd); - int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd); Ethtool convenience functions. diff --git a/Documentation/networking/policy-routing.txt b/Documentation/networking/policy-routing.txt deleted file mode 100644 index 36f6936d7f21..000000000000 --- a/Documentation/networking/policy-routing.txt +++ /dev/null @@ -1,150 +0,0 @@ -Classes -------- - - "Class" is a complete routing table in common sense. - I.e. it is tree of nodes (destination prefix, tos, metric) - with attached information: gateway, device etc. - This tree is looked up as specified in RFC1812 5.2.4.3 - 1. Basic match - 2. Longest match - 3. Weak TOS. - 4. Metric. (should not be in kernel space, but they are) - 5. Additional pruning rules. (not in kernel space). - - We have two special type of nodes: - REJECT - abort route lookup and return an error value. - THROW - abort route lookup in this class. - - - Currently the number of classes is limited to 255 - (0 is reserved for "not specified class") - - Three classes are builtin: - - RT_CLASS_LOCAL=255 - local interface addresses, - broadcasts, nat addresses. - - RT_CLASS_MAIN=254 - all normal routes are put there - by default. - - RT_CLASS_DEFAULT=253 - if ip_fib_model==1, then - normal default routes are put there, if ip_fib_model==2 - all gateway routes are put there. - - -Rules ------ - Rule is a record of (src prefix, src interface, tos, dst prefix) - with attached information. - - Rule types: - RTP_ROUTE - lookup in attached class - RTP_NAT - lookup in attached class and if a match is found, - translate packet source address. - RTP_MASQUERADE - lookup in attached class and if a match is found, - masquerade packet as sourced by us. - RTP_DROP - silently drop the packet. - RTP_REJECT - drop the packet and send ICMP NET UNREACHABLE. - RTP_PROHIBIT - drop the packet and send ICMP COMM. ADM. PROHIBITED. - - Rule flags: - RTRF_LOG - log route creations. - RTRF_VALVE - One way route (used with masquerading) - -Default setup: - -root@amber:/pub/ip-routing # iproute -r -Kernel routing policy rules -Pref Source Destination TOS Iface Cl - 0 default default 00 * 255 - 254 default default 00 * 254 - 255 default default 00 * 253 - - -Lookup algorithm ----------------- - - We scan rules list, and if a rule is matched, apply it. - If a route is found, return it. - If it is not found or a THROW node was matched, continue - to scan rules. - -Applications ------------- - -1. Just ignore classes. All the routes are put into MAIN class - (and/or into DEFAULT class). - - HOWTO: iproute add PREFIX [ tos TOS ] [ gw GW ] [ dev DEV ] - [ metric METRIC ] [ reject ] ... (look at iproute utility) - - or use route utility from current net-tools. - -2. Opposite case. Just forget all that you know about routing - tables. Every rule is supplied with its own gateway, device - info. record. This approach is not appropriate for automated - route maintenance, but it is ideal for manual configuration. - - HOWTO: iproute addrule [ from PREFIX ] [ to PREFIX ] [ tos TOS ] - [ dev INPUTDEV] [ pref PREFERENCE ] route [ gw GATEWAY ] - [ dev OUTDEV ] ..... - - Warning: As of now the size of the routing table in this - approach is limited to 256. If someone likes this model, I'll - relax this limitation. - -3. OSPF classes (see RFC1583, RFC1812 E.3.3) - Very clean, stable and robust algorithm for OSPF routing - domains. Unfortunately, it is not widely used in the Internet. - - Proposed setup: - 255 local addresses - 254 interface routes - 253 ASE routes with external metric - 252 ASE routes with internal metric - 251 inter-area routes - 250 intra-area routes for 1st area - 249 intra-area routes for 2nd area - etc. - - Rules: - iproute addrule class 253 - iproute addrule class 252 - iproute addrule class 251 - iproute addrule to a-prefix-for-1st-area class 250 - iproute addrule to another-prefix-for-1st-area class 250 - ... - iproute addrule to a-prefix-for-2nd-area class 249 - ... - - Area classes must be terminated with reject record. - iproute add default reject class 250 - iproute add default reject class 249 - ... - -4. The Variant Router Requirements Algorithm (RFC1812 E.3.2) - Create 16 classes for different TOS values. - It is a funny, but pretty useless algorithm. - I listed it just to show the power of new routing code. - -5. All the variety of combinations...... - - -GATED ------ - - Gated does not understand classes, but it will work - happily in MAIN+DEFAULT. All policy routes can be set - and maintained manually. - -IMPORTANT NOTE --------------- - route.c has a compilation time switch CONFIG_IP_LOCAL_RT_POLICY. - If it is set, locally originated packets are routed - using all the policy list. This is not very convenient and - pretty ambiguous when used with NAT and masquerading. - I set it to FALSE by default. - - -Alexey Kuznetov -kuznet@ms2.inr.ac.ru diff --git a/Documentation/networking/rxrpc.txt b/Documentation/networking/rxrpc.txt index 1b63bbc6b94f..8c70ba5dee4d 100644 --- a/Documentation/networking/rxrpc.txt +++ b/Documentation/networking/rxrpc.txt @@ -325,6 +325,9 @@ calls, to invoke certain actions and to report certain conditions. These are: RXRPC_LOCAL_ERROR -rt error num Local error encountered RXRPC_NEW_CALL -r- n/a New call received RXRPC_ACCEPT s-- n/a Accept new call + RXRPC_EXCLUSIVE_CALL s-- n/a Make an exclusive client call + RXRPC_UPGRADE_SERVICE s-- n/a Client call can be upgraded + RXRPC_TX_LENGTH s-- data len Total length of Tx data (SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message) @@ -387,6 +390,40 @@ calls, to invoke certain actions and to report certain conditions. These are: return error ENODATA. If the user ID is already in use by another call, then error EBADSLT will be returned. + (*) RXRPC_EXCLUSIVE_CALL + + This is used to indicate that a client call should be made on a one-off + connection. The connection is discarded once the call has terminated. + + (*) RXRPC_UPGRADE_SERVICE + + This is used to make a client call to probe if the specified service ID + may be upgraded by the server. The caller must check msg_name returned to + recvmsg() for the service ID actually in use. The operation probed must + be one that takes the same arguments in both services. + + Once this has been used to establish the upgrade capability (or lack + thereof) of the server, the service ID returned should be used for all + future communication to that server and RXRPC_UPGRADE_SERVICE should no + longer be set. + + (*) RXRPC_TX_LENGTH + + This is used to inform the kernel of the total amount of data that is + going to be transmitted by a call (whether in a client request or a + service response). If given, it allows the kernel to encrypt from the + userspace buffer directly to the packet buffers, rather than copying into + the buffer and then encrypting in place. This may only be given with the + first sendmsg() providing data for a call. EMSGSIZE will be generated if + the amount of data actually given is different. + + This takes a parameter of __s64 type that indicates how much will be + transmitted. This may not be less than zero. + +The symbol RXRPC__SUPPORTED is defined as one more than the highest control +message type supported. At run time this can be queried by means of the +RXRPC_SUPPORTED_CMSG socket option (see below). + ============== SOCKET OPTIONS @@ -433,6 +470,18 @@ AF_RXRPC sockets support a few socket options at the SOL_RXRPC level: Encrypted checksum plus entire packet padded and encrypted, including actual packet length. + (*) RXRPC_UPGRADEABLE_SERVICE + + This is used to indicate that a service socket with two bindings may + upgrade one bound service to the other if requested by the client. optval + must point to an array of two unsigned short ints. The first is the + service ID to upgrade from and the second the service ID to upgrade to. + + (*) RXRPC_SUPPORTED_CMSG + + This is a read-only option that writes an int into the buffer indicating + the highest control message type supported. + ======== SECURITY @@ -542,6 +591,9 @@ A client would issue an operation by: MSG_MORE should be set in msghdr::msg_flags on all but the last part of the request. Multiple requests may be made simultaneously. + An RXRPC_TX_LENGTH control message can also be specified on the first + sendmsg() call. + If a call is intended to go to a destination other than the default specified through connect(), then msghdr::msg_name should be set on the first request message of that call. @@ -559,6 +611,17 @@ A client would issue an operation by: buffer instead, and MSG_EOR will be flagged to indicate the end of that call. +A client may ask for a service ID it knows and ask that this be upgraded to a +better service if one is available by supplying RXRPC_UPGRADE_SERVICE on the +first sendmsg() of a call. The client should then check srx_service in the +msg_name filled in by recvmsg() when collecting the result. srx_service will +hold the same value as given to sendmsg() if the upgrade request was ignored by +the service - otherwise it will be altered to indicate the service ID the +server upgraded to. Note that the upgraded service ID is chosen by the server. +The caller has to wait until it sees the service ID in the reply before sending +any more calls (further calls to the same destination will be blocked until the +probe is concluded). + ==================== EXAMPLE SERVER USAGE @@ -588,7 +651,7 @@ A server would be set up to accept operations in the following manner: The keyring can be manipulated after it has been given to the socket. This permits the server to add more keys, replace keys, etc. whilst it is live. - (2) A local address must then be bound: + (3) A local address must then be bound: struct sockaddr_rxrpc srx = { .srx_family = AF_RXRPC, @@ -600,11 +663,26 @@ A server would be set up to accept operations in the following manner: }; bind(server, &srx, sizeof(srx)); - (3) The server is then set to listen out for incoming calls: + More than one service ID may be bound to a socket, provided the transport + parameters are the same. The limit is currently two. To do this, bind() + should be called twice. + + (4) If service upgrading is required, first two service IDs must have been + bound and then the following option must be set: + + unsigned short service_ids[2] = { from_ID, to_ID }; + setsockopt(server, SOL_RXRPC, RXRPC_UPGRADEABLE_SERVICE, + service_ids, sizeof(service_ids)); + + This will automatically upgrade connections on service from_ID to service + to_ID if they request it. This will be reflected in msg_name obtained + through recvmsg() when the request data is delivered to userspace. + + (5) The server is then set to listen out for incoming calls: listen(server, 100); - (4) The kernel notifies the server of pending incoming connections by sending + (6) The kernel notifies the server of pending incoming connections by sending it a message for each. This is received with recvmsg() on the server socket. It has no data, and has a single dataless control message attached: @@ -616,13 +694,13 @@ A server would be set up to accept operations in the following manner: the time it is accepted - in which case the first call still on the queue will be accepted. - (5) The server then accepts the new call by issuing a sendmsg() with two + (7) The server then accepts the new call by issuing a sendmsg() with two pieces of control data and no actual data: RXRPC_ACCEPT - indicate connection acceptance RXRPC_USER_CALL_ID - specify user ID for this call - (6) The first request data packet will then be posted to the server socket for + (8) The first request data packet will then be posted to the server socket for recvmsg() to pick up. At that point, the RxRPC address for the call can be read from the address fields in the msghdr struct. @@ -634,7 +712,7 @@ A server would be set up to accept operations in the following manner: RXRPC_USER_CALL_ID - specifies the user ID for this call - (8) The reply data should then be posted to the server socket using a series + (9) The reply data should then be posted to the server socket using a series of sendmsg() calls, each with the following control messages attached: RXRPC_USER_CALL_ID - specifies the user ID for this call @@ -642,7 +720,7 @@ A server would be set up to accept operations in the following manner: MSG_MORE should be set in msghdr::msg_flags on all but the last message for a particular call. - (9) The final ACK from the client will be posted for retrieval by recvmsg() +(10) The final ACK from the client will be posted for retrieval by recvmsg() when it is received. It will take the form of a dataless message with two control messages attached: @@ -652,7 +730,7 @@ A server would be set up to accept operations in the following manner: MSG_EOR will be flagged to indicate that this is the final message for this call. -(10) Up to the point the final packet of reply data is sent, the call can be +(11) Up to the point the final packet of reply data is sent, the call can be aborted by calling sendmsg() with a dataless message with the following control messages attached: @@ -703,6 +781,7 @@ The kernel interface functions are as follows: struct sockaddr_rxrpc *srx, struct key *key, unsigned long user_call_ID, + s64 tx_total_len, gfp_t gfp); This allocates the infrastructure to make a new RxRPC call and assigns @@ -719,6 +798,11 @@ The kernel interface functions are as follows: control data buffer. It is entirely feasible to use this to point to a kernel data structure. + tx_total_len is the amount of data the caller is intending to transmit + with this call (or -1 if unknown at this point). Setting the data size + allows the kernel to encrypt directly to the packet buffers, thereby + saving a copy. The value may not be less than -1. + If this function is successful, an opaque reference to the RxRPC call is returned. The caller now holds a reference on this and it must be properly ended. @@ -870,6 +954,17 @@ The kernel interface functions are as follows: This is used to find the remote peer address of a call. + (*) Set the total transmit data size on a call. + + void rxrpc_kernel_set_tx_length(struct socket *sock, + struct rxrpc_call *call, + s64 tx_total_len); + + This sets the amount of data that the caller is intending to transmit on a + call. It's intended to be used for setting the reply size as the request + size should be set when the call is begun. tx_total_len may not be less + than zero. + ======================= CONFIGURABLE PARAMETERS diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt index 96f50694a748..196ba17cc344 100644 --- a/Documentation/networking/timestamping.txt +++ b/Documentation/networking/timestamping.txt @@ -193,6 +193,24 @@ SOF_TIMESTAMPING_OPT_STATS: the transmit timestamps, such as how long a certain block of data was limited by peer's receiver window. +SOF_TIMESTAMPING_OPT_PKTINFO: + + Enable the SCM_TIMESTAMPING_PKTINFO control message for incoming + packets with hardware timestamps. The message contains struct + scm_ts_pktinfo, which supplies the index of the real interface which + received the packet and its length at layer 2. A valid (non-zero) + interface index will be returned only if CONFIG_NET_RX_BUSY_POLL is + enabled and the driver is using NAPI. The struct contains also two + other fields, but they are reserved and undefined. + +SOF_TIMESTAMPING_OPT_TX_SWHW: + + Request both hardware and software timestamps for outgoing packets + when SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE + are enabled at the same time. If both timestamps are generated, + two separate messages will be looped to the socket's error queue, + each containing just one timestamp. + New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate regardless of the setting of sysctl net.core.tstamp_allow_data. @@ -312,7 +330,7 @@ struct scm_timestamping { }; The structure can return up to three timestamps. This is a legacy -feature. Only one field is non-zero at any time. Most timestamps +feature. At least one field is non-zero at any time. Most timestamps are passed in ts[0]. Hardware timestamps are passed in ts[2]. ts[1] used to hold hardware timestamps converted to system time. @@ -321,6 +339,12 @@ a HW PTP clock source, to allow time conversion in userspace and optionally synchronize system time with a userspace PTP stack such as linuxptp. For the PTP clock API, see Documentation/ptp/ptp.txt. +Note that if the SO_TIMESTAMP or SO_TIMESTAMPNS option is enabled +together with SO_TIMESTAMPING using SOF_TIMESTAMPING_SOFTWARE, a false +software timestamp will be generated in the recvmsg() call and passed +in ts[0] when a real software timestamp is missing. This happens also +on hardware transmit timestamps. + 2.1.1 Transmit timestamps with MSG_ERRQUEUE For transmit timestamps the outgoing packet is looped back to the diff --git a/Documentation/networking/tls.txt b/Documentation/networking/tls.txt new file mode 100644 index 000000000000..77ed00631c12 --- /dev/null +++ b/Documentation/networking/tls.txt @@ -0,0 +1,135 @@ +Overview +======== + +Transport Layer Security (TLS) is a Upper Layer Protocol (ULP) that runs over +TCP. TLS provides end-to-end data integrity and confidentiality. + +User interface +============== + +Creating a TLS connection +------------------------- + +First create a new TCP socket and set the TLS ULP. + + sock = socket(AF_INET, SOCK_STREAM, 0); + setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls")); + +Setting the TLS ULP allows us to set/get TLS socket options. Currently +only the symmetric encryption is handled in the kernel. After the TLS +handshake is complete, we have all the parameters required to move the +data-path to the kernel. There is a separate socket option for moving +the transmit and the receive into the kernel. + + /* From linux/tls.h */ + struct tls_crypto_info { + unsigned short version; + unsigned short cipher_type; + }; + + struct tls12_crypto_info_aes_gcm_128 { + struct tls_crypto_info info; + unsigned char iv[TLS_CIPHER_AES_GCM_128_IV_SIZE]; + unsigned char key[TLS_CIPHER_AES_GCM_128_KEY_SIZE]; + unsigned char salt[TLS_CIPHER_AES_GCM_128_SALT_SIZE]; + unsigned char rec_seq[TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE]; + }; + + + struct tls12_crypto_info_aes_gcm_128 crypto_info; + + crypto_info.info.version = TLS_1_2_VERSION; + crypto_info.info.cipher_type = TLS_CIPHER_AES_GCM_128; + memcpy(crypto_info.iv, iv_write, TLS_CIPHER_AES_GCM_128_IV_SIZE); + memcpy(crypto_info.rec_seq, seq_number_write, + TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE); + memcpy(crypto_info.key, cipher_key_write, TLS_CIPHER_AES_GCM_128_KEY_SIZE); + memcpy(crypto_info.salt, implicit_iv_write, TLS_CIPHER_AES_GCM_128_SALT_SIZE); + + setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info)); + +Sending TLS application data +---------------------------- + +After setting the TLS_TX socket option all application data sent over this +socket is encrypted using TLS and the parameters provided in the socket option. +For example, we can send an encrypted hello world record as follows: + + const char *msg = "hello world\n"; + send(sock, msg, strlen(msg)); + +send() data is directly encrypted from the userspace buffer provided +to the encrypted kernel send buffer if possible. + +The sendfile system call will send the file's data over TLS records of maximum +length (2^14). + + file = open(filename, O_RDONLY); + fstat(file, &stat); + sendfile(sock, file, &offset, stat.st_size); + +TLS records are created and sent after each send() call, unless +MSG_MORE is passed. MSG_MORE will delay creation of a record until +MSG_MORE is not passed, or the maximum record size is reached. + +The kernel will need to allocate a buffer for the encrypted data. +This buffer is allocated at the time send() is called, such that +either the entire send() call will return -ENOMEM (or block waiting +for memory), or the encryption will always succeed. If send() returns +-ENOMEM and some data was left on the socket buffer from a previous +call using MSG_MORE, the MSG_MORE data is left on the socket buffer. + +Send TLS control messages +------------------------- + +Other than application data, TLS has control messages such as alert +messages (record type 21) and handshake messages (record type 22), etc. +These messages can be sent over the socket by providing the TLS record type +via a CMSG. For example the following function sends @data of @length bytes +using a record of type @record_type. + +/* send TLS control message using record_type */ + static int klts_send_ctrl_message(int sock, unsigned char record_type, + void *data, size_t length) + { + struct msghdr msg = {0}; + int cmsg_len = sizeof(record_type); + struct cmsghdr *cmsg; + char buf[CMSG_SPACE(cmsg_len)]; + struct iovec msg_iov; /* Vector of data to send/receive into. */ + + msg.msg_control = buf; + msg.msg_controllen = sizeof(buf); + cmsg = CMSG_FIRSTHDR(&msg); + cmsg->cmsg_level = SOL_TLS; + cmsg->cmsg_type = TLS_SET_RECORD_TYPE; + cmsg->cmsg_len = CMSG_LEN(cmsg_len); + *CMSG_DATA(cmsg) = record_type; + msg.msg_controllen = cmsg->cmsg_len; + + msg_iov.iov_base = data; + msg_iov.iov_len = length; + msg.msg_iov = &msg_iov; + msg.msg_iovlen = 1; + + return sendmsg(sock, &msg, 0); + } + +Control message data should be provided unencrypted, and will be +encrypted by the kernel. + +Integrating in to userspace TLS library +--------------------------------------- + +At a high level, the kernel TLS ULP is a replacement for the record +layer of a userspace TLS library. + +A patchset to OpenSSL to use ktls as the record layer is here: + +https://github.com/Mellanox/tls-openssl + +An example of calling send directly after a handshake using +gnutls. Since it doesn't implement a full record layer, control +messages are not supported: + +https://github.com/Mellanox/tls-af_ktls_tool |