author Jakub Kicinski <kuba@kernel.org> 2020-09-03 16:14:31 -0700
committer Jakub Kicinski <kuba@kernel.org> 2020-09-07 15:08:05 -0700
commit 0db0c34cfbc9838c1a14cb04dd880602abd699a7
tree e11f1962b1eaf89c966452139bf4d434e80493cf /Documentation/networking/statistics.rst
parent 81365af13a5630673c49bfad9b24cf415e9576f6
net: tighten the definition of interface statistics

This patch is born out of an investigation into which IEEE statistics
correspond to which struct rtnl_link_stats64 members. Turns out that
there seems to be reasonable consensus on the matter, among many drivers.
To save others the time (and it took more time than I'm comfortable
admitting) I'm adding comments referring to IEEE attributes to
struct rtnl_link_stats64.

Up until now we had two forms of documentation for stats - in
Documentation/ABI/testing/sysfs-class-net-statistics and the comments
on struct rtnl_link_stats64 itself. While the former is very cautious
in defining the expected behavior, the latter feel quite dated and
may not be easy to understand for modern day driver author
(e.g. rx_over_errors). At the same time modern systems are far more
complex and once obvious definitions lost their clarity. For example
- does rx_packet count at the MAC layer (aFramesReceivedOK)?
packets processed correctly by hardware? received by the driver?
or maybe received by the stack?

I tried to clarify the expectations, further clarifications from
others are very welcome.

The part hardest to untangle is rx_over_errors vs rx_fifo_errors
vs rx_missed_errors. After much deliberation I concluded that for
modern HW only two of the counters will make sense. The distinction
between internal FIFO overflow and packets dropped due to back-pressure
from the host is likely too implementation (driver and device) specific
to expose in the standard stats.

Now - which two of those counters we select to use is anyone's pick:

sysfs documentation suggests rx_over_errors counts packets which
did not fit into buffers due to MTU being too small, which I reused.
There don't seem to be many modern drivers using it (well, CAN drivers
seem to love this statistic).

Of the remaining two I picked rx_missed_errors to report device drops.
bnxt reports it and it's folded into "drop"s in procfs (while
rx_fifo_errors is an error, and modern devices usually receive the frame
OK, they just can't admit it into the pipeline).

Of the drivers I looked at only AMD Lance-like and NS8390-like use all
three of these counters. rx_missed_errors counts missed frames,
rx_over_errors counts overflow events, and rx_fifo_errors counts frames
which were truncated because they didn't fit into buffers. This suggests
that rx_fifo_errors may be the correct stat for truncated packets, but
I'd think a FIFO stat counting truncated packets would be very confusing
to a modern reader.

v2:
 - add driver developer notes about ethtool stat count and reset
 - replace Ethernet with IEEE 802.3 to better indicate source of attrs
 - mention byte counters don't count FCS
 - clarify RX counter is from device to host
 - drop "sightly" from sysfs paragraph
 - add examples of ethtool stats
 - s/incoming/received/ s/incoming/transmitted/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'Documentation/networking/statistics.rst')
-rw-r--r-- Documentation/networking/statistics.rst 132
1 files changed, 132 insertions, 0 deletions
diff --git a/Documentation/networking/statistics.rst b/Documentation/networking/statistics.rst
new file mode 100644
index 000000000000..d490b535cd14
--- /dev/null
+++ b/Documentation/networking/statistics.rst
@@ -0,0 +1,132 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Interface statistics
+====================
+
+This document is a guide to Linux network interface statistics.
+
+There are two main sources of interface statistics in Linux:
+
+ - standard interface statistics based on
+   :c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`; and
+ - driver-defined statistics available via ethtool.
+
+There are multiple interfaces to reach the former. The most commonly used
+is the `ip` command from `iproute2`::
+
+  $ ip -s -s link show dev ens4u1u1
+  6: ens4u1u1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
+      link/ether 48:2a:e3:4c:b1:d1 brd ff:ff:ff:ff:ff:ff
+      RX: bytes  packets  errors  dropped overrun mcast
+      74327665117 69016965 0       0       0       0
+      RX errors: length   crc     frame   fifo    missed
+                 0        0       0       0       0
+      TX: bytes  packets  errors  dropped carrier collsns
+      21405556176 44608960 0       0       0       0
+      TX errors: aborted  fifo   window heartbeat transns
+                 0        0       0       0       128
+      altname enp58s0u1u1
+
+Note that `-s` has been specified twice to see all members of
+:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`.
+If `-s` is specified only once, the detailed errors won't be shown.
+
+`ip` supports JSON formatting via the `-j` option.
+
+Ethtool statistics can be dumped using `ethtool -S $ifc`, e.g.::
+
+  $ ethtool -S ens4u1u1
+  NIC statistics:
+       tx_single_collisions: 0
+       tx_multi_collisions: 0
+
+uAPIs
+=====
+
+procfs
+------
+
+The historical `/proc/net/dev` text interface gives access to the list
+of interfaces as well as their statistics.
+
+Note that even though this interface is using
+:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>` internally, it combines
+some of the fields (e.g. `rx_missed_errors` is folded into the drop count).
+
+sysfs
+-----
+
+Each device directory in sysfs contains a `statistics` directory (e.g.
+`/sys/class/net/lo/statistics/`) with files corresponding to
+members of :c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`.
+
+This simple interface is convenient especially in constrained/embedded
+environments without access to tools. However, it's inefficient when
+reading multiple stats as it internally performs a full dump of
+:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`
+and reports only the stat corresponding to the accessed file.
+
+Sysfs files are documented in
+`Documentation/ABI/testing/sysfs-class-net-statistics`.
+
+
+netlink
+-------
+
+`rtnetlink` (`NETLINK_ROUTE`) is the preferred method of accessing
+:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>` stats.
+
+Statistics are reported both in the responses to link information
+requests (`RTM_GETLINK`) and statistic requests (`RTM_GETSTATS`,
+when the `IFLA_STATS_LINK_64` bit is set in the `.filter_mask` of the request).
+
+ethtool
+-------
+
+The ethtool IOCTL interface allows drivers to report
+implementation-specific statistics. Historically it has also been used to report
+statistics for which other APIs did not exist, like per-device-queue
+statistics, or standard-based statistics (e.g. RFC 2863).
+
+Statistics and their string identifiers are retrieved separately:
+identifiers via `ETHTOOL_GSTRINGS` with `string_set` set to `ETH_SS_STATS`,
+and values via `ETHTOOL_GSTATS`. User space should use `ETHTOOL_GDRVINFO`
+to retrieve the number of statistics (`.n_stats`).
+
+debugfs
+-------
+
+Some drivers expose extra statistics via `debugfs`.
+
+struct rtnl_link_stats64
+========================
+
+.. kernel-doc:: include/uapi/linux/if_link.h
+ :identifiers: rtnl_link_stats64
+
+Notes for driver authors
+========================
+
+Drivers should report all statistics which have a matching member in
+:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>` exclusively
+via `.ndo_get_stats64`. Reporting such standard stats via ethtool
+or debugfs will not be accepted.
+
+Drivers must ensure best possible compliance with
+:c:type:`struct rtnl_link_stats64 <rtnl_link_stats64>`.
+Please note for example that detailed error statistics must be
+added into the general `rx_errors` / `tx_errors` counters.
+
+The `.ndo_get_stats64` callback cannot sleep because of accesses
+via `/proc/net/dev`. If the driver may sleep when retrieving the statistics
+from the device, it should do so periodically and asynchronously, and only
+return a recent copy from `.ndo_get_stats64`. The ethtool interrupt coalescing
+interface allows setting the frequency of refreshing statistics, if needed.
+
+Retrieving ethtool statistics is a multi-syscall process; drivers are advised
+to keep the number of statistics constant to avoid race conditions with
+user space trying to read them.
+
+Statistics must persist across routine operations like bringing the interface
+down and up.
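
The procfs and sysfs sections of the document added by this patch describe two text interfaces to the same counters. The following Python sketch is not part of the patch; it only illustrates how user space might read them. The function names (`read_sysfs_stats`, `parse_proc_net_dev`, `main`) and the `rx_`/`tx_` key prefixes are hypothetical choices made for this example, and the parser assumes the usual two-line header of `/proc/net/dev` with `|`-separated receive and transmit column groups.

```python
from pathlib import Path


def read_sysfs_stats(ifname, sysfs_root="/sys/class/net"):
    """Return {file_name: int} for <sysfs_root>/<ifname>/statistics/.

    Each file holds one counter; per the document, every read triggers
    a full stats dump internally, so this is simple but not efficient.
    """
    stats_dir = Path(sysfs_root) / ifname / "statistics"
    return {
        entry.name: int(entry.read_text().strip())
        for entry in sorted(stats_dir.iterdir())
        if entry.is_file()
    }


def parse_proc_net_dev(text):
    """Parse the text of /proc/net/dev into {ifname: {column: int}}.

    Column names come from the second header line and get an "rx_" or
    "tx_" prefix (an assumption of this sketch, not a kernel naming).
    """
    lines = text.splitlines()
    # Second header line looks like: " face |<rx columns>|<tx columns>"
    _, rx_header, tx_header = lines[1].split("|")
    columns = ["rx_" + c for c in rx_header.split()]
    columns += ["tx_" + c for c in tx_header.split()]

    result = {}
    for line in lines[2:]:
        if ":" not in line:
            continue  # skip blank or unexpected lines
        name, values = line.split(":", 1)
        result[name.strip()] = dict(zip(columns, map(int, values.split())))
    return result


def main(ifname="lo"):
    """Print the packet counters of one interface from both sources."""
    procfs = parse_proc_net_dev(Path("/proc/net/dev").read_text())
    sysfs = read_sysfs_stats(ifname)
    print("procfs rx_packets:", procfs[ifname]["rx_packets"])
    print("sysfs  rx_packets:", sysfs["rx_packets"])
```

Calling `main()` on a Linux host prints the receive packet counter of `lo` from both interfaces. Because procfs combines some fields, columns such as the drop count are not expected to match any single sysfs file.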