summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2019-02-07s390/net: move pnet constantsUrsula Braun2-8/+8
There is no need to define these PNETID related constants in the pnet.h file, since they are just used locally within pnet.c. Just code cleanup, no functional change. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07net: vxlan: Free a leaked vetoed multicast rdstPetr Machata1-9/+11
When an rdst is rejected by a driver, the current code removes it from the remote list, but neglects to free it. This is triggered by tools/testing/selftests/drivers/net/mlxsw/vxlan_fdb_veto.sh and shows as the following kmemleak trace: unreferenced object 0xffff88817fa3d888 (size 96): comm "softirq", pid 0, jiffies 4372702718 (age 165.252s) hex dump (first 32 bytes): 02 00 00 00 c6 33 64 03 80 f5 a2 61 81 88 ff ff .....3d....a.... 06 df 71 ae ff ff ff ff 0c 00 00 00 04 d2 6a 6b ..q...........jk backtrace: [<00000000296b27ac>] kmem_cache_alloc_trace+0x1ae/0x370 [<0000000075c86dc6>] vxlan_fdb_append.part.12+0x62/0x3b0 [vxlan] [<00000000e0414b63>] vxlan_fdb_update+0xc61/0x1020 [vxlan] [<00000000f330c4bd>] vxlan_fdb_add+0x2e8/0x3d0 [vxlan] [<0000000008f81c2c>] rtnl_fdb_add+0x4c2/0xa10 [<00000000bdc4b270>] rtnetlink_rcv_msg+0x6dd/0x970 [<000000006701f2ce>] netlink_rcv_skb+0x290/0x410 [<00000000c08a5487>] rtnetlink_rcv+0x15/0x20 [<00000000d5f54b1e>] netlink_unicast+0x43f/0x5e0 [<00000000db4336bb>] netlink_sendmsg+0x789/0xcd0 [<00000000e1ee26b6>] sock_sendmsg+0xba/0x100 [<00000000ba409802>] ___sys_sendmsg+0x631/0x960 [<000000003c332113>] __sys_sendmsg+0xea/0x180 [<00000000f4139144>] __x64_sys_sendmsg+0x78/0xb0 [<000000006d1ddc59>] do_syscall_64+0x94/0x410 [<00000000c8defa9a>] entry_SYSCALL_64_after_hwframe+0x49/0xbe Move vxlan_dst_free() up and schedule a call thereof to plug this leak. Fixes: 61f46fe8c646 ("vxlan: Allow vetoing of FDB notifications") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07Merge branch 'devlink-health'David S. Miller11-165/+1755
Eran Ben Elisha says: ==================== Devlink health reporting and recovery system The health mechanism is targeted for Real Time Alerting, in order to know when something bad had happened to a PCI device - Provide alert debug information - Self healing - If problem needs vendor support, provide a way to gather all needed debugging information. The main idea is to unify and centralize driver health reports in the generic devlink instance and allow the user to set different attributes of the health reporting and recovery procedures. The devlink health reporter: Device driver creates a "health reporter" per each error/health type. Error/Health type can be a known/generic (eg pci error, fw error, rx/tx error) or unknown (driver specific). For each registered health reporter a driver can issue error/health reports asynchronously. All health reports handling is done by devlink. Device driver can provide specific callbacks for each "health reporter", e.g. - Recovery procedures - Diagnostics and object dump procedures - OOB initial attributes Different parts of the driver can register different types of health reporters with different handlers. Once an error is reported, devlink health will do the following actions: * A log is being send to the kernel trace events buffer * Health status and statistics are being updated for the reporter instance * Object dump is being taken and saved at the reporter instance (as long as there is no other dump which is already stored) * Auto recovery attempt is being done. Depends on: - Auto-recovery configuration - Grace period vs. time passed since last recover The user interface: User can access/change each reporter attributes and driver specific callbacks via devlink, e.g per error type (per health reporter) - Configure reporter's generic attributes (like: Disable/enable auto recovery) - Invoke recovery procedure - Run diagnostics - Object dump The devlink health interface (via netlink): DEVLINK_CMD_HEALTH_REPORTER_GET Retrieves status and configuration info per DEV and reporter. DEVLINK_CMD_HEALTH_REPORTER_SET Allows reporter-related configuration setting. DEVLINK_CMD_HEALTH_REPORTER_RECOVER Triggers a reporter's recovery procedure. DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE Retrieves diagnostics data from a reporter on a device. DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET Retrieves the last stored dump. Devlink health saves a single dump. If an dump is not already stored by the devlink for this reporter, devlink generates a new dump. dump output is defined by the reporter. DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR Clears the last saved dump file for the specified reporter. netlink +--------------------------+ | | | + | | | | +--------------------------+ |request for ops |(diagnose, mlx5_core devlink |recover, |dump) +--------+ +--------------------------+ | | | reporter| | | | | +---------v----------+ | | | ops execution | | | | | <----------------------------------+ | | | | | | | | | | | + ^------------------+ | | | | | request for ops | | | | | (recover, dump) | | | | | | | | | +-+------------------+ | | | health report | | health handler | | | +-------------------------------> | | | | | +--------------------+ | | | health reporter create | | | +----------------------------> | +--------+ +--------------------------+ In this patchset, mlx5e TX reporter is implemented. Cmdline format: devlink health show [DEV reporter REPORTE_NAME] devlink health recover DEV reporter REPORTER_NAME devlink health diagnose DEV reporter REPORTER_NAME devlink health dump show DEV reporter REPORTER_NAME devlink health dump clear DEV reporter REPORTER_NAME devlink health set DEV reporter REPORTER_NAME NAME VALUE Cmdline examples: $devlink health show pci/0000:00:09.0: name tx state healthy #err 1 #recover 0 last_dump_ts N/A parameters: grace_period 500 auto_recover false $devlink health diagnose pci/0000:00:09.0 reporter tx -j -p { "SQs": [ { "sqn": 138, "HW state": 1, "stopped": false },{ "sqn": 142, "HW state": 1, "stopped": false } ] } $devlink health diagnose pci/0000:00:09.0 reporter tx SQs: sqn: 138 HW state: 1 stopped: false sqn: 142 HW state: 1 stopped: false $devlink health recover pci/0000:00:09 reporter tx $devlink health set pci/0000:00:09.0 reporter tx grace_period 3500 $devlink health set pci/0000:00:09.0 reporter tx auto_recover false Changelog: v4: - Rebase on latest net-next - Remove trace_devlink_health signature exposure in case CONFIG_NET_DEVLINK is not defined as it shall only be used from devlink. v3: - Redesign of devlink <-> driver fmsg API - Various bug fixes v2: - Remove FW* reporters to decrease the amount of patches in the patchset ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07devlink: Add Documentation/networking/devlink-health.txtAya Levin1-0/+86
This patch adds a new file to add information about devlink health mechanism. Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07net/mlx5e: Add tx timeout support for mlx5e tx reporterEran Ben Elisha3-37/+55
With this patch, ndo_tx_timeout callback will be redirected to the tx reporter in order to detect a tx timeout error and report it to the devlink health. (The watchdog detects tx timeouts, but the driver verify the issue still exists before launching any recover method). In addition, recover from tx timeout in case of lost interrupt was added to the tx reporter recover method. The tx timeout recover from lost interrupt is not a new feature in the driver, this patch re-organize the functionality and move it to the tx reporter recovery flow. tx timeout example: (with auto_recover set to false, if set to true, the manual recover and diagnose sections are irrelevant) $cat /sys/kernel/debug/tracing/trace ... devlink_health_report: bus_name=pci dev_name=0000:00:09.0 driver_name=mlx5_core reporter_name=tx: TX timeout on queue: 0, SQ: 0x8a, CQ: 0x35, SQ Cons: 0x2 SQ Prod: 0x2, usecs since last trans: 14912000 $devlink health show pci/0000:00:09.0: name tx state healthy #err 1 #recover 0 last_dump_ts N/A parameters: grace_period 500 auto_recover false $devlink health diagnose pci/0000:00:09.0 reporter tx -j -p { "SQs": [ { "sqn": 138, "HW state": 1, "stopped": true },{ "sqn": 142, "HW state": 1, "stopped": false } ] } $devlink health diagnose pci/0000:00:09.0 reporter tx SQs: sqn: 138 HW state: 1 stopped: true sqn: 142 HW state: 1 stopped: false $devlink health recover pci/0000:00:09 reporter tx $devlink health show pci/0000:00:09.0: name tx state healthy #err 1 #recover 1 last_dump_ts N/A parameters: grace_period 500 auto_recover false Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07net/mlx5e: Add tx reporter supportEran Ben Elisha6-128/+306
Add mlx5e tx reporter to devlink health reporters. This reporter will be responsible for diagnosing, reporting and recovering of tx errors. This patch declares the TX reporter operations and creates it using the devlink health API. Currently, this reporter supports reporting and recovering from send error CQE only. In addition, it adds diagnose information for the open SQs. For a local SQ recover (due to driver error report), in case of SQ recover failure, the recover operation will be considered as a failure. For a full tx recover, an attempt to close and open the channels will be done. If this one passed successfully, it will be considered as a successful recover. The SQ recover from error CQE flow is not a new feature in the driver, this patch re-organize the functions and adapt them for the devlink health API. For this purpose, move code from en_main.c to a new file named reporter_tx.c. Diagnose output: $devlink health diagnose pci/0000:00:09.0 reporter tx -j -p { "SQs": [ { "sqn": 138, "HW state": 1, "stopped": false },{ "sqn": 142, "HW state": 1, "stopped": false } ] } $devlink health diagnose pci/0000:00:09.0 reporter tx SQs: sqn: 138 HW state: 1 stopped: false sqn: 142 HW state: 1 stopped: false Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07devlink: Add health dump {get,clear} commandsEran Ben Elisha2-0/+65
Add devlink health dump commands, in order to run an dump operation over a specific reporter. The supported operations are dump_get in order to get last saved dump (if not exist, dump now) and dump_clear to clear last saved dump. It is expected from driver's callback for diagnose command to fill it via the devlink fmsg API. Devlink will parse it and convert it to netlink nla API in order to pass it to the user. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07devlink: Add health diagnose commandEran Ben Elisha2-0/+47
Add devlink health diagnose command, in order to run a diagnose operation over a specific reporter. It is expected from driver's callback for diagnose command to fill it via the devlink fmsg API. Devlink will parse it and convert it to netlink nla API in order to pass it to the user. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07devlink: Add health recover commandEran Ben Elisha2-0/+21
Add devlink health recover command to the uapi, in order to allow the user to execute a recover operation over a specific reporter. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07devlink: Add health set commandEran Ben Elisha2-0/+37
Add devlink health set command, in order to set configuration parameters for a specific reporter. Supported parameters are: - graceful_period: Time interval between auto recoveries (in msec) - auto_recover: Determines if the devlink shall execute recover upon receiving error for the reporter Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07devlink: Add health get commandEran Ben Elisha2-0/+160
Add devlink health get command to provide reporter/s data for user space. Add the ability to get data per reporter or dump data from all available reporters. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07devlink: Add health report functionalityEran Ben Elisha3-0/+193
Upon error discover, every driver can report it to the devlink health mechanism via devlink_health_report function, using the appropriate reporter registered to it. Driver can pass error specific context which will be delivered to it as part of the dump / recovery callbacks. Once an error is reported, devlink health will do the following actions: * A log is being send to the kernel trace events buffer * Health status and statistics are being updated for the reporter instance * Object dump is being taken and stored at the reporter instance (as long as there is no other dump which is already stored) * Auto recovery attempt is being done. Depends on: - Auto Recovery configuration - Grace period vs. Time since last recover Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07devlink: Add health reporter create/destroy functionalityEran Ben Elisha2-0/+145
Devlink health reporter is an instance for reporting, diagnosing and recovering from run time errors discovered by the reporters. Define it's data structure and supported operations. In addition, expose devlink API to create and destroy a reporter. Each devlink instance will hold it's own reporters list. As part of the allocation, driver shall provide a set of callbacks which will be used by devlink in order to handle health reports and user commands related to this reporter. In addition, driver is entitled to provide some priv pointer, which can be fetched from the reporter by devlink_health_reporter_priv function. For each reporter, devlink will hold a metadata of statistics, dump msg and status. For passing dumps and diagnose data to the user-space, it will use devlink fmsg API. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07devlink: Add devlink formatted message (fmsg) APIEran Ben Elisha3-0/+640
Devlink fmsg is a mechanism to pass descriptors between drivers and devlink, in json-like format. The API allows the driver to add nested attributes such as object, object pair and value array, in addition to attributes such as name and value. Driver can use this API to fill the fmsg context in a format which will be translated by the devlink to the netlink message later. There is no memory allocation in advance (other than the initial list head), and it dynamically allocates messages descriptors and add them to the list on the fly. When it needs to send the data using SKBs to the netlink layer, it fragments the data between different SKBs. In order to do this fragmentation, it uses virtual nests attributes, to avoid actual nesting use which cannot be divided between different SKBs. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07net: phy: fixed_phy: Fix fixed_phy not checking GPIOMoritz Fischer1-3/+3
Fix fixed_phy not checking GPIO if no link_update callback is registered. In the original version all users registered a link_update callback so the issue was masked. Fixes: a5597008dbc2 ("phy: fixed_phy: Add gpio to determine link up/down.") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: emac: remove IBM_EMAC_RX_SKB_HEADROOMChristian Lamparter3-39/+47
The EMAC driver had a custom IBM_EMAC_RX_SKB_HEADROOM Kconfig option that reserved additional skb headroom for RX. This patch removes the option and migrates the code to use napi_alloc_skb() and netdev_alloc_skb_ip_align() in its place. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: phy: improve genphy_c45_read_linkHeiner Kallweit2-11/+6
Let's make genphy_c45_read_link behave the same as genphy_update_link and set phydev->link in the function directly. This allows to simplify the callers. In addition don't check further devices once we detect that at least one device reports link as down. v2: - remove an unused variable Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: stmmac: fix ptp timestamping on Rx on gmac4Ilias Apalodimas2-18/+16
The current driver only enables Pdelay_Req and Pdelay_Resp when HWTSTAMP_FILTER_PTP_V2_EVENT, HWTSTAMP_FILTER_PTP_V1_L4_EVENT or HWTSTAMP_FILTER_PTP_V2_L4_EVENT is requested. This results in ptp sync on slave mode to report 'received SYNC without timestamp' when using ptp4l. Although the hardware can support Sync, Pdelay_Req and Pdelay_resp by setting bit14 annd bits 17/16 to 01 this leaves Delay_Req timestamps out. Fix this by enabling all event and general messages timestamps. This includes SYNC, Follow_Up, Delay_Req, Delay_Resp, Pdelay_Req, Pdelay_Resp and Pdelay_Resp_Follow_Up messages. Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jose Abreu <joabreu@synopsys.com> Tested-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: dsa: mv88e6xxx: Prevent suspend to RAMMiquel Raynal1-0/+16
On one hand, the mv88e6xxx driver has a work queue called in loop which will attempt register accesses after MDIO bus suspension, that entirely freezes the platform during suspend. On the other hand, the DSA core is not ready yet to support suspend to RAM operation because so far there is no way to recover reliably the switch configuration. To avoid the kernel to freeze when suspending with a switch driven by the mv88e6xxx driver, we choose to prevent the driver suspension and in the same way, the whole platform. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06Merge branch 'for_net-next-5.1/rds-tos-v4' of ↵David S. Miller16-52/+166
git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux Santosh Shilimkar says: ==================== rds: add tos support RDS applications make use of tos to classify database traffic. This feature has been used in shipping products from 2.6.32 based kernels. Its tied with RDS v4.1 protocol version and the compatibility gets negotiated as part of connections setup. Patchset keeps full backward compatibility using existing connection negotiation scheme. Currently the feature is exploited by RDMA transport and for TCP transport the user tos values are mapped to same default class (0). For RDMA transports, RDMA CM service type API is used to set up different SL(service lanes) and the IB fabric is configured for tos mapping using Subnet Manager(SL to VL mappings). Similarly for ROCE fabric, user priority is mapped with different DSCP code points which are associated with different switch queues in the fabric. The original code was developed by Bang Nguyen in downstream kernel back in 2.6.32 kernel days and it has evolved significantly over period of time. Thanks to Yanjun for doing testing with various combinations of host like v3.1<->v4.1, v4.1.<->v3.1, v4.1 upstream to shipping v4.1 etc etc ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller34-391/+4353
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-02-07 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add a riscv64 JIT for BPF, from Björn. 2) Implement BTF deduplication algorithm for libbpf which takes BTF type information containing duplicate per-compilation unit information and reduces it to an equivalent set of BTF types with no duplication and without loss of information, from Andrii. 3) Offloaded and native BPF XDP programs can coexist today, enable also offloaded and generic ones as well, from Jakub. 4) Expose various BTF related helper functions in libbpf as API which are in particular helpful for JITed programs, from Yonghong. 5) Fix the recently added JMP32 code emission in s390x JIT, from Heiko. 6) Fix BPF kselftests' tcp_{server,client}.py to be able to run inside a network namespace, also add a fix for libbpf to get libbpf_print() working, from Stanislav. 7) Fixes for bpftool documentation, from Prashant. 8) Type cleanup in BPF kselftests' test_maps.c to silence a gcc8 warning, from Breno. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06Merge branch 'mlxsw-blackhole-routes'David S. Miller2-2/+225
Ido Schimmel says: ==================== mlxsw: Offload blackhole routes Blackhole routes are routes that cause matching packets to be silently dropped. This is in contrast to unreachable routes that generate an ICMP host unreachable packet in response. The driver currently programs both route types with a trap action and lets the kernel drop matching packets. This is sub-optimal as packets routed using a blackhole route can be directly dropped by the ASIC. Patch #1 alters mlxsw to program blackhole routes with a discard action. Patch #2 adds a matching test. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06selftests: mlxsw: Add a test for blackhole routesIdo Schimmel1-0/+200
Use a simple topology consisting of two hosts directly connected to a router. Make sure IPv4/IPv6 ping works and then add blackhole routes. Test that ping fails and that the routes are marked as offloaded. Use a simple tc filter to test that packets were dropped by the ASIC and not trapped to the CPU. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06mlxsw: spectrum_router: Offload blackhole routesIdo Schimmel1-2/+25
Create a new FIB entry type for blackhole routes and set it in case the type of the notified route is 'RTN_BLACKHOLE'. Program such routes with a discard action and mark them as offloaded since the device is dropping the packets instead of the kernel. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06Merge branch 'net-Introduce-ndo_get_port_parent_id'David S. Miller30-260/+214
Florian Fainelli says: ==================== net: Introduce ndo_get_port_parent_id() Based on discussion with Ido and feedback from Jakub there are clearly two classes of users that implement SWITCHDEV_ATTR_ID_PORT_PARENT_ID: - PF/VF drivers which typically only implement return the port's parent ID, yet have to implement switchdev_port_attr_get() just for that - Ethernet switch drivers: mlxsw, ocelot, DSA, etc. which implement more attributes which we want to be able to eventually veto in the context of the caller, thus making them candidates for using a blocking notifier chain Changes in v4: - remove superfluous net/switchdev.h inclusions in a few files - added Jiri's Acked-by where given - removed err = -EOPNOTSUPP initializations - changed according to Jiri's suggestion in net/ipv4/ipmr.c Changes in v3: - keep ethsw's switchdev_ops assignment - remove inclusion of net/switchdev.h in netdevsim which is no longer necesary Changes in v2: - resolved build failures spotted by kbuild test robot - added helpers functions into the core network device layer: dev_get_port_parent_id() and netdev_port_same_parent_id(); - added support for recursion to lower devices Changes from RFC: - introduce a ndo_get_port_parent_id() and convert all relevant drivers to use it - get rid of SWITCHDEV_ATTR_ID_PORT_PARENT_ID A subsequent set of patches will convert switchdev_port_attr_set() to use a blocking notifier call, and still get rid of switchdev_port_attr_get() altogether. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: Get rid of SWITCHDEV_ATTR_ID_PORT_PARENT_IDFlorian Fainelli6-84/+15
Now that we have a dedicated NDO for getting a port's parent ID, get rid of SWITCHDEV_ATTR_ID_PORT_PARENT_ID and convert all callers to use the NDO exclusively. This is a preliminary change to getting rid of switchdev_ops eventually. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: dsa: Implement ndo_get_port_parent_id()Florian Fainelli1-6/+12
DSA implements SWITCHDEV_ATTR_ID_PORT_PARENT_ID and we want to get rid of switchdev_ops eventually, ease that migration by implementing a ndo_get_port_parent_id() function which returns what switchdev_port_attr_get() would do. Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06staging: fsl-dpaa2: ethsw: Implement ndo_get_port_parent_id()Florian Fainelli1-4/+12
ethsw implements SWITCHDEV_ATTR_ID_PORT_PARENT_ID and we want to get rid of switchdev_ops eventually, ease that migration by implementing a ndo_get_port_parent_id() function which returns what switchdev_port_attr_get() would do. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06netdevsim: Implement ndo_get_port_parent_id()Florian Fainelli1-17/+6
netdevsim only supports SWITCHDEV_ATTR_ID_PORT_PARENT_ID, which makes it a great candidate to be converted to use the ndo_get_port_parent_id() NDO instead of implementing switchdev_port_attr_get(). Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06rocker: Implement ndo_get_port_parent_id()Florian Fainelli1-5/+13
mlxsw implements SWITCHDEV_ATTR_ID_PORT_PARENT_ID and we want to get rid of switchdev_ops eventually, ease that migration by implementing a ndo_get_port_parent_id() function which returns what switchdev_port_attr_get() would do. Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06nfp: Implement ndo_get_port_parent_id()Florian Fainelli5-26/+11
NFP only supports SWITCHDEV_ATTR_ID_PORT_PARENT_ID, which makes it a great candidate to be converted to use the ndo_get_port_parent_id() NDO instead of implementing switchdev_port_attr_get(). Since NFP uses switchdev_port_same_parent_id() convert it to use netdev_port_same_parent_id(). Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06mscc: ocelot: Implement ndo_get_port_parent_id()Florian Fainelli1-20/+13
Ocelot only supports SWITCHDEV_ATTR_ID_PORT_PARENT_ID as a valid switchdev attribute getter, convert it to use ndo_get_port_parent_id() and get rid of the switchdev_ops::switchdev_port_attr_get altogether. Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06mlxsw: Implement ndo_get_port_parent_id()Florian Fainelli3-29/+26
mlxsw implements SWITCHDEV_ATTR_ID_PORT_PARENT_ID and we want to get rid of switchdev_ops eventually, ease that migration by implementing a ndo_get_port_parent_id() function which returns what switchdev_port_attr_get() would do. Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net/mlx5e: Implement ndo_get_port_parent_id()Florian Fainelli3-24/+14
mlx5e only supports SWITCHDEV_ATTR_ID_PORT_PARENT_ID, which makes it a great candidate to be converted to use the ndo_get_port_parent_id() NDO instead of implementing switchdev_port_attr_get(). Since mlx5e makes use of switchdev_port_parent_id() convert it to use netdev_port_same_parent_id(). Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06liquidio: Implement ndo_get_port_parent_id()Florian Fainelli2-35/+12
Liquidio only supports SWITCHDEV_ATTR_ID_PORT_PARENT_ID, which makes it a great candidate to be converted to use the ndo_get_port_parent_id() NDO instead of implementing switchdev_port_attr_get(). Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06bnxt: Implement ndo_get_port_parent_id()Florian Fainelli4-31/+15
BNXT only supports SWITCHDEV_ATTR_ID_PORT_PARENT_ID, which makes it a great candidate to be converted to use the ndo_get_port_parent_id() NDO instead of implementing switchdev_port_attr_get(). The conversion is straight forward here since the PF and VF code use the same getter. Since bnxt makes uses of switchdev_port_same_parent_id() convert it to use netdev_port_same_parent_id(). Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06net: Introduce ndo_get_port_parent_id()Florian Fainelli6-5/+91
In preparation for getting rid of switchdev_ops, create a dedicated NDO operation for getting the port's parent identifier. There are essentially two classes of drivers that need to implement getting the port's parent ID which are VF/PF drivers with a built-in switch, and pure switchdev drivers such as mlxsw, ocelot, dsa etc. We introduce a helper function: dev_get_port_parent_id() which supports recursion into the lower devices to obtain the first port's parent ID. Convert the bridge, core and ipv4 multicast routing code to check for such ndo_get_port_parent_id() and call the helper function when valid before falling back to switchdev_port_attr_get(). This will allow us to convert all relevant drivers in one go instead of having to implement both switchdev_port_attr_get() and ndo_get_port_parent_id() operations, then get rid of switchdev_port_attr_get(). Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06cxgb4: Update 1.22.9.0 as the latest firmware supported.Vishal Kulkarni1-6/+6
Change t4fw_version.h to update latest firmware version number to 1.22.9.0. Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06cxgb4: Add new T6 PCI device ids 0x608bVishal Kulkarni1-0/+1
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06r8169: Avoid pointer aliasingThierry Reding1-3/+13
Read MAC address 32-bit at a time and manually extract the individual bytes. This avoids pointer aliasing and gives the compiler a better chance of optimizing the operation. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06r8169: Load MAC address from device tree if presentThierry Reding1-13/+23
If the system was booted using a device tree and if the device tree contains a MAC address, use it instead of reading one from the EEPROM. This is useful in situations where the EEPROM isn't properly programmed or where the firmware wants to override the existing MAC address. Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06Merge branch 'mlxsw-core-Trace-EMAD-errors'David S. Miller3-1/+39
Ido Schimmel says: ==================== mlxsw: core: Trace EMAD errors Nir says: This patchset adds a trace for EMAD errors to the existing EMAD payload traces. This tracepoint is useful to track user or firmware errors during tests execution. Patch #1 defines the devlink tracepoint. Patch #2 uses it for reporting mlxsw EMAD errors. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06mlxsw: core: Trace EMAD errorsNir Dotan1-1/+5
Trace EMAD errors returned from HW. Signed-off-by: Nir Dotan <nird@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06devlink: add hardware errors tracing facilityNir Dotan2-0/+34
Define a tracepoint and allow user to trace messages in case of an hardware error code for hardware associated with devlink instance. Signed-off-by: Nir Dotan <nird@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06Merge branch 'dpaa2-eth-Driver-updates'David S. Miller2-55/+83
Ioana Ciocoi Radulescu says: ==================== dpaa2-eth: Driver updates First patch moves the driver to a page-per-frame memory model. The others are minor tweaks and optimizations. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06dpaa2-eth: Update buffer pool refill thresholdIoana Ciocoi Radulescu1-1/+2
Add more buffers to the Rx buffer pool as soon as 7 of them get consumed, instead of waiting for their number to drop below a fixed threshold. 7 is the number of buffers that can be released in the pool via a single DPIO command. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06dpaa2-eth: Use FQ-based DPIO enqueue APIIoana Ciocoi Radulescu2-6/+39
Starting with MC10.14.0, dpaa2_io_service_enqueue_fq() API is functional. Since there are a number of cases where it offers better performance compared to the currently used enqueue function, switch to it for firmware versions that support it. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06dpaa2-eth: Use napi_consume_skb()Ioana Ciocoi Radulescu1-4/+4
While in NAPI context, free skbs by calling napi_consume_skb() instead of dev_kfree_skb(), to take advantage of the bulk freeing mechanism. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06dpaa2-eth: Use a single page per Rx bufferIoana Ciocoi Radulescu2-44/+38
Instead of allocating page fragments via the network stack, use the page allocator directly. For now, we consume one page for each Rx buffer. With the new memory model we are free to consider adding more XDP support. Performance decreases slightly in some IP forwarding cases. No visible effect on termination traffic. The driver memory footprint increases as a result of this change, but it is still small enough to not really matter. Another side effect is that now Rx buffer alignment requirements are naturally satisfied without any additional actions needed. Remove alignment related code, except in the buffer layout information conveyed to MC, as hardware still needs to know the alignment value we guarantee. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06Merge branch 'add-flow_rule-infrastructure'David S. Miller22-2009/+2416
Pablo Neira Ayuso says: ==================== add flow_rule infrastructure This patchset, as is, allows us to reuse the driver codebase to configure ACL hardware offloads for the ethtool_rxnfc and the TC flower interfaces. A few clients for this infrastructure are presented, such as the bcm_sf2 and the qede drivers, for reference. Moreover all of the existing drivers in the tree are converted to use this infrastructure. This patchset is re-using the existing flow dissector infrastructure that was introduced by Jiri Pirko et al. so the amount of abstractions that this patchset adds are minimal. Well, just a few wrapper structures for the selector side of the rules. And, in order to express actions, this patchset exposes an action API that is based on the existing TC action infrastructure and what existing drivers already support on that front. v7: This patchset is a rebase on top of the net-next tree, after addressing questions and feedback from driver developers in the last batch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>