summaryrefslogtreecommitdiffstats
path: root/drivers/net
AgeCommit message (Collapse)AuthorFilesLines
2020-01-20net: dsa: mv88e6xxx: Add SERDES stats counters to all 6390 family membersAndrew Lunn1-0/+15
The SERDES statistics are valid for all members of the 6390 family, not just the 6390 itself. Add the needed callbacks to all members of the family. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-20net: phylink: allow in-band AN for USXGMIIAlex Marginean1-0/+1
USXGMII supports passing link information in-band between PHY and MAC PCS, add it to the list of protocols that support in-band AN mode. Being a MAC-PHY protocol that can auto-negotiate link speeds up to 10 Gbps, we populate the initial supported mask with the entire spectrum of link modes up to 10G that PHYLINK supports, and we let the driver reduce that mask in its .phylink_validate method. Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-20net: phy: don't crash in phy_read/_write_mmd without a PHY driverAlex Marginean1-2/+2
The APIs can be used by Ethernet drivers without actually loading a PHY driver. This may become more widespread in the future with 802.3z compatible MAC PCS devices being locally driven by the MAC driver when configuring for a PHY mode with in-band negotiation. Check that drv is not NULL before reading from it. Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-20net: phylink: Allow 2.5BASE-T, 5GBASE-T and 10GBASE-T for the 10G link modesVladimir Oltean1-0/+4
For some reason, PHYLINK does not put the copper modes for 802.3bz (NBASE-T) and 802.3an-2006 (10GBASE-T) in the PHY's supported mask, when the PHY-MAC connection is a 10G-capable one (10GBase-KR, 10GBase-R, USXGMII). One possible way through which the cable side can work at the lower speed is by having the PHY emit PAUSE frames towards the MAC. So fix that omission. Also include the 2500Base-X fiber mode in this list while we're at it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-20net: stmmac: modified pcs mode support for RGMIIDejin Zheng1-11/+6
snps databook noted that physical coding sublayer (PCS) interface that can be used when the MAC is configured for the TBI, RTBI, or SGMII PHY interface. we have RGMII and SGMII in a SoC and it also has the PCS block. it needs stmmac_init_phy and stmmac_mdio_register function for initializing phy when it used RGMII interface. Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller44-192/+452
2020-01-19mlxsw: Add OVERLAY_SMAC_MC trapAmit Cohen2-0/+4
Add a trap for NVE packets that the device decided to drop because their overlay source MAC is multicast. Signed-off-by: Amit Cohen <amitc@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19mlxsw: Add tunnel devlink-trap supportAmit Cohen4-3/+19
Add the trap IDs and trap group used to report tunnel drops. Register tunnel packet traps and associated tunnel trap group with devlink during driver initialization. Signed-off-by: Amit Cohen <amitc@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19mlxsw: spectrum_trap: Reorder cases according to enum orderAmit Cohen1-2/+2
Move L3_DROPS case to appear after L2_DROPS case. Signed-off-by: Amit Cohen <amitc@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19mlxsw: Add ECN configurations with IPinIP tunnelsAmit Cohen3-0/+73
Initialize ECN mapping registers during router init according to INET_ECN_encapsulate() and INET_ECN_decapsulate(). For IP-in-IP encapsulation, this is required to ensure that ECN bits in the underlay are set in accordance with the kernel. For decapsulation, this is required to ensure that packets with invalid ECN combination in underlay and overlay are trapped to the kernel and not forwarded. Signed-off-by: Amit Cohen <amitc@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19mlxsw: reg: Add Tunneling IPinIP Decapsulation ECN Mapping RegisterAmit Cohen1-0/+57
This register configures the actions that are done during IPinIP decapsulation based on the ECN bits. Signed-off-by: Amit Cohen <amitc@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19mlxsw: reg: Add Tunneling IPinIP Encapsulation ECN Mapping RegisterAmit Cohen1-0/+31
This register performs mapping from overlay ECN to underlay ECN during IPinIP encapsulation. Signed-off-by: Amit Cohen <amitc@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19mlxsw: Add NON_ROUTABLE trapAmit Cohen2-0/+4
Add a trap for packets that the device decided to drop because they are not supposed to be routed. For example, IGMP queries can be flooded by the device in layer 2 and reach the router. Such packets should not be routed and instead dropped. Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19mlxsw: Add irif and erif disabled trapsAmit Cohen2-0/+28
IRIF_DISABLED and ERIF_DISABLED are driver specific traps. Packets are dropped for these reasons when they need to be routed through/from existing router interfaces (RIF) which are disabled. Add devlink driver-specific traps and mlxsw trap IDs used to report these traps. Signed-off-by: Amit Cohen <amitc@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19Merge branch 'for-net-next' of ↵David S. Miller15-340/+980
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 E-Switch chains and prios This series has two parts, 1) A merge commit with mlx5-next branch that include updates for mlx5 HW layouts needed for this and upcoming submissions. 2) From Paul, Increase the number of chains and prios Currently the Mellanox driver supports offloading tc rules that are defined on the first 4 chains and the first 16 priorities. The restriction stems from the firmware flow level enforcement requiring a flow table of a certain level to point to a flow table of a higher level. This limitation may be ignored by setting the ignore_flow_level bit when creating flow table entries. Use unmanaged tables and ignore flow level to create more tables than declared by fs_core steering. Manually manage the connections between the tables themselves. HW table is instantiated for every tc <chain,prio> tuple. The miss rule of every table either jumps to the next <chain,prio> table, or continues to slow_fdb. This logic is realized by following this sequence: 1. Create an auto-grouped flow table for the specified priority with reserved entries Reserved entries are allocated at the end of the flow table. Flow groups are evaluated in sequence and therefore it is guaranteed that the flow group defined on the last FTEs will be the last to evaluate. Define a "match all" flow group on the reserved entries, providing the platform to add table miss actions. 2. Set the miss rule action to jump to the next <chain,prio> table or the slow_fdb. 3. Link the previous priority table to point to the new table by updating its miss rule. Please pull and let me know if there's any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19cxgb4: reject overlapped queues in TC-MQPRIO offloadRahul Lakkireddy1-1/+27
A queue can't belong to multiple traffic classes. So, reject any such configuration that results in overlapped queues for a traffic class. Fixes: b1396c2bd675 ("cxgb4: parse and configure TC-MQPRIO offload") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19cxgb4: fix Tx multi channel port rate limitRahul Lakkireddy4-3/+96
T6 can support 2 egress traffic management channels per port to double the total number of traffic classes that can be configured. In this configuration, if the class belongs to the other channel, then all the queues must be bound again explicitly to the new class, for the rate limit parameters on the other channel to take effect. So, always explicitly bind all queues to the port rate limit traffic class, regardless of the traffic management channel that it belongs to. Also, only bind queues to port rate limit traffic class, if all the queues don't already belong to an existing different traffic class. Fixes: 4ec4762d8ec6 ("cxgb4: add TC-MATCHALL classifier egress offload") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19net: phy: adin: fix a warning about msleepDejin Zheng1-1/+1
found a warning by the following command: ./scripts/checkpatch.pl -f drivers/net/phy/adin.c WARNING: msleep < 20ms can sleep for up to 20ms; see Documentation/timers/timers-howto.rst #628: FILE: drivers/net/phy/adin.c:628: + msleep(10); Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19net: dsa: felix: Allow PHY to AN 10/100/1000 with 2500 serdes linkAlex Marginean1-5/+4
If the serdes link is set to 2500 using interfce type 2500base-X, lower link speeds over on the line side should still be supported. Rate adaptation is done out of band, in our case using AQR PHYs this is done using flow control. Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19net: dsa: felix: Handle PAUSE RX regardless of AN resultAlex Marginean1-2/+6
Flow control is used with 2500Base-X and AQR PHYs to do rate adaptation between line side 100/1000 links and MAC running at 2.5G. This is independent of the flow control configuration settled on line side though AN. In general, allowing the MAC to handle flow control even if not negotiated with the link partner should not be a problem, so the patch just enables it in all cases. Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-18bnxt_en: Do not treat DSN (Digital Serial Number) read failure as fatal.Michael Chan3-4/+7
DSN read can fail, for example on a kdump kernel without PCIe extended config space support. If DSN read fails, don't set the BNXT_FLAG_DSN_VALID flag and continue loading. Check the flag to see if the stored DSN is valid before using it. Only VF reps creation should fail without valid DSN. Fixes: 03213a996531 ("bnxt: move bp->switch_id initialization to PF probe") Reported-by: Marc Smith <msmith626@gmail.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-18bnxt_en: Fix ipv6 RFS filter matching logic.Michael Chan1-5/+17
Fix bnxt_fltr_match() to match ipv6 source and destination addresses. The function currently only checks ipv4 addresses and will not work corrently on ipv6 filters. Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-18bnxt_en: Fix NTUPLE firmware command failures.Michael Chan1-3/+0
The NTUPLE related firmware commands are sent to the wrong firmware channel, causing all these commands to fail on new firmware that supports the new firmware channel. Fix it by excluding the 3 NTUPLE firmware commands from the list for the new firmware channel. Fixes: 760b6d33410c ("bnxt_en: Add support for 2nd firmware message channel.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17ice: remove redundant assignment to variable xmit_doneColin Ian King1-1/+1
The variable xmit_done is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-17ice: Removing hung_queue variable to use txqueue function parameterJulio Faracco1-30/+11
The scope of function .ndo_tx_timeout was changed to include the hang queue when a TX timeout event occurs. See commit 0290bd291cc0 ("netdev: pass the stuck queue to the timeout handler") for more details. Now, drivers don't need to identify which queue is stopped. Drivers can simply use the queue index provided by dev_watchdog and execute all actions needed to restore network traffic. This commit do some cleanups into Intel ice driver to remove a redundant loop to find stopped queue. Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-17i40e: Removing hung_queue variable to use txqueue function parameterJulio Faracco1-30/+11
The scope of function .ndo_tx_timeout was changed to include the hang queue when a TX timeout event occurs. See commit 0290bd291cc0 ("netdev: pass the stuck queue to the timeout handler") for more details. Now, drivers don't need to identify which queue is stopped. Drivers can simply use the queue index provided by dev_watchdog and execute all actions needed to restore network traffic. This commit do some cleanups into Intel i40e driver to remove a redundant loop to find stopped queue. Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-17fm10k: use txqueue parameter in fm10k_tx_timeoutJacob Keller1-7/+10
Make use of the new txqueue parameter to the .ndo_tx_timeout function. In fm10k_tx_timeout, remove the now unnecessary loop to determine which Tx queue is stuck. Instead, just double check the specified queue This could be improved further to attempt resetting only the specific queue that got stuck. However, that is a much larger refactor and has been left as a future improvement. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-17igc: Add PHY power management controlSasha Neftin3-1/+17
PHY power management control should provide a reliable and accurate indication of PHY reset completion and decrease the delay time after a PHY reset Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-17igc: Add support for TSOSasha Neftin2-1/+116
TCP segmentation offload allows a device to segment a single frame into multiple frames with a data payload size specified in socket buffer. As a result we can now send data approximately up to seven percents fast than was previously possible on my system. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-17igc: Add SKU for i225 deviceSasha Neftin3-0/+3
Add support for blank NVM SKU Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-17igc: Remove unused definitionSasha Neftin1-2/+0
Remove the unused IGC_FUNC_0 definition and make the code cleaner Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-17igc: Fix typo in a commentSasha Neftin1-1/+1
Fix typo in a context descriptor comment Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-17net: systemport: Fixed queue mapping in internal ring mapFlorian Fainelli1-3/+4
We would not be transmitting using the correct SYSTEMPORT transmit queue during ndo_select_queue() which looks up the internal TX ring map because while establishing the mapping we would be off by 4, so for instance, when we populate switch port mappings we would be doing: switch port 0, queue 0 -> ring index #0 switch port 0, queue 1 -> ring index #1 ... switch port 0, queue 3 -> ring index #3 switch port 1, queue 0 -> ring index #8 (4 + 4 * 1) ... instead of using ring index #4. This would cause our ndo_select_queue() to use the fallback queue mechanism which would pick up an incorrect ring for that switch port. Fix this by using the correct switch queue number instead of SYSTEMPORT queue number. Fixes: 25c440704661 ("net: systemport: Simplify queue mapping logic") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17net: dsa: bcm_sf2: Configure IMP port for 2Gb/secFlorian Fainelli1-1/+1
With the implementation of the system reset controller we lost a setting that is currently applied by the bootloader and which configures the IMP port for 2Gb/sec, the default is 1Gb/sec. This is needed given the number of ports and applications we expect to run so bring back that setting. Fixes: 01b0ac07589e ("net: dsa: bcm_sf2: Add support for optional reset controller line") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17net: dsa: sja1105: Don't error out on disabled ports with no phy-modeVladimir Oltean1-1/+1
The sja1105_parse_ports_node function was tested only on device trees where all ports were enabled. Fix this check so that the driver continues to probe only with the ports where status is not "disabled", as expected. Fixes: 8aa9ebccae87 ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17net: dsa: felix: Don't error out on disabled ports with no phy-modeVladimir Oltean1-1/+1
The felix_parse_ports_node function was tested only on device trees where all ports were enabled. Fix this check so that the driver continues to probe only with the ports where status is not "disabled", as expected. Fixes: bdeced75b13f ("net: dsa: felix: Add PCS operations for PHYLINK") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17net: dsa: felix: Don't restart PCS SGMII AN if not neededAlex Marginean1-0/+21
Some PHYs like VSC8234 don't like it when AN restarts on their system side and they restart line side AN too, going into an endless link up/down loop. Don't restart PCS AN if link is up already. Although in theory this feedback loop should be possible with the other in-band AN modes too, for some reason it was not seen with the VSC8514 QSGMII and AQR412 USXGMII PHYs. So keep this logic only for SGMII where the problem was found. Fixes: bdeced75b13f ("net: dsa: felix: Add PCS operations for PHYLINK") Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17net: dsa: felix: Set USXGMII link based on BMSR, not LPAAlex Marginean1-1/+0
At least some PHYs (AQR412) don't advertise copper-side link status during system side AN. So remove this duplicate assignment to pcs->link and rely on the previous one for link state: the local indication from the MAC PCS. Fixes: bdeced75b13f ("net: dsa: felix: Add PCS operations for PHYLINK") Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17enetc: Don't print from enetc_sched_speed_set when link goes downVladimir Oltean1-1/+0
It is not an error to unplug a cable from the ENETC port even with TSN offloads, so don't spam the log with link-related messages from the tc-taprio offload subsystem, a single notification is sufficient: [10972.351859] fsl_enetc 0000:00:00.0 eno0: Qbv PSPEED set speed link down. [10972.360241] fsl_enetc 0000:00:00.0 eno0: Link is Down Fixes: 2e47cb415f0a ("enetc: update TSN Qbv PSPEED set according to adjust link speed") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17net: phy: adin: const-ify static dataAlexandru Ardelean1-5/+5
Some bits of static data should have been made const from the start. This change adds the const qualifier where appropriate. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17net: phy: dp83867: Set FORCE_LINK_GOOD to default after resetMichael Grzeschik1-1/+7
According to the Datasheet this bit should be 0 (Normal operation) in default. With the FORCE_LINK_GOOD bit set, it is not possible to get a link. This patch sets FORCE_LINK_GOOD to the default value after resetting the phy. Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17drivers/net: netdevsim depends on INETHongbo Yao1-0/+1
If CONFIG_INET is not set and CONFIG_NETDEVSIM=y. Building drivers/net/netdevsim/fib.o will get the following error: drivers/net/netdevsim/fib.o: In function `nsim_fib4_rt_hw_flags_set': fib.c:(.text+0x12b): undefined reference to `fib_alias_hw_flags_set' drivers/net/netdevsim/fib.o: In function `nsim_fib4_rt_destroy': fib.c:(.text+0xb11): undefined reference to `free_fib_info' Correct the Kconfig for netdevsim. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: 48bb9eb47b270 ("netdevsim: fib: Add dummy implementation for FIB offload") Signed-off-by: Hongbo Yao <yaohongbo@huawei.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17net: hns: fix soft lockup when there is not enough memoryYonglong Liu1-3/+1
When there is not enough memory and napi_alloc_skb() return NULL, the HNS driver will print error message, and than try again, if the memory is not enough for a while, huge error message and the retry operation will cause soft lockup. When napi_alloc_skb() return NULL because of no memory, we can get a warn_alloc() call trace, so this patch deletes the error message. We already use polling mode to handle irq, but the retry operation will render the polling weight inactive, this patch just return budget when the rx is not completed to avoid dead loop. Fixes: 36eedfde1a36 ("net: hns: Optimize hns_nic_common_poll for better performance") Fixes: b5996f11ea54 ("net: add Hisilicon Network Subsystem basic ethernet support") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17net: phy: Maintain MDIO device and bus statisticsFlorian Fainelli1-2/+249
We maintain global statistics for an entire MDIO bus, as well as broken down, per MDIO bus address statistics. Given that it is possible for MDIO devices such as switches to access MDIO bus addresses for which there is not a mdio_device instance created (therefore not a a corresponding device directory in sysfs either), we also maintain per-address statistics under the statistics folder. The layout looks like this: /sys/class/mdio_bus/../statistics/ transfers errrors writes reads transfers_<addr> errors_<addr> writes_<addr> reads_<addr> When a mdio_device instance is registered, a statistics/ folder is created with the tranfers, errors, writes and reads attributes which point to the appropriate MDIO bus statistics structure. Statistics are 64-bit unsigned quantities and maintained through the u64_stats_sync.h helper functions. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17netdevsim: fix nsim_fib6_rt_create() error pathEric Dumazet1-3/+3
It seems nsim_fib6_rt_create() intent was to return either a valid pointer or an embedded error code. BUG: unable to handle page fault for address: fffffffffffffff4 PGD 9870067 P4D 9870067 PUD 9872067 PMD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 22851 Comm: syz-executor.1 Not tainted 5.5.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:jhash2 include/linux/jhash.h:125 [inline] RIP: 0010:rhashtable_jhash2+0x76/0x2c0 lib/rhashtable.c:963 Code: b9 00 00 00 00 00 fc ff df 48 c1 e8 03 0f b6 14 08 4c 89 f0 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 30 02 00 00 49 8d 7e 04 <41> 8b 06 48 be 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 0f b6 RSP: 0018:ffffc90016127190 EFLAGS: 00010246 RAX: 0000000000000007 RBX: 00000000dfb3ab49 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: ffffffff839ba7c8 RDI: fffffffffffffff8 RBP: ffffc900161271c0 R08: ffff8880951f8640 R09: ffffed1015d0703d R10: ffffed1015d0703c R11: ffff8880ae8381e3 R12: 00000000dfb3ab49 R13: 00000000dfb3ab49 R14: fffffffffffffff4 R15: 0000000000000007 FS: 00007f40bfbc6700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: fffffffffffffff4 CR3: 0000000093660000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rht_key_get_hash include/linux/rhashtable.h:133 [inline] rht_key_hashfn include/linux/rhashtable.h:159 [inline] rht_head_hashfn include/linux/rhashtable.h:174 [inline] __rhashtable_insert_fast.constprop.0+0xe15/0x1180 include/linux/rhashtable.h:723 rhashtable_insert_fast include/linux/rhashtable.h:832 [inline] nsim_fib6_rt_add drivers/net/netdevsim/fib.c:603 [inline] nsim_fib6_rt_insert drivers/net/netdevsim/fib.c:658 [inline] nsim_fib6_event drivers/net/netdevsim/fib.c:719 [inline] nsim_fib_event drivers/net/netdevsim/fib.c:744 [inline] nsim_fib_event_nb+0x1b16/0x2600 drivers/net/netdevsim/fib.c:772 notifier_call_chain+0xc2/0x230 kernel/notifier.c:83 __atomic_notifier_call_chain+0xa6/0x1a0 kernel/notifier.c:173 atomic_notifier_call_chain+0x2e/0x40 kernel/notifier.c:183 call_fib_notifiers+0x173/0x2a0 net/core/fib_notifier.c:35 call_fib6_notifiers+0x4b/0x60 net/ipv6/fib6_notifier.c:22 call_fib6_entry_notifiers+0xfb/0x150 net/ipv6/ip6_fib.c:399 fib6_add_rt2node net/ipv6/ip6_fib.c:1216 [inline] fib6_add+0x20cd/0x3ec0 net/ipv6/ip6_fib.c:1471 __ip6_ins_rt+0x54/0x80 net/ipv6/route.c:1315 ip6_ins_rt+0x96/0xd0 net/ipv6/route.c:1325 __ipv6_dev_ac_inc+0x76f/0xb20 net/ipv6/anycast.c:324 ipv6_sock_ac_join+0x4c1/0x790 net/ipv6/anycast.c:139 do_ipv6_setsockopt.isra.0+0x3908/0x4290 net/ipv6/ipv6_sockglue.c:670 ipv6_setsockopt+0xff/0x180 net/ipv6/ipv6_sockglue.c:944 udpv6_setsockopt+0x68/0xb0 net/ipv6/udp.c:1564 sock_common_setsockopt+0x94/0xd0 net/core/sock.c:3149 __sys_setsockopt+0x261/0x4c0 net/socket.c:2130 __do_sys_setsockopt net/socket.c:2146 [inline] __se_sys_setsockopt net/socket.c:2143 [inline] __x64_sys_setsockopt+0xbe/0x150 net/socket.c:2143 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45aff9 Fixes: 48bb9eb47b27 ("netdevsim: fib: Add dummy implementation for FIB offload") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17net: xen-netback: hash.c: Use built-in RCU list checkingMadhuparna Bhowmik1-2/+4
list_for_each_entry_rcu has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik04@gmail.com> Acked-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-16xdp: Use bulking for non-map XDP_REDIRECT and consolidate code pathsToke Høiland-Jørgensen3-4/+4
Since the bulk queue used by XDP_REDIRECT now lives in struct net_device, we can re-use the bulking for the non-map version of the bpf_redirect() helper. This is a simple matter of having xdp_do_redirect_slow() queue the frame on the bulk queue instead of sending it out with __bpf_tx_xdp(). Unfortunately we can't make the bpf_redirect() helper return an error if the ifindex doesn't exit (as bpf_redirect_map() does), because we don't have a reference to the network namespace of the ingress device at the time the helper is called. So we have to leave it as-is and keep the device lookup in xdp_do_redirect_slow(). Since this leaves less reason to have the non-map redirect code in a separate function, so we get rid of the xdp_do_redirect_slow() function entirely. This does lose us the tracepoint disambiguation, but fortunately the xdp_redirect and xdp_redirect_map tracepoints use the same tracepoint entry structures. This means both can contain a map index, so we can just amend the tracepoint definitions so we always emit the xdp_redirect(_err) tracepoints, but with the map ID only populated if a map is present. This means we retire the xdp_redirect_map(_err) tracepoints entirely, but keep the definitions around in case someone is still listening for them. With this change, the performance of the xdp_redirect sample program goes from 5Mpps to 8.4Mpps (a 68% increase). Since the flush functions are no longer map-specific, rename the flush() functions to drop _map from their names. One of the renamed functions is the xdp_do_flush_map() callback used in all the xdp-enabled drivers. To keep from having to update all drivers, use a #define to keep the old name working, and only update the virtual drivers in this patch. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/157918768505.1458396.17518057312953572912.stgit@toke.dk
2020-01-16net/mlx5: E-Switch, Increase number of chains and prioritiesPaul Blakey3-12/+232
Increase the number of chains and priorities to support the whole range available in tc. We use unmanaged tables and ignore flow level to create more tables than what we declared to fs_core steering, and we manage the connections between the tables themselves. To support that we need FW with ignore_flow_level capability. Otherwise the old behaviour will be used, where we are limited by the number of levels we declared (4 chains, 16 prios). Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: E-Switch, Refactor chains and prioritiesPaul Blakey7-292/+637
To support the entire chain and prio range (32bit + 16bit), instead of a using a static array of chains/prios of limited size, create them dynamically, and use a rhashtable to search for existing chains/prio combinations. This will be used in next patch to actually increase the number using unamanged tables support and ignore flow level capability. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: ft: Check prio and chain sanity for ft offloadPaul Blakey1-7/+20
Before changing the chain from original chain to ft offload chain, make sure user doesn't actually use chains. While here, normalize the prio range to that which we support. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>