summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2021-02-16net: phy: Add is_on_sfp_module flag and phy_on_sfp helperRobert Hancock2-0/+13
Add a flag and helper function to indicate that a PHY device is part of an SFP module, which is set on attach. This can be used by PHY drivers to handle SFP-specific quirks or behavior. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: phy: broadcom: Set proper 1000BaseX/SGMII interface mode for BCM54616SRobert Hancock2-12/+76
The default configuration for the BCM54616S PHY may not match the desired mode when using 1000BaseX or SGMII interface modes, such as when it is on an SFP module. Add code to explicitly set the correct mode using programming sequences provided by Bel-Fuse: https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-05-series.pdf https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-06-series.pdf Signed-off-by: Robert Hancock <robert.hancock@calian.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16lan743x: sync only the received area of an rx ring bufferSven Van Asbroeck1-9/+26
On cpu architectures w/o dma cache snooping, dma_unmap() is a is a very expensive operation, because its resulting sync needs to invalidate cpu caches. Increase efficiency/performance by syncing only those sections of the lan743x's rx ring buffers that are actually in use. Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16lan743x: boost performance on cpu archs w/o dma cache snoopingSven Van Asbroeck2-181/+148
The buffers in the lan743x driver's receive ring are always 9K, even when the largest packet that can be received (the mtu) is much smaller. This performs particularly badly on cpu archs without dma cache snooping (such as ARM): each received packet results in a 9K dma_{map|unmap} operation, which is very expensive because cpu caches need to be invalidated. Careful measurement of the driver rx path on armv7 reveals that the cpu spends the majority of its time waiting for cache invalidation. Optimize by keeping the rx ring buffer size as close as possible to the mtu. This limits the amount of cache that requires invalidation. This optimization would normally force us to re-allocate all ring buffers when the mtu is changed - a disruptive event, because it can only happen when the network interface is down. Remove the need to re-allocate all ring buffers by adding support for multi-buffer frames. Now any combination of mtu and ring buffer size will work. When the mtu changes from mtu1 to mtu2, consumed buffers of size mtu1 are lazily replaced by newly allocated buffers of size mtu2. These optimizations double the rx performance on armv7. Third parties report 3x rx speedup on armv8. Tested with iperf3 on a freescale imx6qp + lan7430, both sides set to mtu 1500 bytes, measure rx performance: Before: [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-20.00 sec 550 MBytes 231 Mbits/sec 0 After: [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-20.00 sec 1.33 GBytes 570 Mbits/sec 0 Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: enetc: fix destroyed phylink dereference during unbindVladimir Oltean1-2/+3
The following call path suggests that calling unregister_netdev on an interface that is up will first bring it down. enetc_pf_remove -> unregister_netdev -> unregister_netdevice_queue -> unregister_netdevice_many -> dev_close_many -> __dev_close_many -> enetc_close -> enetc_stop -> phylink_stop However, enetc first destroys the phylink instance, then calls unregister_netdev. This is already dissimilar to the setup (and error path teardown path) from enetc_pf_probe, but more than that, it is buggy because it is invalid to call phylink_stop after phylink_destroy. So let's first unregister the netdev (and let the .ndo_stop events consume themselves), then destroy the phylink instance, then free the netdev. Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16Merge branch 'net-mvneta-implement-basic-MQPrio-support'David S. Miller1-1/+69
Maxime Chevallier says: ==================== net: mvneta: implement basic MQPrio support This is V2 for the MQPrio support in mvneta. This small series adds basic support for mqprio offloading, by having the rx queueing mirroring the TCs based on VLAN prio fields. This was tested on Armada 3700, and proves useful to make sure high-priority traffic has a better chance not getting dropped when there's lots of packets incoming. The first patch of the series deals with the per-cpu interrupts on the armada 3700. Since they don't work, there were already some patches applied to keep all queue mappings to CPU0, but there still were some remaining mappings left to be dealt with. The second patch implements the MQPrio offloading for the receive path. Changes in V2 : - Add a Fixes tag for the first patch - Fix some warnings and the xmas tree in the second patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: mvneta: Implement mqprio supportMaxime Chevallier1-0/+61
Implement a basic MQPrio support, inserting rules in RX that translate the TC to prio mapping into vlan prio to queues. The TX logic stays the same as when we don't offload the qdisc. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: mvneta: Remove per-cpu queue mapping for Armada 3700Maxime Chevallier1-1/+8
According to Errata #23 "The per-CPU GbE interrupt is limited to Core 0", we can't use the per-cpu interrupt mechanism on the Armada 3700 familly. This is correctly checked for RSS configuration, but the initial queue mapping is still done by having the queues spread across all the CPUs in the system, both in the init path and in the cpu_hotplug path. Fixes: 2636ac3cc2b4 ("net: mvneta: Add network support for Armada 3700 SoC") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: sched: fix police ext initializationVlad Buslov3-1/+3
When police action is created by cls API tcf_exts_validate() first conditional that calls tcf_action_init_1() directly, the action idr is not updated according to latest changes in action API that require caller to commit newly created action to idr with tcf_idr_insert_many(). This results such action not being accessible through act API and causes crash reported by syzbot: ================================================================== BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline] BUG: KASAN: null-ptr-deref in atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] BUG: KASAN: null-ptr-deref in __tcf_idr_release net/sched/act_api.c:178 [inline] BUG: KASAN: null-ptr-deref in tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598 Read of size 4 at addr 0000000000000010 by task kworker/u4:5/204 CPU: 0 PID: 204 Comm: kworker/u4:5 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 __kasan_report mm/kasan/report.c:400 [inline] kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413 check_memory_region_inline mm/kasan/generic.c:179 [inline] check_memory_region+0x13d/0x180 mm/kasan/generic.c:185 instrument_atomic_read include/linux/instrumented.h:71 [inline] atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] __tcf_idr_release net/sched/act_api.c:178 [inline] tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598 tc_action_net_exit include/net/act_api.h:151 [inline] police_exit_net+0x168/0x360 net/sched/act_police.c:390 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190 cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 ================================================================== Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 204 Comm: kworker/u4:5 Tainted: G B 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 panic+0x306/0x73d kernel/panic.c:231 end_report+0x58/0x5e mm/kasan/report.c:100 __kasan_report mm/kasan/report.c:403 [inline] kasan_report.cold+0x67/0xd5 mm/kasan/report.c:413 check_memory_region_inline mm/kasan/generic.c:179 [inline] check_memory_region+0x13d/0x180 mm/kasan/generic.c:185 instrument_atomic_read include/linux/instrumented.h:71 [inline] atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] __tcf_idr_release net/sched/act_api.c:178 [inline] tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598 tc_action_net_exit include/net/act_api.h:151 [inline] police_exit_net+0x168/0x360 net/sched/act_police.c:390 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190 cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 Kernel Offset: disabled Fix the issue by calling tcf_idr_insert_many() after successful action initialization. Fixes: 0fedc63fadf0 ("net_sched: commit action insertions together") Reported-by: syzbot+151e3e714d34ae4ce7e8@syzkaller.appspotmail.com Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16Merge branch 'mlx5-next' of ↵David S. Miller11-129/+501
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== pull-request: mlx5-next 2021-02-16 The patches in this pr are already submitted and reviewed through the netdev and rdma mailing lists. The series includes mlx5 HW bits and definitions for mlx5 real time clock translation and handling in the mlx5 driver clock module to enable and support such mode [1] [1] https://patchwork.kernel.org/project/netdevbpf/patch/20210212223042.449816-7-saeed@kernel.org/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16drivers: net: xilinx_emaclite: remove arch limitationGary Guo2-3/+2
The changes made in eccd540 is enough for xilinx_emaclite to run without problem on 64-bit systems. I have tested it on a Xilinx FPGA with RV64 softcore. The architecture limitation in Kconfig seems no longer necessary. A small change is included to print address with %lx instead of casting to int and print with %x. Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16Merge branch 'bridge-mrp-Extend-br_mrp_switchdev_'David S. Miller17-104/+715
Horatiu Vulturv says: ==================== bridge: mrp: Extend br_mrp_switchdev_* This patch series extends MRP switchdev to allow the SW to have a better understanding if the HW can implement the MRP functionality or it needs to help the HW to run it. There are 3 cases: - when HW can't implement at all the functionality. - when HW can implement a part of the functionality but needs the SW implement the rest. For example if it can't detect when it stops receiving MRP Test frames but it can copy the MRP frames to CPU to allow the SW to determine this. Another example is generating the MRP Test frames. If HW can't do that then the SW is used as backup. - when HW can implement completely the functionality. So, initially the SW tries to offload the entire functionality in HW, if that fails it tries offload parts of the functionality in HW and use the SW as helper and if also this fails then MRP can't run on this HW. Based on these new calls, implement the switchdev for Ocelot driver. This is an example where the HW can't run completely the functionality but it can help the SW to run it, by trapping all MRP frames to CPU. Also this patch series adds MRP support to DSA and implements the Felix driver which just reuse the Ocelot functions. This part was just compiled tested because I don't have any HW on which to do the actual tests. v4: - remove ifdef MRP from include/net/switchdev.h - move MRP implementation for Ocelot in a different file such that Felix driver can use it. - extend DSA with MRP support - implement MRP support for Felix. v3: - implement the switchdev calls needed by Ocelot driver. v2: - fix typos in comments and in commit messages - remove some of the comments - move repeated code in helper function - fix issue when deleting a node when sw_backup was true ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: dsa: felix: Add support for MRPHoratiu Vultur2-0/+46
Implement functions 'port_mrp_add', 'port_mrp_del', 'port_mrp_add_ring_role' and 'port_mrp_del_ring_role' to call the mrp functions from ocelot. Also all MRP frames that arrive to CPU on queue number OCELOT_MRP_CPUQ will be forward by the SW. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: dsa: add MRP supportHoratiu Vultur5-0/+213
Add support for offloading MRP in HW. Currently implement the switchdev calls 'SWITCHDEV_OBJ_ID_MRP', 'SWITCHDEV_OBJ_ID_RING_ROLE_MRP', to allow to create MRP instances and to set the role of these instances. Add DSA_NOTIFIER_MRP_ADD/DEL and DSA_NOTIFIER_MRP_ADD/DEL_RING_ROLE which calls to .port_mrp_add/del and .port_mrp_add/del_ring_role in the DSA driver for the switch. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: mscc: ocelot: Add support for MRPHoratiu Vultur6-1/+295
Add basic support for MRP. The HW will just trap all MRP frames on the ring ports to CPU and allow the SW to process them. In this way it is possible to for this node to behave both as MRM and MRC. Current limitations are: - it doesn't support Interconnect roles. - it supports only a single ring. - the HW should be able to do forwarding of MRP Test frames so the SW will not need to do this. So it would be able to have the role MRC without SW support. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16bridge: mrp: Update br_mrp to use new return values of br_mrp_switchdevHoratiu Vultur1-16/+27
Check the return values of the br_mrp_switchdev function. In case of: - BR_MRP_NONE, return the error to userspace, - BR_MRP_SW, continue with SW implementation, - BR_MRP_HW, continue without SW implementation, Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16bridge: mrp: Extend br_mrp_switchdev to detect better the errorsHoratiu Vultur2-77/+118
This patch extends the br_mrp_switchdev functions to be able to have a better understanding what cause the issue and if the SW needs to be used as a backup. There are the following cases: - when the code is compiled without CONFIG_NET_SWITCHDEV. In this case return success so the SW can continue with the protocol. Depending on the function, it returns 0 or BR_MRP_SW. - when code is compiled with CONFIG_NET_SWITCHDEV and the driver doesn't implement any MRP callbacks. In this case the HW can't run MRP so it just returns -EOPNOTSUPP. So the SW will stop further to configure the node. - when code is compiled with CONFIG_NET_SWITCHDEV and the driver fully supports any MRP functionality. In this case the SW doesn't need to do anything. The functions will return 0 or BR_MRP_HW. - when code is compiled with CONFIG_NET_SWITCHDEV and the HW can't run completely the protocol but it can help the SW to run it. For example, the HW can't support completely MRM role(can't detect when it stops receiving MRP Test frames) but it can redirect these frames to CPU. In this case it is possible to have a SW fallback. The SW will try initially to call the driver with sw_backup set to false, meaning that the HW should implement completely the role. If the driver returns -EOPNOTSUPP, the SW will try again with sw_backup set to false, meaning that the SW will detect when it stops receiving the frames but it needs HW support to redirect the frames to CPU. In case the driver returns 0 then the SW will continue to configure the node accordingly. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16bridge: mrp: Add 'enum br_mrp_hw_support'Horatiu Vultur1-0/+14
Add the enum br_mrp_hw_support that is used by the br_mrp_switchdev functions to allow the SW to detect the cases where HW can't implement the functionality or when SW is used as a backup. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16switchdev: mrp: Extend ring_role_mrp and in_role_mrpHoratiu Vultur1-0/+2
Add the member sw_backup to the structures switchdev_obj_ring_role_mrp and switchdev_obj_in_role_mrp. In this way the SW can call the driver in 2 ways, once when sw_backup is set to false, meaning that the driver should implement this completely in HW. And if that is not supported the SW will call again but with sw_backup set to true, meaning that the HW should help or allow the SW to run the protocol. For example when role is MRM, if the HW can't detect when it stops receiving MRP Test frames but it can trap these frames to CPU, then it needs to return -EOPNOTSUPP when sw_backup is false and return 0 when sw_backup is true. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16switchdev: mrp: Remove CONFIG_BRIDGE_MRPHoratiu Vultur1-10/+0
Remove #IS_ENABLED(CONFIG_BRIDGE_MRP) from switchdev.h. This will simplify the code implements MRP callbacks and will be similar with the vlan filtering. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: phy: marvell: Ensure SGMII auto-negotiation is enabled for 88E1111Robert Hancock1-5/+8
When 88E1111 is operating in SGMII mode, auto-negotiation should be enabled on the SGMII side so that the link will come up properly with PCSes which normally have auto-negotiation enabled. This is normally the case when the PHY defaults to SGMII mode at power-up, however if we switched it from some other mode like 1000Base-X, as may happen in some SFP module situations, it may not be, particularly for modules which have 1000Base-X auto-negotiation defaulting to disabled. Call genphy_check_and_restart_aneg on the fiber page to ensure that auto- negotiation is properly enabled on the SGMII interface. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16Merge branch 'Add-5gbase-r-PHY-interface-mode'David S. Miller5-0/+18
Marek Behún says: ====================- Add 5gbase-r PHY interface mode there is still some testing needed for Amethyst patches, so I have split the part adding support for 5gbase-r interface mode and am sending it alone. The first two patches are already reviewed. Changes since last patches (Amethyst v16): - added phylink 5gbase-r handler - added SFP support for 5gbase-r mode ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16sfp: add support for 5gbase-t SFPsMarek Behún1-0/+3
The sfp_parse_support() function is setting 5000baseT_Full in some cases. Now that we have PHY_INTERFACE_MODE_5GBASER interface mode available, change sfp_select_interface() to return PHY_INTERFACE_MODE_5GBASER if 5000baseT_Full is set in the link mode mask. Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: phylink: Add 5gbase-r supportMarek Behún1-0/+4
Add 5GBASER interface type and speed to phylink. Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: phy: Add 5GBASER interface modePavana Sharma2-0/+10
Add 5GBASE-R phy interface mode Signed-off-by: Pavana Sharma <pavana.sharma@digi.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16dt-bindings: net: Add 5GBASER phy interfacePavana Sharma1-0/+1
Add 5gbase-r PHY interface mode. Signed-off-by: Pavana Sharma <pavana.sharma@digi.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16tg3: Remove unused PHY_BRCM flagsFlorian Fainelli2-12/+3
The tg3 driver tried to communicate towards the PHY driver whether it wanted RGMII in-band signaling enabled or disabled however there is nothing that looks at those flags in drivers/net/phy/broadcom.c so this does do not anything. Suggested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16Merge branch 'amd-xgbe-fixes'David S. Miller4-3/+54
Shyam Sundar S K says: ==================== Bug fixes to amd-xgbe driver General fixes on amd-xgbe driver are addressed in this series, mostly on the mailbox communication failures and improving the link stability of the amd-xgbe device. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: amd-xgbe: Fix network fluctuations when using 1G BELFUSE SFPShyam Sundar S K1-0/+3
Frequent link up/down events can happen when a Bel Fuse SFP part is connected to the amd-xgbe device. Try to avoid the frequent link issues by resetting the PHY as documented in Bel Fuse SFP datasheets. Fixes: e722ec82374b ("amd-xgbe: Update the BelFuse quirk to support SGMII") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: amd-xgbe: Reset link when the link never comes backShyam Sundar S K2-1/+9
Normally, auto negotiation and reconnect should be automatically done by the hardware. But there seems to be an issue where auto negotiation has to be restarted manually. This happens because of link training and so even though still connected to the partner the link never "comes back". This needs an auto-negotiation restart. Also, a change in xgbe-mdio is needed to get ethtool to recognize the link down and get the link change message. This change is only required in a backplane connection mode. Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: amd-xgbe: Fix NETDEV WATCHDOG transmit queue timeout warningShyam Sundar S K2-1/+1
The current driver calls netif_carrier_off() late in the link tear down which can result in a netdev watchdog timeout. Calling netif_carrier_off() immediately after netif_tx_stop_all_queues() avoids the warning. ------------[ cut here ]------------ NETDEV WATCHDOG: enp3s0f2 (amd-xgbe): transmit queue 0 timed out WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x20d/0x220 Modules linked in: amd_xgbe(E) amd-xgbe 0000:03:00.2 enp3s0f2: Link is Down CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E Hardware name: AMD Bilby-RV2/Bilby-RV2, BIOS RBB1202A 10/18/2019 RIP: 0010:dev_watchdog+0x20d/0x220 Code: 00 49 63 4e e0 eb 92 4c 89 e7 c6 05 c6 e2 c1 00 01 e8 e7 ce fc ff 89 d9 48 RSP: 0018:ffff90cfc28c3e88 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006 RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff90cfc28d63c0 RBP: ffff90cfb977845c R08: 0000000000000050 R09: 0000000000196018 R10: ffff90cfc28c3ef8 R11: 0000000000000000 R12: ffff90cfb9778000 R13: 0000000000000003 R14: ffff90cfb9778480 R15: 0000000000000010 FS: 0000000000000000(0000) GS:ffff90cfc28c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f240ff2d9d0 CR3: 00000001e3e0a000 CR4: 00000000003406e0 Call Trace: <IRQ> ? pfifo_fast_reset+0x100/0x100 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? enqueue_hrtimer+0x39/0x90 Fixes: e722ec82374b ("amd-xgbe: Update the BelFuse quirk to support SGMII") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: amd-xgbe: Reset the PHY rx data path when mailbox command timeoutShyam Sundar S K2-1/+41
Sometimes mailbox commands timeout when the RX data path becomes unresponsive. This prevents the submission of new mailbox commands to DXIO. This patch identifies the timeout and resets the RX data path so that the next message can be submitted properly. Fixes: 549b32af9f7c ("amd-xgbe: Simplify mailbox interface rate change code") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16Merge branch 'Fixes-applied-to-VCS8514'David S. Miller5-255/+1063
Bjarni Jonasson says: ==================== Fixes applied to VCS8514 3 different fixes applied to VSC8514: LCPLL reset, serdes calibration and coma mode disabled. Especially the serdes calibration is large and is now placed in a new file 'mscc_serdes.c' which can act as a placeholder for future serdes configuration. v1 -> v2: Preserved reversed christmas tree Removed forward definitions Fixed build issues Changed net to net-next v2 -> v3: Added cover letter. Removed ena_clk_bypass from function call Created mscc_serdes.c and .h for serdes configuration Modified coma register config. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: phy: mscc: coma mode disabled for VSC8514Bjarni Jonasson2-0/+20
The 'coma mode' (configurable through sw or hw) provides an optional feature that may be used to control when the PHYs become active. The typical usage is to synchronize the link-up time across all PHY instances. This patch releases coma mode if not done by hardware, otherwise the phys will not link-up. Fixes: e4f9ba642f0b ("net: phy: mscc: add support for VSC8514 PHY.") Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: phy: mscc: improved serdes calibration applied to VSC8514Bjarni Jonasson5-166/+840
The current IB serdes calibration algorithm (performed by the onboard 8051) has proven to be unstable for the VSC8514 QSGMII phy. A new algorithm has been developed based on 'Frequency-offset Jittered-Injection' or 'FoJi' method which solves all known issues. This patch disables the 8051 algorithm and replaces it with the new FoJi algorithm. The calibration is now performed in a new file (mscc_serdes.c), which can act as an placeholder for future serdes configurations. Fixes: e4f9ba642f0b ("net: phy: mscc: add support for VSC8514 PHY.") Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: phy: mscc: adding LCPLL reset to VSC8514Bjarni Jonasson2-122/+236
At Power-On Reset, transients may cause the LCPLL to lock onto a clock that is momentarily unstable. This is normally seen in QSGMII setups where the higher speed 6G SerDes is being used. This patch adds an initial LCPLL Reset to the PHY (first instance) to avoid this issue. Fixes: e4f9ba642f0b ("net: phy: mscc: add support for VSC8514 PHY.") Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net/mlx5: Add cyc2time HW translation mode supportAya Levin8-38/+241
Device timestamp can be in real time mode (cycles to time translation is offloaded into the Hardware). With real time mode, HW provides timestamp which is already translated into nanoseconds. With this mode, driver adjusts both the HW and timecounter (to keep clock_info_page updated) using callbacks: adjfreq, adjtime and settime. HW clock modifications are done via MTUTC access reg commands. Driver is allowed to modify HW real time clock only if MCAM ptpcyc2realtime_modify capability is set. Add MTUTC set function to be used for configuring the HW real time clock. Modify existing code to support both internal timer (with conversion via timecounter_cyc2time() and real time (no conversions). Align the signatures of the helpers converting from timestamp to nanoseconds. With that, when allocating a queue assign the corresponding callback with respect to the capability. Adjust 1PPS timestamp calculation flows based on the timestamp mode. Cyc2time offload brings two major advantages: - Improve MTAE (Max Time Absolute Error) for HW TS by up to 160 ns over a 100% loaded CPU. - Faster data-path timestamp to nanoseconds, as translation is lock-less and done in HW. On real time mode, timestamp format is 32 high bits of seconds and 32 low bits of nanoseconds. On some flows, driver shall convert this format into nanoseconds wall-clock with REAL_TIME_TO_NS macro. HW supports a single clock, and it is shared by all functions on a device. In case real time clock is used, it is recommended to use a single GM to all device's functions. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-16net/mlx5: Move some PPS logic into helper functionsEran Ben Elisha1-40/+73
Some of PPS logic (timestamp calculations) fits only internal timer timestamp mode. Move these logics into helper functions. Later in the patchset cyc2time HW translation mode will expose its own PPS timestamp calculations. With this change, main flow will only hold calling PPS logic based on run time mode. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-16net/mlx5: Move all internal timer metadata into a dedicated structEran Ben Elisha3-51/+71
Internal timer mode (SW clock) requires some PTP clock related metadata structs. Real time mode (HW clock) will not need these metadata structs. This separation emphasize the different interfaces for HW clock and SW clock. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-16net/mlx5: Refactor init clock functionEran Ben Elisha1-23/+53
Function mlx5_init_clock() is responsible for internal PTP related metadata initializations. Break mlx5_init_clock() to sub functions, each takes care of its own logic. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-16net/mlx5: Add register layout to support real-time time-stampEran Ben Elisha3-3/+33
Add needed structure layouts and defines for MTUTC (Management UTC) register. MTUTC will be used for cyc2time HW translation. In addition, add cyc2time modify capability bit and init segment HCA real time address. Finally, add capability bits indicating which time-stamping format is supported per SQ and RQ. Add ts_format in the queue's context layout to allow configuration. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-16Merge branch 'Fix-buggy-brport-flags-offload-for-SJA1105-DSA'David S. Miller2-42/+77
Vladimir Oltean says: ==================== Fix buggy brport flags offload for SJA1105 DSA I am resending this series because the title and the patches were mixed up and these patches were lost. This series' cover letter was used as the merge commit for the unrelated "Fixing build breakage after "Merge branch 'Propagate-extack-for-switchdev-LANs-from-DSA'"" series, as can be seen below: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ca04422afd6998611a81d0ea1b61d5a5f4923f84 while the actual patches from the "Fix buggy brport flags offload for SJA1105 DSA" series were marked as superseded and not applied: https://patchwork.kernel.org/project/netdevbpf/cover/20210214155704.1784220-1-olteanv@gmail.com/ which they should have. I know with so many bugs I introduced it's hard to keep track, I'm sorry. Original series description: While testing software bridging on sja1105, I discovered that I managed to introduce two bugs in a single patch submitted recently to net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: dsa: sja1105: fix leakage of flooded frames outside bridging domainVladimir Oltean2-41/+76
Quite embarrasingly, I managed to fool myself into thinking that the flooding domain of sja1105 source ports is restricted by the forwarding domain, which it isn't. Frames which match an FDB entry are forwarded towards that entry's DESTPORTS restricted by REACH_PORT[SRC_PORT], while frames that don't match any FDB entry are forwarded towards FL_DOMAIN[SRC_PORT] or BC_DOMAIN[SRC_PORT]. This means we can't get away with doing the simple thing, and we must manage the flooding domain ourselves such that it is restricted by the forwarding domain. This new function must be called from the .port_bridge_join and .port_bridge_leave methods too, not just from .port_bridge_flags as we did before. Fixes: 4d9423549501 ("net: dsa: sja1105: offload bridge port flags to device") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: dsa: sja1105: fix configuration of source address learningVladimir Oltean1-1/+1
Due to a mistake, the driver always sets the address learning flag to the previously stored value, and not to the currently configured one. The bug is visible only in standalone ports mode, because when the port is bridged, the issue is masked by .port_stp_state_set which overwrites the address learning state to the proper value. Fixes: 4d9423549501 ("net: dsa: sja1105: offload bridge port flags to device") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16octeontx2-af: cn10k: Fixes CN10K RPM reference issueGeetha sowjanya4-6/+12
This patch fixes references to uninitialized variables and debugfs entry name for CN10K platform and HW_TSO flag check. Fixes: 3ad3f8f93c81 ("octeontx2-af: cn10k: MAC internal loopback support"). Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> v1-v2 - Clear HW_TSO flag for 96xx B0 version. This patch fixes the bug introduced by the commit 3ad3f8f93c81 ("octeontx2-af: cn10k: MAC internal loopback support"). These changes are not yet merged into net branch, hence submitting to net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: dsa: felix: perform teardown on error in felix_setupVladimir Oltean1-2/+19
If the driver fails to probe, it would be nice to not leak memory. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net: dsa: felix: don't deinitialize unused portsVladimir Oltean1-1/+5
ocelot_init_port is called only if dsa_is_unused_port == false, however ocelot_deinit_port is called unconditionally. This causes a warning in the skb_queue_purge inside ocelot_deinit_port saying that the spin lock protecting ocelot_port->tx_skbs was not initialized. Fixes: e5fb512d81d0 ("net: mscc: ocelot: deinitialize only initialized ports") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16ionic: Remove unused function pointer typedef ionic_reset_cbChen Lin1-2/+0
Remove the 'ionic_reset_cb' typedef as it is not used. Signed-off-by: Chen Lin <chen.lin5@zte.com.cn> Acked-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller156-1489/+5662
Daniel Borkmann says: ==================== pull-request: bpf-next 2021-02-16 The following pull-request contains BPF updates for your *net-next* tree. There's a small merge conflict between 7eeba1706eba ("tcp: Add receive timestamp support for receive zerocopy.") from net-next tree and 9cacf81f8161 ("bpf: Remove extra lock_sock for TCP_ZEROCOPY_RECEIVE") from bpf-next tree. Resolve as follows: [...] lock_sock(sk); err = tcp_zerocopy_receive(sk, &zc, &tss); err = BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sk, level, optname, &zc, &len, err); release_sock(sk); [...] We've added 116 non-merge commits during the last 27 day(s) which contain a total of 156 files changed, 5662 insertions(+), 1489 deletions(-). The main changes are: 1) Adds support of pointers to types with known size among global function args to overcome the limit on max # of allowed args, from Dmitrii Banshchikov. 2) Add bpf_iter for task_vma which can be used to generate information similar to /proc/pid/maps, from Song Liu. 3) Enable bpf_{g,s}etsockopt() from all sock_addr related program hooks. Allow rewriting bind user ports from BPF side below the ip_unprivileged_port_start range, both from Stanislav Fomichev. 4) Prevent recursion on fentry/fexit & sleepable programs and allow map-in-map as well as per-cpu maps for the latter, from Alexei Starovoitov. 5) Add selftest script to run BPF CI locally. Also enable BPF ringbuffer for sleepable programs, both from KP Singh. 6) Extend verifier to enable variable offset read/write access to the BPF program stack, from Andrei Matei. 7) Improve tc & XDP MTU handling and add a new bpf_check_mtu() helper to query device MTU from programs, from Jesper Dangaard Brouer. 8) Allow bpf_get_socket_cookie() helper also be called from [sleepable] BPF tracing programs, from Florent Revest. 9) Extend x86 JIT to pad JMPs with NOPs for helping image to converge when otherwise too many passes are required, from Gary Lin. 10) Verifier fixes on atomics with BPF_FETCH as well as function-by-function verification both related to zero-extension handling, from Ilya Leoshkevich. 11) Better kernel build integration of resolve_btfids tool, from Jiri Olsa. 12) Batch of AF_XDP selftest cleanups and small performance improvement for libbpf's xsk map redirect for newer kernels, from Björn Töpel. 13) Follow-up BPF doc and verifier improvements around atomics with BPF_FETCH, from Brendan Jackman. 14) Permit zero-sized data sections e.g. if ELF .rodata section contains read-only data from local variables, from Yonghong Song. 15) veth driver skb bulk-allocation for ndo_xdp_xmit, from Lorenzo Bianconi. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16net/mlx5: Add new timestamp mode bitsAharon Landau1-5/+49
These fields declare which timestamp mode is supported by the device per RQ/SQ/QP. In addition add the ts_format field to the select the mode for RQ/SQ/QP. Link: https://lore.kernel.org/r/20210209131107.698833-2-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>