summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2021-10-21net: enetc: make sure all traffic classes can send large framesVladimir Oltean1-1/+4
The enetc driver does not implement .ndo_change_mtu, instead it configures the MAC register field PTC{Traffic Class}MSDUR[MAXSDU] statically to a large value during probe time. The driver used to configure only the max SDU for traffic class 0, and that was fine while the driver could only use traffic class 0. But with the introduction of mqprio, sending a large frame into any other TC than 0 is broken. This patch fixes that by replicating per traffic class the static configuration done in enetc_configure_port_mac(). Fixes: cbe9e835946f ("enetc: Enable TC offloading with mqprio") Reported-by: Richie Pearn <richard.pearn@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: <Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20211020173340.1089992-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-21net: enetc: fix ethtool counter name for PM0_TERRVladimir Oltean1-1/+1
There are two counters named "MAC tx frames", one of them is actually incorrect. The correct name for that counter should be "MAC tx error frames", which is symmetric to the existing "MAC rx error frames". Fixes: 16eb4c85c964 ("enetc: Add ethtool statistics") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: <Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20211020165206.1069889-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-21sfc: Don't use netif_info before net_device setupErik Ekman2-3/+3
Use pci_info instead to avoid unnamed/uninitialized noise: [197088.688729] sfc 0000:01:00.0: Solarflare NIC detected [197088.690333] sfc 0000:01:00.0: Part Number : SFN5122F [197088.729061] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no SR-IOV VFs probed [197088.729071] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no PTP support Inspired by fa44821a4ddd ("sfc: don't use netif_info et al before net_device is registered") from Heiner Kallweit. Signed-off-by: Erik Ekman <erik@kryo.se> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21sfc: Export fibre-specific supported link modesErik Ekman1-11/+26
The 1/10GbaseT modes were set up for cards with SFP+ cages in 3497ed8c852a5 ("sfc: report supported link speeds on SFP connections"). 10GbaseT was likely used since no 10G fibre mode existed. The missing fibre modes for 1/10G were added to ethtool.h in 5711a9822144 ("net: ethtool: add support for 1000BaseX and missing 10G link modes") shortly thereafter. The user guide available at https://support-nic.xilinx.com/wp/drivers lists support for the following cable and transceiver types in section 2.9: - QSFP28 100G Direct Attach Cables - QSFP28 100G SR Optical Transceivers (with SR4 modules listed) - SFP28 25G Direct Attach Cables - SFP28 25G SR Optical Transceivers - QSFP+ 40G Direct Attach Cables - QSFP+ 40G Active Optical Cables - QSFP+ 40G SR4 Optical Transceivers - QSFP+ to SFP+ Breakout Direct Attach Cables - QSFP+ to SFP+ Breakout Active Optical Cables - SFP+ 10G Direct Attach Cables - SFP+ 10G SR Optical Transceivers - SFP+ 10G LR Optical Transceivers - SFP 1000BASE‐T Transceivers - 1G Optical Transceivers (From user guide issue 28. Issue 16 which also includes older cards like SFN5xxx/SFN6xxx has matching lists for 1/10/40G transceiver types.) Regarding SFP+ 10GBASE‐T transceivers the latest guide says: "Solarflare adapters do not support 10GBASE‐T transceiver modules." Tested using SFN5122F-R7 (with 2 SFP+ ports). Supported link modes do not change depending on module used (tested with 1000BASE-T, 1000BASE-BX10, 10GBASE-LR). Before: $ ethtool ext Settings for ext: Supported ports: [ FIBRE ] Supported link modes: 1000baseT/Full 10000baseT/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: No Supported FEC modes: Not reported Advertised link modes: Not reported Advertised pause frame use: No Advertised auto-negotiation: No Advertised FEC modes: Not reported Link partner advertised link modes: Not reported Link partner advertised pause frame use: No Link partner advertised auto-negotiation: No Link partner advertised FEC modes: Not reported Speed: 1000Mb/s Duplex: Full Auto-negotiation: off Port: FIBRE PHYAD: 255 Transceiver: internal Current message level: 0x000020f7 (8439) drv probe link ifdown ifup rx_err tx_err hw Link detected: yes After: $ ethtool ext Settings for ext: Supported ports: [ FIBRE ] Supported link modes: 1000baseT/Full 1000baseX/Full 10000baseCR/Full 10000baseSR/Full 10000baseLR/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: No Supported FEC modes: Not reported Advertised link modes: Not reported Advertised pause frame use: No Advertised auto-negotiation: No Advertised FEC modes: Not reported Link partner advertised link modes: Not reported Link partner advertised pause frame use: No Link partner advertised auto-negotiation: No Link partner advertised FEC modes: Not reported Speed: 1000Mb/s Duplex: Full Auto-negotiation: off Port: FIBRE PHYAD: 255 Transceiver: internal Supports Wake-on: g Wake-on: d Current message level: 0x000020f7 (8439) drv probe link ifdown ifup rx_err tx_err hw Link detected: yes Signed-off-by: Erik Ekman <erik@kryo.se> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21Merge tag 'mlx5-fixes-2021-10-20' of ↵David S. Miller11-54/+85
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-fixes-2021-10-20 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20net/mlx5e: IPsec: Fix work queue entry ethernet segment checksum flagsEmeel Hakim1-9/+11
Current Work Queue Entry (WQE) checksum (csum) flags in the ethernet segment (eseg) in case of IPsec crypto offload datapath are not aligned with PRM/HW expectations. Currently the driver always sets the l3_inner_csum flag in case of IPsec because of the wrong usage of skb->encapsulation as indicator for inner IPsec header since skb->encapsulation is always ON for IPsec packets since IPsec itself is an encapsulation protocol. The above forced a failing attempts of calculating csum of non-existing segments (like in the IP|ESP|TCP packet case which does not have an l3_inner) which led to lots of packet drops hence the low throughput. Fix by using xo->inner_ipproto as indicator for inner IPsec header instead of skb->encapsulation in addition to setting the csum flags as following: * Tunnel Mode: * Pkt: MAC IP ESP IP L4 * CSUM: l3_cs | l3_inner_cs | l4_inner_cs * * Transport Mode: * Pkt: MAC IP ESP L4 * CSUM: l3_cs [ | l4_cs (checksum partial case)] * * Tunnel(VXLAN TCP/UDP) over Transport Mode * Pkt: MAC IP ESP UDP VXLAN IP L4 * CSUM: l3_cs | l3_inner_cs | l4_inner_cs Fixes: f1267798c980 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20net/mlx5e: IPsec: Fix a misuse of the software parser's fieldsEmeel Hakim1-24/+27
IPsec crypto offload current Software Parser (SWP) fields settings in the ethernet segment (eseg) are not aligned with PRM/HW expectations. Among others in case of IP|ESP|TCP packet, current driver sets the offsets for inner_l3 and inner_l4 although there is no inner l3/l4 headers relative to ESP header in such packets. SWP provides the offsets for HW ,so it can be used to find csum fields to offload the checksum, however these are not necessarily used by HW and are used as fallback in case HW fails to parse the packet, e.g when performing IPSec Transport Aware (IP | ESP | TCP) there is no need to add SW parse on inner packet. So in some cases packets csum was calculated correctly , whereas in other cases it failed. The later faced csum errors (caused by wrong packet length calculations) which led to lots of packet drops hence the low throughput. Fix by setting the SWP fields as expected in a IP|ESP|TCP packet. the following describe the expected SWP offsets: * Tunnel Mode: * SWP: OutL3 InL3 InL4 * Pkt: MAC IP ESP IP L4 * * Transport Mode: * SWP: OutL3 OutL4 * Pkt: MAC IP ESP L4 * * Tunnel(VXLAN TCP/UDP) over Transport Mode * SWP: OutL3 InL3 InL4 * Pkt: MAC IP ESP UDP VXLAN IP L4 Fixes: f1267798c980 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20net/mlx5e: Fix vlan data lost during suspend flowMoshe Shemesh3-12/+26
During suspend flow the driver calls mlx5e_destroy_vlan_table() which does not only delete the vlans steering flow rules, but also frees the data on currently active vlans, thus it is not restored during resume flow. This fix keeps the vlan data on suspend flow and frees it only on driver remove flow. Fixes: 6783f0a21a3c ("net/mlx5e: Dynamic alloc vlan table for netdev when needed") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20net/mlx5: E-switch, Return correct error code on group creation failureDmytro Linkin1-4/+3
Dan Carpenter report: The patch f47e04eb96e0: "net/mlx5: E-switch, Allow setting share/max tx rate limits of rate groups" from May 31, 2021, leads to the following Smatch static checker warning: drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c:483 esw_qos_create_rate_group() warn: passing zero to 'ERR_PTR' If min rate normalization failed then error code may be overwritten to 0 if scheduling element destruction succeed. Ignore this value and always return initial one. Fixes: f47e04eb96e0 ("net/mlx5: E-switch, Allow setting share/max tx rate limits of rate groups") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20net/mlx5: Lag, change multipath and bonding to be mutually exclusiveMaor Dickman5-5/+18
Both multipath and bonding events are changing the HW LAG state independently. Handling one of the features events while the other is already enabled can cause unwanted behavior, for example handling bonding event while multipath enabled will disable the lag and cause multipath to stop working. Fix it by ignoring bonding event while in multipath and ignoring FIB events while in bonding mode. Fixes: 544fe7c2e654 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20ice: Add missing E810 device idsTony Nguyen3-0/+8
As part of support for E810 XXV devices, some device ids were inadvertently left out. Add those missing ids. Fixes: 195fb97766da ("ice: add additional E810 device id") Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
2021-10-20igc: Update I226_K device IDSasha Neftin1-1/+1
The device ID for I226_K was incorrectly assigned, update the device ID to the correct one. Fixes: bfa5e98c9de4 ("igc: Add new device ID") Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-20e1000e: Fix packet loss on Tiger Lake and laterSasha Neftin2-1/+13
Update the HW MAC initialization flow. Do not gate DMA clock from the modPHY block. Keeping this clock will prevent dropped packets sent in burst mode on the Kumeran interface. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213651 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213377 Fixes: fb776f5d57ee ("e1000e: Add support for Tiger Lake") Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Mark Pearson <markpearson@lenovo.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-20e1000e: Separate TGP board type from SPTSasha Neftin3-23/+46
We have the same LAN controller on different PCHs. Separate TGP board type from SPT which will allow for specific fixes to be applied for TGP platforms. Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Mark Pearson <markpearson@lenovo.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-20net: stmmac: Fix E2E delay mechanismKurt Kanzenbach1-1/+1
When utilizing End to End delay mechanism, the following error messages show up: |root@ehl1:~# ptp4l --tx_timestamp_timeout=50 -H -i eno2 -E -m |ptp4l[950.573]: selected /dev/ptp3 as PTP clock |ptp4l[950.586]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE |ptp4l[950.586]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE |ptp4l[952.879]: port 1: new foreign master 001395.fffe.4897b4-1 |ptp4l[956.879]: selected best master clock 001395.fffe.4897b4 |ptp4l[956.879]: port 1: assuming the grand master role |ptp4l[956.879]: port 1: LISTENING to GRAND_MASTER on RS_GRAND_MASTER |ptp4l[962.017]: port 1: received DELAY_REQ without timestamp |ptp4l[962.273]: port 1: received DELAY_REQ without timestamp |ptp4l[963.090]: port 1: received DELAY_REQ without timestamp Commit f2fb6b6275eb ("net: stmmac: enable timestamp snapshot for required PTP packets in dwmac v5.10a") already addresses this problem for the dwmac v5.10. However, same holds true for all dwmacs above version v4.10. Correct the check accordingly. Afterwards everything works as expected. Tested on Intel Atom(R) x6414RE Processor. Fixes: 14f347334bf2 ("net: stmmac: Correctly take timestamp for PTPv2") Fixes: f2fb6b6275eb ("net: stmmac: enable timestamp snapshot for required PTP packets in dwmac v5.10a") Suggested-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20net: hns3: disable sriov before unload hclge layerPeng Li3-0/+23
HNS3 driver includes hns3.ko, hnae3.ko and hclge.ko. hns3.ko includes network stack and pci_driver, hclge.ko includes HW device action, algo_ops and timer task, hnae3.ko includes some register function. When SRIOV is enable and hclge.ko is removed, HW device is unloaded but VF still exists, PF will not reply VF mbx messages, and cause errors. This patch fix it by disable SRIOV before remove hclge.ko. Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20net: hns3: fix vf reset workqueue cannot exitYufeng Mo1-3/+3
The task of VF reset is performed through the workqueue. It checks the value of hdev->reset_pending to determine whether to exit the loop. However, the value of hdev->reset_pending may also be assigned by the interrupt function hclgevf_misc_irq_handle(), which may cause the loop fail to exit and keep occupying the workqueue. This loop is not necessary, so remove it and the workqueue will be rescheduled if the reset needs to be retried or a new reset occurs. Fixes: 1cc9bc6e5867 ("net: hns3: split hclgevf_reset() into preparing and rebuilding part") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20net: hns3: schedule the polling again when allocation failsYunsheng Lin1-10/+12
Currently when there is a rx page allocation failure, it is possible that polling may be stopped if there is no more packet to be reveiced, which may cause queue stall problem under memory pressure. This patch makes sure polling is scheduled again when there is any rx page allocation failure, and polling will try to allocate receive buffers until it succeeds. Now the allocation retry is added, it is unnecessary to do the rx page allocation at the end of rx cleaning, so remove it. And reset the unused_count to zero after calling hns3_nic_alloc_rx_buffers() to avoid calling hns3_nic_alloc_rx_buffers() repeatedly under memory pressure. Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20net: hns3: fix for miscalculation of rx unused descYunsheng Lin2-0/+9
rx unused desc is the desc that need attatching new buffer before refilling to hw to receive new packet, the number of desc need attatching new buffer is calculated using next_to_use and next_to_clean. when next_to_use == next_to_clean, currently hns3 driver assumes that all the desc has the buffer attatched, but 'next_to_use == next_to_clean' also means all the desc need attatching new buffer if hw has comsumed all the desc and the driver has not attatched any buffer to the desc yet. This patch adds 'refill' in desc_cb to indicate whether a new buffer has been refilled to a desc. Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20net: hns3: fix the max tx size according to user manualYunsheng Lin2-9/+4
Currently the max tx size supported by the hw is calculated by using the max BD num supported by the hw. According to the hw user manual, the max tx size is fixed value for both non-TSO and TSO skb. This patch updates the max tx size according to the manual. Fixes: 8ae10cfb5089("net: hns3: support tx-scatter-gather-fraglist feature") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20net: hns3: add limit ets dwrr bandwidth cannot be 0Guangbin Huang1-0/+9
If ets dwrr bandwidth of tc is set to 0, the hardware will switch to SP mode. In this case, this tc may occupy all the tx bandwidth if it has huge traffic, so it violates the purpose of the user setting. To fix this problem, limit the ets dwrr bandwidth must greater than 0. Fixes: cacde272dd00 ("net: hns3: Add hclge_dcb module for the support of DCB feature") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20net: hns3: reset DWRR of unused tc to zeroGuangbin Huang1-0/+2
Currently, DWRR of tc will be initialized to a fixed value when this tc is enabled, but it is not been reset to 0 when this tc is disabled. It cause a problem that the DWRR of unused tc is not 0 after using tc tool to add and delete multi-tc parameters. For examples, after enabling 4 TCs and restoring to 1 TC by follow tc commands: $ tc qdisc add dev eth0 root mqprio num_tc 4 map 0 1 2 3 0 1 2 3 queues \ 8@0 8@8 8@16 8@24 hw 1 mode channel $ tc qdisc del dev eth0 root Now there is just one TC is enabled for eth0, but the tc info querying by debugfs is shown as follow: $ cat /mnt/hns3/0000:7d:00.0/tm/tc_sch_info enabled tc number: 1 weight_offset: 14 TC MODE WEIGHT 0 dwrr 100 1 dwrr 100 2 dwrr 100 3 dwrr 100 4 dwrr 0 5 dwrr 0 6 dwrr 0 7 dwrr 0 This patch fixes it by resetting DWRR of tc to 0 when tc is disabled. Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20net: hns3: Add configuration of TM QCN error eventJiaran Zhang2-1/+6
Add configuration of interrupt type and fifo interrupt enable of TM QCN error event if enabled, otherwise this event will not be reported when there is error. Fixes: d914971df022 ("net: hns3: remove redundant query in hclge_config_tm_hw_err_int()") Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19cavium: Fix return values of the probe functionZheyu Ma1-2/+2
During the process of driver probing, the probe function should return < 0 for failure, otherwise, the kernel will treat value > 0 as success. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18nfp: bpf: silence bitwise vs. logical OR warningNathan Chancellor1-2/+2
A new warning in clang points out two places in this driver where boolean expressions are being used with a bitwise OR instead of a logical one: drivers/net/ethernet/netronome/nfp/nfp_asm.c:199:20: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical] reg->src_lmextn = swreg_lmextn(lreg) | swreg_lmextn(rreg); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ || drivers/net/ethernet/netronome/nfp/nfp_asm.c:199:20: note: cast one or both operands to int to silence this warning drivers/net/ethernet/netronome/nfp/nfp_asm.c:280:20: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical] reg->src_lmextn = swreg_lmextn(lreg) | swreg_lmextn(rreg); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ || drivers/net/ethernet/netronome/nfp/nfp_asm.c:280:20: note: cast one or both operands to int to silence this warning 2 errors generated. The motivation for the warning is that logical operations short circuit while bitwise operations do not. In this case, it does not seem like short circuiting is harmful so implement the suggested fix of changing to a logical operation to fix the warning. Link: https://github.com/ClangBuiltLinux/linux/issues/1479 Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20211018193101.2340261-1-nathan@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-18cavium: Return negative value when pci_alloc_irq_vectors() failsZheyu Ma1-1/+1
During the process of driver probing, the probe function should return < 0 for failure, otherwise, the kernel will treat value > 0 as success. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18net: mscc: ocelot: Add of_node_put() before gotoWan Jiabing1-0/+1
Fix following coccicheck warning: ./drivers/net/ethernet/mscc/ocelot_vsc7514.c:946:1-33: WARNING: Function for_each_available_child_of_node should have of_node_put() before goto. Early exits from for_each_available_child_of_node should decrement the node reference counter. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18net: sparx5: Add of_node_put() before gotoWan Jiabing1-0/+1
Fix following coccicheck warning: ./drivers/net/ethernet/microchip/sparx5/s4parx5_main.c:723:1-33: WARNING: Function for_each_available_child_of_node should have of_node_put() before goto Early exits from for_each_available_child_of_node should decrement the node reference counter. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-14ice: Print the api_patch as part of the fw.mgmt.apiBrett Creeley1-1/+2
Currently when a user uses "devlink dev info", the fw.mgmt.api will be the major.minor numbers as shown below: devlink dev info pci/0000:3b:00.0 pci/0000:3b:00.0: driver ice serial_number 00-01-00-ff-ff-00-00-00 versions: fixed: board.id K91258-000 running: fw.mgmt 6.1.2 fw.mgmt.api 1.7 <--- No patch number included fw.mgmt.build 0xd75e7d06 fw.mgmt.srev 5 fw.undi 1.2992.0 fw.undi.srev 5 fw.psid.api 3.10 fw.bundle_id 0x800085cc fw.app.name ICE OS Default Package fw.app 1.3.27.0 fw.app.bundle_id 0xc0000001 fw.netlist 3.10.2000-3.1e.0 fw.netlist.build 0x2a76e110 stored: fw.mgmt.srev 5 fw.undi 1.2992.0 fw.undi.srev 5 fw.psid.api 3.10 fw.bundle_id 0x800085cc fw.netlist 3.10.2000-3.1e.0 fw.netlist.build 0x2a76e110 There are many features in the driver that depend on the major, minor, and patch version of the FW. Without the patch number in the output for fw.mgmt.api debugging issues related to the FW API version is difficult. Also, using major.minor.patch aligns with the existing firmware version which uses a 3 digit value. Fix this by making the fw.mgmt.api print the major.minor.patch versions. Shown below is the result: devlink dev info pci/0000:3b:00.0 pci/0000:3b:00.0: driver ice serial_number 00-01-00-ff-ff-00-00-00 versions: fixed: board.id K91258-000 running: fw.mgmt 6.1.2 fw.mgmt.api 1.7.9 <--- patch number included fw.mgmt.build 0xd75e7d06 fw.mgmt.srev 5 fw.undi 1.2992.0 fw.undi.srev 5 fw.psid.api 3.10 fw.bundle_id 0x800085cc fw.app.name ICE OS Default Package fw.app 1.3.27.0 fw.app.bundle_id 0xc0000001 fw.netlist 3.10.2000-3.1e.0 fw.netlist.build 0x2a76e110 stored: fw.mgmt.srev 5 fw.undi 1.2992.0 fw.undi.srev 5 fw.psid.api 3.10 fw.bundle_id 0x800085cc fw.netlist 3.10.2000-3.1e.0 fw.netlist.build 0x2a76e110 Fixes: ff2e5c700e08 ("ice: add basic handler for devlink .info_get") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-14ice: fix getting UDP tunnel entryMichal Swiatkowski1-2/+2
Correct parameters order in call to ice_tunnel_idx_to_entry function. Entry in sparse port table is correct when the idx is 0. For idx 1 one correct entry should be skipped, for idx 2 two of them should be skipped etc. Change if condition to be true when idx is 0, which means that previous valid entry of this tunnel type were skipped. Fixes: b20e6c17c468 ("ice: convert to new udp_tunnel infrastructure") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-14ice: Avoid crash from unnecessary IDA freeDave Ertman1-1/+5
In the remove path, there is an attempt to free the aux_idx IDA whether it was allocated or not. This can potentially cause a crash when unloading the driver on systems that do not initialize support for RDMA. But, this free cannot be gated by the status bit for RDMA, since it is allocated if the driver detects support for RDMA at probe time, but the driver can enter into a state where RDMA is not supported after the IDA has been allocated at probe time and this would lead to a memory leak. Initialize aux_idx to an invalid value and check for a valid value when unloading to determine if an IDA free is necessary. Fixes: d25a0fc41c1f9 ("ice: Initialize RDMA support") Reported-by: Jun Miao <jun.miao@windriver.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-14ice: Fix failure to re-add LAN/RDMA Tx queuesBrett Creeley3-0/+23
Currently if the VSI is rebuilt/removed and the RDMA PF driver is active the RDMA Tx queue scheduler node configuration will not be cleaned up. This will cause the rebuild/re-add of the VSI to fail due to the software structures not being correctly cleaned up for the VSI index. Fix this by always calling ice_rm_vsi_rdma_cfg() for all VSI. If there are no RDMA scheduler nodes created, then there is no harm in calling ice_rm_vsi_rdma_cfg(). This change applies to all VSI types, so if RDMA support is added for other VSI types they will also get this change. Fixes: 348048e724a0 ("ice: Implement iidc operations") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Jerzy Wiktor Jurkowski <jerzy.wiktor.jurkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-14mlxsw: thermal: Fix out-of-bounds memory accessesIdo Schimmel1-47/+5
Currently, mlxsw allows cooling states to be set above the maximum cooling state supported by the driver: # cat /sys/class/thermal/thermal_zone2/cdev0/type mlxsw_fan # cat /sys/class/thermal/thermal_zone2/cdev0/max_state 10 # echo 18 > /sys/class/thermal/thermal_zone2/cdev0/cur_state # echo $? 0 This results in out-of-bounds memory accesses when thermal state transition statistics are enabled (CONFIG_THERMAL_STATISTICS=y), as the transition table is accessed with a too large index (state) [1]. According to the thermal maintainer, it is the responsibility of the driver to reject such operations [2]. Therefore, return an error when the state to be set exceeds the maximum cooling state supported by the driver. To avoid dead code, as suggested by the thermal maintainer [3], partially revert commit a421ce088ac8 ("mlxsw: core: Extend cooling device with cooling levels") that tried to interpret these invalid cooling states (above the maximum) in a special way. The cooling levels array is not removed in order to prevent the fans going below 20% PWM, which would cause them to get stuck at 0% PWM. [1] BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x271/0x290 Read of size 4 at addr ffff8881052f7bf8 by task kworker/0:0/5 CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.15.0-rc3-custom-45935-gce1adf704b14 #122 Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2FO"/"SA000874", BIOS 4.6.5 03/08/2016 Workqueue: events_freezable_power_ thermal_zone_device_check Call Trace: dump_stack_lvl+0x8b/0xb3 print_address_description.constprop.0+0x1f/0x140 kasan_report.cold+0x7f/0x11b thermal_cooling_device_stats_update+0x271/0x290 __thermal_cdev_update+0x15e/0x4e0 thermal_cdev_update+0x9f/0xe0 step_wise_throttle+0x770/0xee0 thermal_zone_device_update+0x3f6/0xdf0 process_one_work+0xa42/0x1770 worker_thread+0x62f/0x13e0 kthread+0x3ee/0x4e0 ret_from_fork+0x1f/0x30 Allocated by task 1: kasan_save_stack+0x1b/0x40 __kasan_kmalloc+0x7c/0x90 thermal_cooling_device_setup_sysfs+0x153/0x2c0 __thermal_cooling_device_register.part.0+0x25b/0x9c0 thermal_cooling_device_register+0xb3/0x100 mlxsw_thermal_init+0x5c5/0x7e0 __mlxsw_core_bus_device_register+0xcb3/0x19c0 mlxsw_core_bus_device_register+0x56/0xb0 mlxsw_pci_probe+0x54f/0x710 local_pci_probe+0xc6/0x170 pci_device_probe+0x2b2/0x4d0 really_probe+0x293/0xd10 __driver_probe_device+0x2af/0x440 driver_probe_device+0x51/0x1e0 __driver_attach+0x21b/0x530 bus_for_each_dev+0x14c/0x1d0 bus_add_driver+0x3ac/0x650 driver_register+0x241/0x3d0 mlxsw_sp_module_init+0xa2/0x174 do_one_initcall+0xee/0x5f0 kernel_init_freeable+0x45a/0x4de kernel_init+0x1f/0x210 ret_from_fork+0x1f/0x30 The buggy address belongs to the object at ffff8881052f7800 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 1016 bytes inside of 1024-byte region [ffff8881052f7800, ffff8881052f7c00) The buggy address belongs to the page: page:0000000052355272 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1052f0 head:0000000052355272 order:3 compound_mapcount:0 compound_pincount:0 flags: 0x200000000010200(slab|head|node=0|zone=2) raw: 0200000000010200 ffffea0005034800 0000000300000003 ffff888100041dc0 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8881052f7a80: 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc fc ffff8881052f7b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff8881052f7b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff8881052f7c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8881052f7c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [2] https://lore.kernel.org/linux-pm/9aca37cb-1629-5c67-1895-1fdc45c0244e@linaro.org/ [3] https://lore.kernel.org/linux-pm/af9857f2-578e-de3a-e62b-6baff7e69fd4@linaro.org/ CC: Daniel Lezcano <daniel.lezcano@linaro.org> Fixes: a50c1e35650b ("mlxsw: core: Implement thermal zone") Fixes: a421ce088ac8 ("mlxsw: core: Extend cooling device with cooling levels") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20211012174955.472928-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-14ethernet: s2io: fix setting mac address during resumeArnd Bergmann1-1/+1
After recent cleanups, gcc started warning about a suspicious memcpy() call during the s2io_io_resume() function: In function '__dev_addr_set', inlined from 'eth_hw_addr_set' at include/linux/etherdevice.h:318:2, inlined from 's2io_set_mac_addr' at drivers/net/ethernet/neterion/s2io.c:5205:2, inlined from 's2io_io_resume' at drivers/net/ethernet/neterion/s2io.c:8569:7: arch/x86/include/asm/string_32.h:182:25: error: '__builtin_memcpy' accessing 6 bytes at offsets 0 and 2 overlaps 4 bytes at offset 2 [-Werror=restrict] 182 | #define memcpy(t, f, n) __builtin_memcpy(t, f, n) | ^~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/netdevice.h:4648:9: note: in expansion of macro 'memcpy' 4648 | memcpy(dev->dev_addr, addr, len); | ^~~~~~ What apparently happened is that an old cleanup changed the calling conventions for s2io_set_mac_addr() from taking an ethernet address as a character array to taking a struct sockaddr, but one of the callers was not changed at the same time. Change it to instead call the low-level do_s2io_prog_unicast() function that still takes the old argument type. Fixes: 2fd376884558 ("S2io: Added support set_mac_address driver entry point") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211013143613.2049096-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-13net: encx24j600: check error in devm_regmap_init_encx24j600Nanyong Sun3-5/+14
devm_regmap_init may return error which caused by like out of memory, this will results in null pointer dereference later when reading or writing register: general protection fault in encx24j600_spi_probe KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097] CPU: 0 PID: 286 Comm: spi-encx24j600- Not tainted 5.15.0-rc2-00142-g9978db750e31-dirty #11 9c53a778c1306b1b02359f3c2bbedc0222cba652 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:regcache_cache_bypass drivers/base/regmap/regcache.c:540 Code: 54 41 89 f4 55 53 48 89 fb 48 83 ec 08 e8 26 94 a8 fe 48 8d bb a0 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 4a 03 00 00 4c 8d ab b0 00 00 00 48 8b ab a0 00 RSP: 0018:ffffc900010476b8 EFLAGS: 00010207 RAX: dffffc0000000000 RBX: fffffffffffffff4 RCX: 0000000000000000 RDX: 0000000000000012 RSI: ffff888002de0000 RDI: 0000000000000094 RBP: ffff888013c9a000 R08: 0000000000000000 R09: fffffbfff3f9cc6a R10: ffffc900010476e8 R11: fffffbfff3f9cc69 R12: 0000000000000001 R13: 000000000000000a R14: ffff888013c9af54 R15: ffff888013c9ad08 FS: 00007ffa984ab580(0000) GS:ffff88801fe00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055a6384136c8 CR3: 000000003bbe6003 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: encx24j600_spi_probe drivers/net/ethernet/microchip/encx24j600.c:459 spi_probe drivers/spi/spi.c:397 really_probe drivers/base/dd.c:517 __driver_probe_device drivers/base/dd.c:751 driver_probe_device drivers/base/dd.c:782 __device_attach_driver drivers/base/dd.c:899 bus_for_each_drv drivers/base/bus.c:427 __device_attach drivers/base/dd.c:971 bus_probe_device drivers/base/bus.c:487 device_add drivers/base/core.c:3364 __spi_add_device drivers/spi/spi.c:599 spi_add_device drivers/spi/spi.c:641 spi_new_device drivers/spi/spi.c:717 new_device_store+0x18c/0x1f1 [spi_stub 4e02719357f1ff33f5a43d00630982840568e85e] dev_attr_store drivers/base/core.c:2074 sysfs_kf_write fs/sysfs/file.c:139 kernfs_fop_write_iter fs/kernfs/file.c:300 new_sync_write fs/read_write.c:508 (discriminator 4) vfs_write fs/read_write.c:594 ksys_write fs/read_write.c:648 do_syscall_64 arch/x86/entry/common.c:50 entry_SYSCALL_64_after_hwframe arch/x86/entry/entry_64.S:113 Add error check in devm_regmap_init_encx24j600 to avoid this situation. Fixes: 04fbfce7a222 ("net: Microchip encx24j600 driver") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Link: https://lore.kernel.org/r/20211012125901.3623144-1-sunnanyong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-13Merge tag 'mlx5-fixes-2021-10-12' of ↵Jakub Kicinski4-16/+66
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-10-12 * tag 'mlx5-fixes-2021-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix division by 0 in mlx5e_select_queue for representors net/mlx5e: Mutually exclude RX-FCS and RX-port-timestamp net/mlx5e: Switchdev representors are not vlan challenged net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path net/mlx5e: Allow only complete TXQs partition in MQPRIO channel mode net/mlx5: Fix cleanup of bridge delayed work ==================== Link: https://lore.kernel.org/r/20211012205323.20123-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-13net: korina: select CRC32Vegard Nossum1-0/+1
Fix the following build/link error by adding a dependency on the CRC32 routines: ld: drivers/net/ethernet/korina.o: in function `korina_multicast_list': korina.c:(.text+0x1af): undefined reference to `crc32_le' Fixes: ef11291bcd5f9 ("Add support the Korina (IDT RC32434) Ethernet MAC") Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Florian fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20211012152509.21771-1-vegard.nossum@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-13net: arc: select CRC32Vegard Nossum1-0/+1
Fix the following build/link error by adding a dependency on the CRC32 routines: ld: drivers/net/ethernet/arc/emac_main.o: in function `arc_emac_set_rx_mode': emac_main.c:(.text+0xb11): undefined reference to `crc32_le' The crc32_le() call comes through the ether_crc_le() call in arc_emac_set_rx_mode(). [v2: moved the select to ARC_EMAC_CORE; the Makefile is a bit confusing, but the error comes from emac_main.o, which is part of the arc_emac module, which in turn is enabled by CONFIG_ARC_EMAC_CORE. Note that arc_emac is different from emac_arc...] Fixes: 775dd682e2b0ec ("arc_emac: implement promiscuous mode and multicast filtering") Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Link: https://lore.kernel.org/r/20211012093446.1575-1-vegard.nossum@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12net: dsa: tag_ocelot: break circular dependency with ocelot switch lib driverVladimir Oltean2-17/+1
As explained here: https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/ DSA tagging protocol drivers cannot depend on symbols exported by switch drivers, because this creates a circular dependency that breaks module autoloading. The tag_ocelot.c file depends on the ocelot_ptp_rew_op() function exported by the common ocelot switch lib. This function looks at OCELOT_SKB_CB(skb) and computes how to populate the REW_OP field of the DSA tag, for PTP timestamping (the command: one-step/two-step, and the TX timestamp identifier). None of that requires deep insight into the driver, it is quite stateless, as it only depends upon the skb->cb. So let's make it a static inline function and put it in include/linux/dsa/ocelot.h, a file that despite its name is used by the ocelot switch driver for populating the injection header too - since commit 40d3f295b5fe ("net: mscc: ocelot: use common tag parsing code with DSA"). With that function declared as static inline, its body is expanded inside each call site, so the dependency is broken and the DSA tagger can be built without the switch library, upon which the felix driver depends. Fixes: 39e5308b3250 ("net: mscc: ocelot: support PTP Sync one-step timestamping") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12net: mscc: ocelot: cross-check the sequence id from the timestamp FIFO with ↵Vladimir Oltean1-1/+23
the skb PTP header The sad reality is that when a PTP frame with a TX timestamping request is transmitted, it isn't guaranteed that it will make it all the way to the wire (due to congestion inside the switch), and that a timestamp will be taken by the hardware and placed in the timestamp FIFO where an IRQ will be raised for it. The implication is that if enough PTP frames are silently dropped by the hardware such that the timestamp ID has rolled over, it is possible to match a timestamp to an old skb. Furthermore, nobody will match on the real skb corresponding to this timestamp, since we stupidly matched on a previous one that was stale in the queue, and stopped there. So PTP timestamping will be broken and there will be no way to recover. It looks like the hardware parses the sequenceID from the PTP header, and also provides that metadata for each timestamp. The driver currently ignores this, but it shouldn't. As an extra resiliency measure, do the following: - check whether the PTP sequenceID also matches between the skb and the timestamp, treat the skb as stale otherwise and free it - if we see a stale skb, don't stop there and try to match an skb one more time, chances are there's one more skb in the queue with the same timestamp ID, otherwise we wouldn't have ever found the stale one (it is by timestamp ID that we matched it). While this does not prevent PTP packet drops, it at least prevents the catastrophic consequences of incorrect timestamp matching. Since we already call ptp_classify_raw in the TX path, save the result in the skb->cb of the clone, and just use that result in the interrupt code path. Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12net: mscc: ocelot: deny TX timestamping of non-PTP packetsVladimir Oltean1-7/+12
It appears that Ocelot switches cannot timestamp non-PTP frames, I tested this using the isochron program at: https://github.com/vladimiroltean/tsn-scripts with the result that the driver increments the ocelot_port->ts_id counter as expected, puts it in the REW_OP, but the hardware seems to not timestamp these packets at all, since no IRQ is emitted. Therefore check whether we are sending PTP frames, and refuse to populate REW_OP otherwise. Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12net: mscc: ocelot: warn when a PTP IRQ is raised for an unknown skbVladimir Oltean1-3/+3
When skb_match is NULL, it means we received a PTP IRQ for a timestamp ID that the kernel has no idea about, since there is no skb in the timestamping queue with that timestamp ID. This is a grave error and not something to just "continue" over. So print a big warning in case this happens. Also, move the check above ocelot_get_hwtimestamp(), there is no point in reading the full 64-bit current PTP time if we're not going to do anything with it anyway for this skb. Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12net: mscc: ocelot: avoid overflowing the PTP timestamp FIFOVladimir Oltean1-7/+30
PTP packets with 2-step TX timestamp requests are matched to packets based on the egress port number and a 6-bit timestamp identifier. All PTP timestamps are held in a common FIFO that is 128 entry deep. This patch ensures that back-to-back timestamping requests cannot exceed the hardware FIFO capacity. If that happens, simply send the packets without requesting a TX timestamp to be taken (in the case of felix, since the DSA API has a void return code in ds->ops->port_txtstamp) or drop them (in the case of ocelot). I've moved the ts_id_lock from a per-port basis to a per-switch basis, because we need separate accounting for both numbers of PTP frames in flight. And since we need locking to inc/dec the per-switch counter, that also offers protection for the per-port counter and hence there is no reason to have a per-port counter anymore. Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12net: mscc: ocelot: make use of all 63 PTP timestamp identifiersVladimir Oltean1-1/+3
At present, there is a problem when user space bombards a port with PTP event frames which have TX timestamping requests (or when a tc-taprio offload is installed on a port, which delays the TX timestamps by a significant amount of time). The driver will happily roll over the 2-bit timestamp ID and this will cause incorrect matches between an skb and the TX timestamp collected from the FIFO. The Ocelot switches have a 6-bit PTP timestamp identifier, and the value 63 is reserved, so that leaves identifiers 0-62 to be used. The timestamp identifiers are selected by the REW_OP packet field, and are actually shared between CPU-injected frames and frames which match a VCAP IS2 rule that modifies the REW_OP. The hardware supports partitioning between the two uses of the REW_OP field through the PTP_ID_LOW and PTP_ID_HIGH registers, and by default reserves the PTP IDs 0-3 for CPU-injected traffic and the rest for VCAP IS2. The driver does not use VCAP IS2 to set REW_OP for 2-step timestamping, and it also writes 0xffffffff to both PTP_ID_HIGH and PTP_ID_LOW in ocelot_init_timestamp() which makes all timestamp identifiers available to CPU injection. Therefore, we can make use of all 63 timestamp identifiers, which should allow more timestampable packets to be in flight on each port. This is only part of the solution, more issues will be addressed in future changes. Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12nfp: flow_offload: move flow_indr_dev_register from app init to app startBaowen Zheng1-5/+14
In commit 74fc4f828769 ("net: Fix offloading indirect devices dependency on qdisc order creation"), it adds a process to trigger the callback to setup the bo callback when the driver regists a callback. In our current implement, we are not ready to run the callback when nfp call the function flow_indr_dev_register, then there will be error message as: kernel: Oops: 0000 [#1] SMP PTI kernel: CPU: 0 PID: 14119 Comm: kworker/0:0 Tainted: G kernel: Workqueue: events work_for_cpu_fn kernel: RIP: 0010:nfp_flower_indr_setup_tc_cb+0x258/0x410 kernel: RSP: 0018:ffffbc1e02c57bf8 EFLAGS: 00010286 kernel: RAX: 0000000000000000 RBX: ffff9c761fabc000 RCX: 0000000000000001 kernel: RDX: 0000000000000001 RSI: fffffffffffffff0 RDI: ffffffffc0be9ef1 kernel: RBP: ffffbc1e02c57c58 R08: ffffffffc08f33aa R09: ffff9c6db7478800 kernel: R10: 0000009c003f6e00 R11: ffffbc1e02800000 R12: ffffbc1e000d9000 kernel: R13: ffffbc1e000db428 R14: ffff9c6db7478800 R15: ffff9c761e884e80 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: fffffffffffffff0 CR3: 00000009e260a004 CR4: 00000000007706f0 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 kernel: PKRU: 55555554 kernel: Call Trace: kernel: ? flow_indr_dev_register+0xab/0x210 kernel: ? __cond_resched+0x15/0x30 kernel: ? kmem_cache_alloc_trace+0x44/0x4b0 kernel: ? nfp_flower_setup_tc+0x1d0/0x1d0 [nfp] kernel: flow_indr_dev_register+0x158/0x210 kernel: ? tcf_block_unbind+0xe0/0xe0 kernel: nfp_flower_init+0x40b/0x650 [nfp] kernel: nfp_net_pci_probe+0x25f/0x960 [nfp] kernel: ? nfp_rtsym_read_le+0x76/0x130 [nfp] kernel: nfp_pci_probe+0x6a9/0x820 [nfp] kernel: local_pci_probe+0x45/0x80 So we need to call flow_indr_dev_register in app start process instead of init stage. Fixes: 74fc4f828769 ("net: Fix offloading indirect devices dependency on qdisc order creation") Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Link: https://lore.kernel.org/r/20211012124850.13025-1-louis.peens@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12net/mlx5e: Fix division by 0 in mlx5e_select_queue for representorsMaxim Mikityanskiy1-0/+5
Commit 846d6da1fcdb ("net/mlx5e: Fix division by 0 in mlx5e_select_queue") makes mlx5e_build_nic_params assign a non-zero initial value to priv->num_tc_x_num_ch, so that mlx5e_select_queue doesn't fail with division by 0 if called before the first activation of channels. However, the initialization flow of representors doesn't call mlx5e_build_nic_params, so this bug can still happen with representors. This commit fixes the bug by adding the missing assignment to mlx5e_build_rep_params. Fixes: 846d6da1fcdb ("net/mlx5e: Fix division by 0 in mlx5e_select_queue") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-12net/mlx5e: Mutually exclude RX-FCS and RX-port-timestampAya Levin1-5/+52
Due to current HW arch limitations, RX-FCS (scattering FCS frame field to software) and RX-port-timestamp (improved timestamp accuracy on the receive side) can't work together. RX-port-timestamp is not controlled by the user and it is enabled by default when supported by the HW/FW. This patch sets RX-port-timestamp opposite to RX-FCS configuration. Fixes: 102722fc6832 ("net/mlx5e: Add support for RXFCS feature flag") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-12net/mlx5e: Switchdev representors are not vlan challengedSaeed Mahameed1-1/+0
Before this patch, mlx5 representors advertised the NETIF_F_VLAN_CHALLENGED bit, this could lead to missing features when using reps with vxlan/bridge and maybe other virtual interfaces, when such interfaces inherit this bit and block vlan usage in their topology. Example: $ip link add dev bridge type bridge # add representor interface to the bridge $ip link set dev pf0hpf master $ip link add link bridge name vlan10 type vlan id 10 protocol 802.1q Error: 8021q: VLANs not supported on device. Reps are perfectly capable of handling vlan traffic, although they don't implement vlan_{add,kill}_vid ndos, hence, remove NETIF_F_VLAN_CHALLENGED advertisement. Fixes: cb67b832921c ("net/mlx5e: Introduce SRIOV VF representors") Reported-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com>
2021-10-12net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error pathValentine Fatiev1-4/+3
Prior to this patch in case mlx5_core_destroy_cq() failed it returns without completing all destroy operations and that leads to memory leak. Instead, complete the destroy flow before return error. Also move mlx5_debug_cq_remove() to the beginning of mlx5_core_destroy_cq() to be symmetrical with mlx5_core_create_cq(). kmemleak complains on: unreferenced object 0xc000000038625100 (size 64): comm "ethtool", pid 28301, jiffies 4298062946 (age 785.380s) hex dump (first 32 bytes): 60 01 48 94 00 00 00 c0 b8 05 34 c3 00 00 00 c0 `.H.......4..... 02 00 00 00 00 00 00 00 00 db 7d c1 00 00 00 c0 ..........}..... backtrace: [<000000009e8643cb>] add_res_tree+0xd0/0x270 [mlx5_core] [<00000000e7cb8e6c>] mlx5_debug_cq_add+0x5c/0xc0 [mlx5_core] [<000000002a12918f>] mlx5_core_create_cq+0x1d0/0x2d0 [mlx5_core] [<00000000cef0a696>] mlx5e_create_cq+0x210/0x3f0 [mlx5_core] [<000000009c642c26>] mlx5e_open_cq+0xb4/0x130 [mlx5_core] [<0000000058dfa578>] mlx5e_ptp_open+0x7f4/0xe10 [mlx5_core] [<0000000081839561>] mlx5e_open_channels+0x9cc/0x13e0 [mlx5_core] [<0000000009cf05d4>] mlx5e_switch_priv_channels+0xa4/0x230 [mlx5_core] [<0000000042bbedd8>] mlx5e_safe_switch_params+0x14c/0x300 [mlx5_core] [<0000000004bc9db8>] set_pflag_tx_port_ts+0x9c/0x160 [mlx5_core] [<00000000a0553443>] mlx5e_set_priv_flags+0xd0/0x1b0 [mlx5_core] [<00000000a8f3d84b>] ethnl_set_privflags+0x234/0x2d0 [<00000000fd27f27c>] genl_family_rcv_msg_doit+0x108/0x1d0 [<00000000f495e2bb>] genl_family_rcv_msg+0xe4/0x1f0 [<00000000646c5c2c>] genl_rcv_msg+0x78/0x120 [<00000000d53e384e>] netlink_rcv_skb+0x74/0x1a0 Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Valentine Fatiev <valentinef@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-12net/mlx5e: Allow only complete TXQs partition in MQPRIO channel modeTariq Toukan1-2/+2
Do not allow configurations of MQPRIO channel mode that do not fully define and utilize the channels txqs. Fixes: ec60c4581bd9 ("net/mlx5e: Support MQPRIO channel mode") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>