Age | Commit message | Author | Files | Lines
2021-10-26netfilter: conntrack: skip confirmation and nat hooks in postrouting for vrfFlorian Westphal2-1/+27
The VRF driver invokes netfilter for output+postrouting hooks so that users can create rules that check for 'oif $vrf' rather than lower device name. Afterwards, ip stack calls those hooks again. This is a problem when conntrack is used with IP masquerading. masquerading has an internal check that re-validates the output interface to account for route changes. This check will trigger in the vrf case. If the -j MASQUERADE rule matched on the first iteration, then round 2 finds state->out->ifindex != nat->masq_index: the latter is the vrf index, but out->ifindex is the lower device. The packet gets dropped and the conntrack entry is invalidated. This change makes conntrack postrouting skip the nat hooks. Also skip confirmation. This allows the second round (postrouting invocation from ipv4/ipv6) to create nat bindings. This also prevents the second round from seeing packets that had their source address changed by the nat hook. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
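The failure mode described above can be sketched as a small user-space model. This is not the kernel code: `model_nat_state`, `masq_bind()` and `masq_revalidate_drops()` are hypothetical names, and treating `masq_index == 0` as "no binding yet" is an assumption made for the example. Only the comparison of the recorded index against the current output interface comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical model of the NAT extension state named in the commit text. */
struct model_nat_state {
	int masq_index; /* ifindex recorded when MASQUERADE matched; 0 = none (assumption) */
};

/* Record the output interface, as a matching MASQUERADE rule would. */
static void masq_bind(struct model_nat_state *nat, int out_ifindex)
{
	nat->masq_index = out_ifindex;
}

/* Re-validation: a recorded index that differs from the current one means drop. */
static bool masq_revalidate_drops(const struct model_nat_state *nat, int out_ifindex)
{
	return nat->masq_index != 0 && nat->masq_index != out_ifindex;
}
```

If round 1 binds with the vrf ifindex and round 2 re-validates with the lower device's ifindex, the model reports a drop. Creating the binding only in the second round, as the change does, makes both indexes refer to the lower device.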
2021-10-26Merge tag 'mlx5-updates-2021-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller27-93/+787
Saeed Mahameed says:
====================
mlx5-updates-2021-10-25
Misc updates for mlx5 driver:
1) Misc updates and cleanups:
 - Don't write directly to netdev->dev_addr, From Jakub Kicinski
 - Remove unnecessary checks for slow path flag in tc module
 - Fix unused function warning of mlx5i_flow_type_mask
 - Bridge, support replacing existing FDB entry
2) Sub Functions, Reduction in memory usage:
 - Reduce flow counters bulk query buffer size
 - Implement max_macs devlink parameter
 - Add devlink vendor params to control Event Queue sizes
 - Added SF life cycle trace points by Parav
3) From Aya, Firmware health buffer reporting improvements
 - Print health buffer by log level and more missing information
 - Periodic update of host time to firmware
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26tcp: don't free a FIN sk_buff in tcp_remove_empty_skb()Jon Maxwell1-1/+1
v1: Implement a more general statement as recommended by Eric Dumazet. The sequence number will be advanced, so this check will fix the FIN case and other cases.
A customer reported sockets stuck in the CLOSING state. A vmcore revealed that the write_queue was not empty as determined by tcp_write_queue_empty(), but the sk_buff containing the FIN flag had been freed and the socket was zombied in that state. Corresponding pcaps show no FIN from the Linux kernel on the wire.
Some instrumentation was added to the kernel and it was found that there is a timing window where tcp_sendmsg() can run after tcp_send_fin(). tcp_sendmsg() will hit an error, for example:
	if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
		goto do_error;
tcp_remove_empty_skb() will then free the FIN sk_buff as "skb->len == 0". The TCP socket is now wedged in the FIN-WAIT-1 state because the FIN is never sent. If the other side sends a FIN packet the socket will transition to CLOSING and remain that way until the system is rebooted.
Fix this by checking for the FIN flag in the sk_buff and not freeing it if that is the case. Testing confirmed that this fixed the issue.
Fixes: fdfc5c8594c2 ("tcp: remove empty skb from write queue in error cases")
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Reported-by: Monir Zouaoui <Monir.Zouaoui@mail.schwarz>
Reported-by: Simon Stier <simon.stier@mail.schwarz>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
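As an illustration of why a length test alone is not enough, here is a user-space sketch. `model_skb` and both helpers are hypothetical; the sequence-range comparison is one reading of the "more general statement" mentioned in the v1 note, not a quote of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the sk_buff fields the commit discusses. */
struct model_skb {
	uint32_t len;     /* payload bytes */
	uint32_t seq;     /* first sequence number covered */
	uint32_t end_seq; /* one past the last; a FIN consumes one sequence number */
};

/* Length-only test: a payload-less FIN looks "empty" and would be freed. */
static bool removable_by_len(const struct model_skb *skb)
{
	return skb->len == 0;
}

/* Sequence-range test: removable only if no sequence space is covered. */
static bool removable_by_seq(const struct model_skb *skb)
{
	return skb->seq == skb->end_seq;
}
```

A FIN with no payload has `len == 0` but `end_seq == seq + 1`, so only the second test keeps it on the write queue.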
2021-10-25Merge branch 'small-fixes-for-true-expression-checks'Jakub Kicinski2-3/+3
Jean Sacren says: ==================== Small fixes for true expression checks This series fixes checks of true !rc expression. ==================== Link: https://lore.kernel.org/r/cover.1634974124.git.sakiwit@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25net: qed_dev: fix check of true !rc expressionJean Sacren1-1/+1
Remove the check of !rc in (!rc && !resc_lock_params.b_granted) since it is always true. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25net: qed_ptp: fix check of true !rc expressionJean Sacren1-2/+2
Remove the check of !rc in (!rc && !params.b_granted) since it is always true. We should also use constant 0 for return. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
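The pattern behind both qed fixes can be shown with a simplified sketch. The surrounding control flow and the names `model_resc_lock()`, `model_lock_before()` and `model_lock_after()` are assumptions for illustration; only the redundant `!rc` test and the constant 0 return come from the commit messages.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock request: reports the grant, returns rc. */
static int model_resc_lock(int rc, bool grant, bool *granted)
{
	*granted = grant;
	return rc;
}

/* Before: '!rc' is re-tested although a non-zero rc has already returned. */
static int model_lock_before(int rc_in, bool grant)
{
	bool granted;
	int rc = model_resc_lock(rc_in, grant, &granted);

	if (rc)
		return rc;
	if (!rc && !granted)
		return -EBUSY;
	return rc;
}

/* After: the redundant test is dropped and success returns a constant 0. */
static int model_lock_after(int rc_in, bool grant)
{
	bool granted;
	int rc = model_resc_lock(rc_in, grant, &granted);

	if (rc)
		return rc;
	if (!granted)
		return -EBUSY;
	return 0;
}
```

Both versions return the same value for every input, which is why the cleanup carries no functional change.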
2021-10-25Merge branch 'tcp-receive-path-optimizations'Jakub Kicinski11-38/+79
Eric Dumazet says: ==================== tcp: receive path optimizations This series aims to reduce cache line misses in RX path. I am still working on better cache locality in tcp_sock but this will wait a few more weeks. ==================== Link: https://lore.kernel.org/r/20211025164825.259415-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25ipv6/tcp: small drop monitor changesEric Dumazet1-2/+2
Two kfree_skb() calls must be replaced by consume_skb() for skbs that are not technically dropped. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25ipv4: guard IP_MINTTL with a static keyEric Dumazet3-8/+20
RFC 5082 IP_MINTTL option is rarely used on hosts. Add a static key to remove useless code from the TCP fast path, along with a potential cache line miss to fetch inet_sk(sk)->min_ttl. Note that once the ip4_min_ttl static key has been enabled, it stays enabled until next boot. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
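A rough user-space model of the guard follows. A plain boolean stands in for the static key, which is a simplification: a real static key patches the branch in the instruction stream rather than testing a variable. All names (`model_min_ttl_key`, `model_set_min_ttl()`, `model_rx_accepts()`) are hypothetical, and enabling the key on the first non-zero value is an assumption about the setsockopt path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Plain flag standing in for the static key (assumption, see lead-in). */
static bool model_min_ttl_key;

struct model_sock {
	uint8_t min_ttl; /* 0 means "no minimum configured" */
};

/* Option-setting path: a non-zero value turns the key on; nothing turns it off. */
static void model_set_min_ttl(struct model_sock *sk, uint8_t val)
{
	if (val)
		model_min_ttl_key = true;
	sk->min_ttl = val;
}

/* Receive fast path: the per-socket field is read only once the key is on. */
static bool model_rx_accepts(const struct model_sock *sk, uint8_t pkt_ttl)
{
	if (model_min_ttl_key && pkt_ttl < sk->min_ttl)
		return false;
	return true;
}
```

Until some socket sets the option, the fast path never touches `min_ttl`. Afterwards every socket pays for the check, which matches the "stays enabled until next boot" note.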
2021-10-25ipv4: annotate data races arount inet->min_ttlEric Dumazet2-3/+9
No report yet from KCSAN, yet worth documenting the races. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25ipv6: guard IPV6_MINHOPCOUNT with a static keyEric Dumazet3-8/+20
RFC 5082 IPV6_MINHOPCOUNT is rarely used on hosts. Add a static key to remove useless code from the TCP fast path, along with a potential cache line miss to fetch tcp_inet6_sk(sk)->min_hopcount. Note that once the ip6_min_hopcount static key has been enabled, it stays enabled until next boot. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25ipv6: annotate data races around np->min_hopcountEric Dumazet2-3/+8
No report yet from KCSAN, yet worth documenting the races. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25net: annotate accesses to sk->sk_rx_queue_mappingEric Dumazet1-3/+7
sk->sk_rx_queue_mapping can be modified locklessly, add a couple of READ_ONCE()/WRITE_ONCE() to document this fact. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25net: avoid dirtying sk->sk_rx_queue_mappingEric Dumazet1-4/+2
sk_rx_queue_mapping is located in a cache line that should be kept read mostly. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
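The idea of keeping a cache line read-mostly can be sketched in user space as "compare before you store". Relaxed C11 atomics are used here as an approximate stand-in for READ_ONCE()/WRITE_ONCE(); the struct, the `stores` counter and `model_rx_queue_set()` are hypothetical and exist only for this example.

```c
#include <stdatomic.h>
#include <stdint.h>

struct model_sock_rxq {
	_Atomic uint16_t rx_queue_mapping;
	unsigned int stores; /* instrumentation for the example only */
};

/* Store only when the value changes, so an unchanged line is never dirtied. */
static void model_rx_queue_set(struct model_sock_rxq *sk, uint16_t rx_queue)
{
	if (atomic_load_explicit(&sk->rx_queue_mapping, memory_order_relaxed) != rx_queue) {
		atomic_store_explicit(&sk->rx_queue_mapping, rx_queue, memory_order_relaxed);
		sk->stores++;
	}
}
```

When consecutive packets of a flow arrive on the same RX queue, repeated calls become loads only, which lets the line stay shared between CPUs.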
2021-10-25net: avoid dirtying sk->sk_napi_idEric Dumazet1-1/+2
sk_napi_id is located in a cache line that can be kept read mostly. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25ipv6: move inet6_sk(sk)->rx_dst_cookie to sk->sk_rx_dst_cookieEric Dumazet4-6/+7
Increase cache locality by moving rx_dst_cookie next to sk->sk_rx_dst. This removes one or two cache line misses in IPv6 early demux (TCP/UDP). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25tcp: move inet->rx_dst_ifindex to sk->sk_rx_dst_ifindexEric Dumazet4-8/+10
Increase cache locality by moving rx_dst_ifindex next to sk->sk_rx_dst. This is part of an effort to reduce cache line misses in TCP fast path. This removes one cache line miss in early demux. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25ax88796c: fix fetching error stats from percpu containersAlexander Lobakin1-4/+4
rx_dropped, tx_dropped, rx_frame_errors and rx_crc_errors are being wrongly fetched from the target container rather than source percpu ones. No idea if that goes from the vendor driver or was brainoed during the refactoring, but fix it either way. Fixes: a97c69ba4f30e ("net: ax88796c: ASIX AX88796C SPI Ethernet Adapter Driver") Signed-off-by: Alexander Lobakin <alobakin@pm.me> Acked-by: Łukasz Stelmach <l.stelmach@samsung.com> Link: https://lore.kernel.org/r/20211023121148.113466-1-alobakin@pm.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
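A minimal sketch of the intended direction of the copy, assuming a simple array of per-CPU counters. `model_stats` and `model_fold_stats()` are hypothetical and cover only two of the four counters named above.

```c
#include <stddef.h>
#include <stdint.h>

struct model_stats {
	uint64_t rx_dropped;
	uint64_t tx_dropped;
};

/*
 * Fold per-CPU counters into 'total'. Each term is read from the source
 * (percpu[i]); reading it from 'total' instead is the bug class described.
 */
static void model_fold_stats(struct model_stats *total,
			     const struct model_stats *percpu, size_t ncpus)
{
	for (size_t i = 0; i < ncpus; i++) {
		total->rx_dropped += percpu[i].rx_dropped;
		total->tx_dropped += percpu[i].tx_dropped;
	}
}
```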
2021-10-25net/mlx5: SF_DEV Add SF device trace pointsParav Pandit6-6/+140
Add SF device add and delete specific trace points.
echo mlx5:mlx5_sf_dev_add >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_dev_del >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_vhca_event >> /sys/kernel/debug/tracing/set_event
Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: SF, Add SF trace pointsParav Pandit4-0/+222
Add support for trace events for SFs to improve debugging. This covers:
(a) port add and free trace points
(b) device level trace points
(c) SF hardware context add, free trace points
(d) SF function activate/deactivate and state trace points
SF events examples:
echo mlx5:mlx5_sf_add >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_alloc >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_deferred_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_update_state >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_activate >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_deactivate >> /sys/kernel/debug/tracing/set_event
Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Let user configure max_macs paramShay Drory4-1/+92
Currently, max_macs is taking 70Kbytes of memory per function. This size is not needed in all use cases, and is critical with large scale. Hence, allow user to configure the number of max_macs. For example, to reduce the number of max_macs to 1, execute::
    $ devlink dev param set pci/0000:00:0b.0 name max_macs value 1 \
              cmode driverinit
    $ devlink dev reload pci/0000:00:0b.0
Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Let user configure event_eq_size paramShay Drory5-3/+31
Event EQ is an EQ which receives the notifications of almost all the events generated by the NIC. Currently, each event EQ is taking 512KB of memory. This size is not needed in most use cases, and is critical with large scale. Hence, allow user to configure the size of the event EQ. For example, to reduce event EQ size to 64, execute::
    $ devlink resource set pci/0000:00:0b.0 path /event_eq_size/ size 64
    $ devlink dev reload pci/0000:00:0b.0
Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Let user configure io_eq_size paramShay Drory7-6/+85
Currently, each I/O EQ is taking 128KB of memory. This size is not needed in all use cases, and is critical with large scale. Hence, allow user to configure the size of I/O EQs. For example, to reduce I/O EQ size to 64, execute::
    $ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64
    $ devlink dev reload pci/0000:00:0b.0
Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Bridge, support replacing existing FDB entryVlad Buslov1-0/+4
The SWITCHDEV_FDB_ADD_TO_DEVICE is used for both adding new and replacing existing entry. Implement support for replacing existing FDB entries in mlx5 offload code. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Bridge, extract code to lookup and del/notify entryVlad Buslov1-26/+32
Following two patterns in bridge code are used in multiple places where similar code is duplicated: - Lookup FDB entry from hashtable by address+vid pair. - Notify software bridge and then delete existing FDB entry. In order to improve code quality and prepare for following patch series that also uses described patterns, extract the codes to dedicated helper functions. This commit doesn't change functionality. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Add periodic update of host time to firmwareAya Levin3-0/+44
Firmware logs its asserts also to non-volatile memory. In order to reduce drift between the NIC and the host, the driver sets the host epoch-time to the firmware every hour. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Print health buffer by log levelAya Levin3-19/+44
Add log macro which gets log level as a parameter. Use the severity read from the health buffer and the new log macro to log the health buffer with severity as log level. Prior to this patch, health buffer was printed in error log level regardless of its severity. Now the user may filter dmesg (--level) or change kernel log level to focus on different severity levels of firmware errors. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Extend health buffer dumpAya Levin3-15/+82
Enhance health buffer to include:
 - assert_var5: expose the sixth assert variable.
 - time: error's time-stamp in seconds (epoch time).
 - rfr: Recovery Flow Required. When set, indicates that the error cannot be recovered without a flow involving reset.
 - severity: error's severity value, ranging from emergency to debug.
Expose them in the health buffer dump (dmesg and devlink fw reporter).
Health buffer in dmesg:
mlx5_core 0000:08:00.0: print_health_info:425:(pid 912): Health issue observed, firmware internal error, severity(3) ERROR:
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[0] 0x08040700
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[1] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[2] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[3] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[4] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[5] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:432:(pid 912): assert_exit_ptr 0x00aaf800
mlx5_core 0000:08:00.0: print_health_info:434:(pid 912): assert_callra 0x00aaf70c
mlx5_core 0000:08:00.0: print_health_info:436:(pid 912): fw_ver 16.32.492
mlx5_core 0000:08:00.0: print_health_info:437:(pid 912): time 1634819758
mlx5_core 0000:08:00.0: print_health_info:438:(pid 912): hw_id 0x0000020d
mlx5_core 0000:08:00.0: print_health_info:439:(pid 912): rfr 0
mlx5_core 0000:08:00.0: print_health_info:440:(pid 912): severity 3 (ERROR)
mlx5_core 0000:08:00.0: print_health_info:441:(pid 912): irisc_index 9
mlx5_core 0000:08:00.0: print_health_info:442:(pid 912): synd 0x1: firmware internal error
mlx5_core 0000:08:00.0: print_health_info:444:(pid 912): ext_synd 0x802b
mlx5_core 0000:08:00.0: print_health_info:445:(pid 912): raw fw_ver 0x102001ec
Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
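The severity field can be decoded with a small lookup, sketched below. The commit only states that the range runs from emergency to debug and shows `severity 3 (ERROR)`; the intermediate names follow the usual syslog ordering and are an assumption, as is the helper name `model_severity_name()`.

```c
#include <stddef.h>

/* Map a health-buffer severity value to a label (names partly assumed). */
static const char *model_severity_name(unsigned int sev)
{
	static const char *const names[] = {
		"EMERGENCY", "ALERT", "CRITICAL", "ERROR",
		"WARNING", "NOTICE", "INFO", "DEBUG",
	};

	if (sev >= sizeof(names) / sizeof(names[0]))
		return "UNKNOWN";
	return names[sev];
}
```

With such a mapping, a driver can choose the log level per message instead of printing everything at error level, which is what lets `dmesg --level` filtering become useful.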
2021-10-25net/mlx5: Reduce flow counters bulk query buffer size for SFsAvihai Horon1-2/+7
Currently, the flow counters bulk query buffer takes a little more than 512KB of memory, which is aligned to the next power of 2, to 1MB. The buffer size determines the maximum number of flow counters that can be queried at a time. Thus, having a bigger buffer can improve performance for users that need to query many flow counters. SFs don't use many flow counters and don't need a big buffer. Since this size is critical with large scale, reduce the size of the bulk query buffer for SFs. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
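The rounding described above is easy to check with a worked example. The helper below is a hypothetical user-space equivalent of rounding up to the next power of two; it does not guard against overflow for very large inputs.

```c
#include <stddef.h>

/* Smallest power of two that is >= n (n == 0 yields 1). */
static size_t model_roundup_pow_of_two(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}
```

A request of exactly 512KB stays at 512KB, but anything slightly larger, even by one byte, lands on 1MB, so nearly half of the allocation is wasted.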
2021-10-25net/mlx5: Fix unused function warning of mlx5i_flow_type_maskShay Drory1-5/+5
The cited commit is causing unused-function warning[1] when CONFIG_MLX5_EN_RXNFC is not set. Fix this by moving the function into the ifdef, where it's only used [1] warning: ‘mlx5i_flow_type_mask’ defined but not used [-Wunused-function] Fixes: 9fbe1c25ecca ("net/mlx5i: Enable Rx steering for IPoIB via ethtool") Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Remove unnecessary checks for slow path flagPaul Blakey1-16/+1
After previous changes, caller (mlx5e_tc_offload_fdb_rules()) already checks for the slow path flag, and if set won't call offload/unoffload sample. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5e: don't write directly to netdev->dev_addrJakub Kicinski1-2/+6
Use a local buffer and eth_hw_addr_set() Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
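The "local buffer plus helper" pattern can be modelled in user space as follows. Everything here is hypothetical: `model_netdev`, `model_query_mac()` and `model_hw_addr_set()` only imitate the roles of the device structure, a hardware query and eth_hw_addr_set(), and the `addr_updates` counter stands in for whatever bookkeeping the real helper performs.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_ETH_ALEN 6

struct model_netdev {
	uint8_t dev_addr[MODEL_ETH_ALEN];
	unsigned int addr_updates; /* stands in for the helper's bookkeeping */
};

/* Single write path for the address, modelled on the helper's role. */
static void model_hw_addr_set(struct model_netdev *dev, const uint8_t *addr)
{
	memcpy(dev->dev_addr, addr, MODEL_ETH_ALEN);
	dev->addr_updates++;
}

/* Hypothetical hardware query that needs a writable output buffer. */
static void model_query_mac(uint8_t *out)
{
	static const uint8_t mac[MODEL_ETH_ALEN] = { 0x02, 0, 0, 0, 0, 0x01 };

	memcpy(out, mac, MODEL_ETH_ALEN);
}

static void model_set_dev_addr(struct model_netdev *dev)
{
	uint8_t addr[MODEL_ETH_ALEN];

	model_query_mac(addr);        /* fill the local buffer, not dev->dev_addr */
	model_hw_addr_set(dev, addr); /* publish through the helper */
}
```

Routing every write through one function is what allows an auxiliary lookup structure to stay in sync with the address.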
2021-10-25Merge branch 'bluetooth-don-t-write-directly-to-netdev-dev_addr'Jakub Kicinski2-2/+4
Jakub Kicinski says: ==================== bluetooth: don't write directly to netdev->dev_addr The usual conversions. ==================== Link: https://lore.kernel.org/r/20211022231834.2710245-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25bluetooth: use dev_addr_set()Jakub Kicinski1-1/+3
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Reviewed-by: Marcel Holtmann <marcel@holtmann.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25bluetooth: use eth_hw_addr_set()Jakub Kicinski1-1/+1
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Convert bluetooth from memcpy(... ETH_ADDR) to eth_hw_addr_set():
  @@
  expression dev, np;
  @@
  - memcpy(dev->dev_addr, np, ETH_ALEN)
  + eth_hw_addr_set(dev, np)
Reviewed-by: Marcel Holtmann <marcel@holtmann.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25fddi: defza: add missing pointer type castJakub Kicinski1-1/+1
hw_addr is a uint AKA unsigned int. dev_addr_set() takes a u8 *. drivers/net/fddi/defza.c:1383:27: error: passing argument 2 of 'dev_addr_set' from incompatible pointer type [-Werror=incompatible-pointer-types] Reported-by: kernel test robot <lkp@intel.com> Fixes: 1e9258c389ee ("fddi: defxx,defza: use dev_addr_set()") Acked-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://lore.kernel.org/r/20211025160000.2803818-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25net/tls: getsockopt supports complete algorithm listTianjia Zhang1-0/+42
AES_CCM_128 and CHACHA20_POLY1305 are already supported by tls, similar to setsockopt, getsockopt also needs to support these two algorithms. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net/tls: tls_crypto_context add supported algorithms contextTianjia Zhang1-0/+2
tls already supports the SM4 GCM/CCM algorithms. It is also necessary to add support for these two algorithms in tls_crypto_context to avoid potential issues caused by forced type conversion. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25mlxsw: spectrum: Use 'bitmap_zalloc()' when applicableChristophe JAILLET4-27/+16
Use 'bitmap_zalloc()' to simplify code, improve the semantics and avoid some open-coded arithmetic in allocator arguments. Also change the corresponding 'kfree()' into 'bitmap_free()' to keep consistency. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
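To show what the helper hides, here is a user-space sketch of the same idea built on calloc(). The `model_` functions are hypothetical equivalents, not the kernel implementations; they only illustrate the bits-to-longs arithmetic that callers no longer have to open-code.

```c
#include <limits.h>
#include <stdlib.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold nbits bits. */
static size_t model_bits_to_longs(size_t nbits)
{
	return (nbits + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG;
}

/* Zeroed bitmap of nbits bits; the caller passes bits, not bytes. */
static unsigned long *model_bitmap_zalloc(size_t nbits)
{
	return calloc(model_bits_to_longs(nbits), sizeof(unsigned long));
}

/* Paired release, mirroring the kfree() to bitmap_free() change. */
static void model_bitmap_free(unsigned long *bitmap)
{
	free(bitmap);
}
```

Keeping allocation and release as a matched pair means the storage strategy can change later without touching callers.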
2021-10-25usbb: catc: use correct API for MAC addressesOliver Neukum1-5/+17
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. In the case of catc we need a new temporary buffer to conform to the rules for DMA coherency. That in turn necessitates a reworking of error handling in probe(). Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25Merge tag 'wireless-drivers-next-2021-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-nextDavid S. Miller3-3/+4
Kalle Valo says:
====================
wireless-drivers-next patches for v5.16
Third set of patches for v5.16. This time we have a small one to quickly fix two mt76 build failures I had missed in my previous pull request.
Major changes:
mt76
 * fix linking when CONFIG_MMC is disabled
 * fix dev_err() format warning
 * mt7615: mt7622: fix ibss and meshpoint
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25Merge branch 'gve-jumbo-frame'David S. Miller10-166/+403
Jeroen de Borst says: ==================== gve: Add jumbo-frame support for GQ This patchset introduces jumbo-frame support for the GQ queue format. The device already supports jumbo-frames on TX. This introduces multi-descriptor RX packets using a packet continuation bit. A widely deployed driver has a bug which causes it to fail to load when an MTU greater than 2048 bytes is configured. A jumbo-frame device option is introduced to pass a jumbo-frame MTU only to drivers that support it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25gve: Add a jumbo-frame device option.Shailend Chand2-4/+68
A widely deployed driver has a bug that will cause the driver not to load when a max_mtu > 2048 is present in the device descriptor. To avoid this bug while still enabling jumbo frames, we present a lower max_mtu in the device descriptor and pass the actual max_mtu in a separate device option. The driver supports 2 different queue formats. To enable features on one queue format, but not the other, a supported_features mask was added to the device options in the device descriptor. Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25gve: Implement packet continuation for RX.David Awogbemila9-126/+292
This enables the driver to receive RX packets spread across multiple buffers: For a given multi-fragment packet the "packet continuation" bit is set on all descriptors except the last one. These descriptors' payloads are combined into a single SKB before the SKB is handed to the networking stack. This change adds a "packet buffer size" notion for RX queues. The CreateRxQueue AdminQueue command sent to the device now includes the packet_buffer_size. We opt for a packet_buffer_size of PAGE_SIZE / 2 to give the driver the opportunity to flip pages where we can instead of copying. Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
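The continuation-bit scheme can be sketched as a walk over descriptors that stops at the first one without the bit. `model_rx_desc` and `model_rx_packet()` are hypothetical; a real driver would also attach each buffer to the SKB rather than only summing lengths.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_rx_desc {
	uint16_t len; /* bytes in this descriptor's buffer */
	bool cont;    /* "packet continuation": more descriptors follow */
};

/*
 * Walk descs[0..n) and assemble the first packet. Returns the number of
 * descriptors consumed and stores the packet length in *total_len, or
 * returns 0 if no descriptor without the continuation bit is present yet.
 */
static size_t model_rx_packet(const struct model_rx_desc *descs, size_t n,
			      size_t *total_len)
{
	size_t len = 0;

	for (size_t i = 0; i < n; i++) {
		len += descs[i].len;
		if (!descs[i].cont) {
			*total_len = len;
			return i + 1;
		}
	}
	return 0;
}
```

Assuming 4KB pages, a packet_buffer_size of PAGE_SIZE / 2 is 2048 bytes, so a 5000-byte frame would span three descriptors (2048 + 2048 + 904), with the bit set on the first two.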
2021-10-25gve: Add RX context.David Awogbemila2-37/+44
This refactor moves the skb_head and skb_tail fields into a new gve_rx_ctx struct. This new struct will contain information about the current packet being processed. This is in preparation for multi-descriptor RX packets. Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25Merge branch 'mlxsw-selftests-updates'David S. Miller11-53/+112
Ido Schimmel says: ==================== selftests: mlxsw: Various updates This patchset contains various updates to mlxsw selftests. Patch #1 replaces open-coded compatibility checks with dedicated helpers. These helpers are used to skip tests when run on incompatible machines. Patch #2 avoids spurious failures in some tests by using permanent neighbours instead of reachable ones. Patch #3 reduces the run time of a test by not iterating over all the available trap policers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25selftests: mlxsw: Reduce test run timeIdo Schimmel2-18/+20
Instead of iterating over all the available trap policers, only perform the tests with three policers: The first, the last and the one in the middle of the range. On a Spectrum-3 system, this reduces the run time from almost an hour to a few minutes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25selftests: mlxsw: Use permanent neighbours instead of reachable onesIdo Schimmel1-11/+11
The nexthop objects tests configure dummy reachable neighbours so that the nexthops will have a MAC address and be programmed to the device. Since these are dummy reachable neighbours, they can be transitioned by the kernel to a failed state if they are around for too long. This can happen, for example, if the "TIMEOUT" variable is configured with a too high value. Make the tests more robust by configuring the neighbours as permanent, so that the tests do not depend on the configured timeout value. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25selftests: mlxsw: Add helpers for skipping selftestsPetr Machata8-24/+81
A number of mlxsw-specific selftests currently detect whether they are run on a compatible machine, and bail out silently when not. These checks are however done in a somewhat impenetrable manner by directly comparing PCI IDs against a blacklist or a whitelist. Instead, add a helper, mlxsw_only_on_spectrum(), which allows specifying the supported machines in a human-readable manner. If the current machine is incompatible, the helper emits a SKIP message and returns an error code, based on which the caller can gracefully bail out in a suitable way. This allows more readable conditions such as:
    mlxsw_only_on_spectrum 2+ || return
Convert all existing open-coded guards to the new helper. Also add two new guards to do_mark_test() and do_drop_test(), which are supported only on Spectrum-2+, but the corresponding check was not there. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25Merge branch 'qca8081-phy-driver'David S. Miller4-47/+577
Luo Jie says:
====================
net: phy: Add qca8081 ethernet phy driver
This patch series adds qca8081 ethernet phy driver support. It improves the wol feature, leverages the at803x phy driver, and adds the fast retrain, master/slave seed and CDT features.
Changes in v7:
 * update Reviewed-by tags.
Changes in v6:
 * add Reviewed-by tags on the applicable patches.
Changes in v5:
 * rebase the patches on net-next/master.
Changes in v4:
 * handle other interrupts in set_wol.
 * add genphy_c45_fast_retrain.
Changes in v3:
 * correct a typo "excpet".
 * remove the suffix "PHY" from phy name.
Changes in v2:
 * add definitions of fast retrain related registers in mdio.h.
 * break up the patch into small patches.
 * improve the at803x legacy code.
Changes in v1:
 * merge qca8081 phy driver into at803x.
 * add cdt feature.
 * leverage at803x phy driver helpers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>