diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-13 15:47:48 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-13 15:47:48 -0800 |
commit | 7e68dd7d07a28faa2e6574dd6b9dbd90cdeaae91 (patch) | |
tree | ae0427c5a3b905f24b3a44b510a9bcf35d9b67a3 /drivers/net/ethernet/microsoft/mana/mana_en.c | |
parent | 1ca06f1c1acecbe02124f14a37cce347b8c1a90c (diff) | |
parent | 7c4a6309e27f411743817fe74a832ec2d2798a4b (diff) | |
download | linux-7e68dd7d07a28faa2e6574dd6b9dbd90cdeaae91.tar.bz2 |
Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Allow live renaming when an interface is up
- Add retpoline wrappers for tc, improving considerably the
performances of complex queue discipline configurations
- Add inet drop monitor support
- A few GRO performance improvements
- Add infrastructure for atomic dev stats, addressing long standing
data races
- De-duplicate common code between OVS and conntrack offloading
infrastructure
- A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements
- Netfilter: introduce packet parser for tunneled packets
- Replace IPVS timer-based estimators with kthreads to scale up the
workload with the number of available CPUs
- Add the helper support for connection-tracking OVS offload
BPF:
- Support for user defined BPF objects: the use case is to allocate
own objects, build own object hierarchies and use the building
blocks to build own data structures flexibly, for example, linked
lists in BPF
- Make cgroup local storage available to non-cgroup attached BPF
programs
- Avoid unnecessary deadlock detection and failures wrt BPF task
storage helpers
- A relevant bunch of BPF verifier fixes and improvements
- Veristat tool improvements to support custom filtering, sorting,
and replay of results
- Add LLVM disassembler as default library for dumping JITed code
- Lots of new BPF documentation for various BPF maps
- Add bpf_rcu_read_{,un}lock() support for sleepable programs
- Add RCU grace period chaining to BPF to wait for the completion of
access from both sleepable and non-sleepable BPF programs
- Add support storing struct task_struct objects as kptrs in maps
- Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
values
- Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions
Protocols:
- TCP: implement Protective Load Balancing across switch links
- TCP: allow dynamically disabling TCP-MD5 static key, reverting back
to fast[er]-path
- UDP: Introduce optional per-netns hash lookup table
- IPv6: simplify and cleanup sockets disposal
- Netlink: support different type policies for each generic netlink
operation
- MPTCP: add MSG_FASTOPEN and FastOpen listener side support
- MPTCP: add netlink notification support for listener sockets events
- SCTP: add VRF support, allowing sctp sockets binding to VRF devices
- Add bridging MAC Authentication Bypass (MAB) support
- Extensions for Ethernet VPN bridging implementation to better
support multicast scenarios
- More work for Wi-Fi 7 support, comprising conversion of all the
existing drivers to internal TX queue usage
- IPSec: introduce a new offload type (packet offload) allowing
complete header processing and crypto offloading
- IPSec: extended ack support for more descriptive XFRM error
reporting
- RXRPC: increase SACK table size and move processing into a
per-local endpoint kernel thread, reducing considerably the
required locking
- IEEE 802154: synchronous send frame and extended filtering support,
initial support for scanning available 15.4 networks
- Tun: bump the link speed from 10Mbps to 10Gbps
- Tun/VirtioNet: implement UDP segmentation offload support
Driver API:
- PHY/SFP: improve power level switching between standard level 1 and
the higher power levels
- New API for netdev <-> devlink_port linkage
- PTP: convert existing drivers to new frequency adjustment
implementation
- DSA: add support for rx offloading
- Autoload DSA tagging driver when dynamically changing protocol
- Add new PCP and APPTRUST attributes to Data Center Bridging
- Add configuration support for 800Gbps link speed
- Add devlink port function attribute to enable/disable RoCE and
migratable
- Extend devlink-rate to support strict prioriry and weighted fair
queuing
- Add devlink support to directly reading from region memory
- New device tree helper to fetch MAC address from nvmem
- New big TCP helper to simplify temporary header stripping
New hardware / drivers:
- Ethernet:
- Marvel Octeon CNF95N and CN10KB Ethernet Switches
- Marvel Prestera AC5X Ethernet Switch
- WangXun 10 Gigabit NIC
- Motorcomm yt8521 Gigabit Ethernet
- Microchip ksz9563 Gigabit Ethernet Switch
- Microsoft Azure Network Adapter
- Linux Automation 10Base-T1L adapter
- PHY:
- Aquantia AQR112 and AQR412
- Motorcomm YT8531S
- PTP:
- Orolia ART-CARD
- WiFi:
- MediaTek Wi-Fi 7 (802.11be) devices
- RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB
devices
- Bluetooth:
- Broadcom BCM4377/4378/4387 Bluetooth chipsets
- Realtek RTL8852BE and RTL8723DS
- Cypress.CYW4373A0 WiFi + Bluetooth combo device
Drivers:
- CAN:
- gs_usb: bus error reporting support
- kvaser_usb: listen only and bus error reporting support
- Ethernet NICs:
- Intel (100G):
- extend action skbedit to RX queue mapping
- implement devlink-rate support
- support direct read from memory
- nVidia/Mellanox (mlx5):
- SW steering improvements, increasing rules update rate
- Support for enhanced events compression
- extend H/W offload packet manipulation capabilities
- implement IPSec packet offload mode
- nVidia/Mellanox (mlx4):
- better big TCP support
- Netronome Ethernet NICs (nfp):
- IPsec offload support
- add support for multicast filter
- Broadcom:
- RSS and PTP support improvements
- AMD/SolarFlare:
- netlink extened ack improvements
- add basic flower matches to offload, and related stats
- Virtual NICs:
- ibmvnic: introduce affinity hint support
- small / embedded:
- FreeScale fec: add initial XDP support
- Marvel mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood
- TI am65-cpsw: add suspend/resume support
- Mediatek MT7986: add RX wireless wthernet dispatch support
- Realtek 8169: enable GRO software interrupt coalescing per
default
- Ethernet high-speed switches:
- Microchip (sparx5):
- add support for Sparx5 TC/flower H/W offload via VCAP
- Mellanox mlxsw:
- add 802.1X and MAC Authentication Bypass offload support
- add ip6gre support
- Embedded Ethernet switches:
- Mediatek (mtk_eth_soc):
- improve PCS implementation, add DSA untag support
- enable flow offload support
- Renesas:
- add rswitch R-Car Gen4 gPTP support
- Microchip (lan966x):
- add full XDP support
- add TC H/W offload via VCAP
- enable PTP on bridge interfaces
- Microchip (ksz8):
- add MTU support for KSZ8 series
- Qualcomm 802.11ax WiFi (ath11k):
- support configuring channel dwell time during scan
- MediaTek WiFi (mt76):
- enable Wireless Ethernet Dispatch (WED) offload support
- add ack signal support
- enable coredump support
- remain_on_channel support
- Intel WiFi (iwlwifi):
- enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities
- 320 MHz channels support
- RealTek WiFi (rtw89):
- new dynamic header firmware format support
- wake-over-WLAN support"
* tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits)
ipvs: fix type warning in do_div() on 32 bit
net: lan966x: Remove a useless test in lan966x_ptp_add_trap()
net: ipa: add IPA v4.7 support
dt-bindings: net: qcom,ipa: Add SM6350 compatible
bnxt: Use generic HBH removal helper in tx path
IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
selftests: forwarding: Add bridge MDB test
selftests: forwarding: Rename bridge_mdb test
bridge: mcast: Support replacement of MDB port group entries
bridge: mcast: Allow user space to specify MDB entry routing protocol
bridge: mcast: Allow user space to add (*, G) with a source list and filter mode
bridge: mcast: Add support for (*, G) with a source list and filter mode
bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source
bridge: mcast: Add a flag for user installed source entries
bridge: mcast: Expose __br_multicast_del_group_src()
bridge: mcast: Expose br_multicast_new_group_src()
bridge: mcast: Add a centralized error path
bridge: mcast: Place netlink policy before validation functions
bridge: mcast: Split (*, G) and (S, G) addition into different functions
bridge: mcast: Do not derive entry type from its filter mode
...
Diffstat (limited to 'drivers/net/ethernet/microsoft/mana/mana_en.c')
-rw-r--r-- | drivers/net/ethernet/microsoft/mana/mana_en.c | 185 |
1 files changed, 160 insertions, 25 deletions
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index 27a0f3af8aab..2f6a048dee90 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -12,7 +12,20 @@ #include <net/checksum.h> #include <net/ip6_checksum.h> -#include "mana.h" +#include <net/mana/mana.h> +#include <net/mana/mana_auxiliary.h> + +static DEFINE_IDA(mana_adev_ida); + +static int mana_adev_idx_alloc(void) +{ + return ida_alloc(&mana_adev_ida, GFP_KERNEL); +} + +static void mana_adev_idx_free(int idx) +{ + ida_free(&mana_adev_ida, idx); +} /* Microsoft Azure Network Adapter (MANA) functions */ @@ -128,7 +141,7 @@ frag_err: return -ENOMEM; } -int mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) +netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) { enum mana_tx_pkt_format pkt_fmt = MANA_SHORT_PKT_FMT; struct mana_port_context *apc = netdev_priv(ndev); @@ -176,7 +189,7 @@ int mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) pkg.wqe_req.client_data_unit = 0; pkg.wqe_req.num_sge = 1 + skb_shinfo(skb)->nr_frags; - WARN_ON_ONCE(pkg.wqe_req.num_sge > 30); + WARN_ON_ONCE(pkg.wqe_req.num_sge > MAX_TX_WQE_SGL_ENTRIES); if (pkg.wqe_req.num_sge <= ARRAY_SIZE(pkg.sgl_array)) { pkg.wqe_req.sgl = pkg.sgl_array; @@ -315,10 +328,10 @@ static void mana_get_stats64(struct net_device *ndev, rx_stats = &apc->rxqs[q]->stats; do { - start = u64_stats_fetch_begin_irq(&rx_stats->syncp); + start = u64_stats_fetch_begin(&rx_stats->syncp); packets = rx_stats->packets; bytes = rx_stats->bytes; - } while (u64_stats_fetch_retry_irq(&rx_stats->syncp, start)); + } while (u64_stats_fetch_retry(&rx_stats->syncp, start)); st->rx_packets += packets; st->rx_bytes += bytes; @@ -328,10 +341,10 @@ static void mana_get_stats64(struct net_device *ndev, tx_stats = &apc->tx_qp[q].txq.stats; do { - start = u64_stats_fetch_begin_irq(&tx_stats->syncp); + start = u64_stats_fetch_begin(&tx_stats->syncp); packets = tx_stats->packets; bytes = tx_stats->bytes; - } while (u64_stats_fetch_retry_irq(&tx_stats->syncp, start)); + } while (u64_stats_fetch_retry(&tx_stats->syncp, start)); st->tx_packets += packets; st->tx_bytes += bytes; @@ -633,13 +646,48 @@ static int mana_query_vport_cfg(struct mana_port_context *apc, u32 vport_index, return 0; } -static int mana_cfg_vport(struct mana_port_context *apc, u32 protection_dom_id, - u32 doorbell_pg_id) +void mana_uncfg_vport(struct mana_port_context *apc) +{ + mutex_lock(&apc->vport_mutex); + apc->vport_use_count--; + WARN_ON(apc->vport_use_count < 0); + mutex_unlock(&apc->vport_mutex); +} +EXPORT_SYMBOL_NS(mana_uncfg_vport, NET_MANA); + +int mana_cfg_vport(struct mana_port_context *apc, u32 protection_dom_id, + u32 doorbell_pg_id) { struct mana_config_vport_resp resp = {}; struct mana_config_vport_req req = {}; int err; + /* This function is used to program the Ethernet port in the hardware + * table. It can be called from the Ethernet driver or the RDMA driver. + * + * For Ethernet usage, the hardware supports only one active user on a + * physical port. The driver checks on the port usage before programming + * the hardware when creating the RAW QP (RDMA driver) or exposing the + * device to kernel NET layer (Ethernet driver). + * + * Because the RDMA driver doesn't know in advance which QP type the + * user will create, it exposes the device with all its ports. The user + * may not be able to create RAW QP on a port if this port is already + * in used by the Ethernet driver from the kernel. + * + * This physical port limitation only applies to the RAW QP. For RC QP, + * the hardware doesn't have this limitation. The user can create RC + * QPs on a physical port up to the hardware limits independent of the + * Ethernet usage on the same port. + */ + mutex_lock(&apc->vport_mutex); + if (apc->vport_use_count > 0) { + mutex_unlock(&apc->vport_mutex); + return -EBUSY; + } + apc->vport_use_count++; + mutex_unlock(&apc->vport_mutex); + mana_gd_init_req_hdr(&req.hdr, MANA_CONFIG_VPORT_TX, sizeof(req), sizeof(resp)); req.vport = apc->port_handle; @@ -666,9 +714,16 @@ static int mana_cfg_vport(struct mana_port_context *apc, u32 protection_dom_id, apc->tx_shortform_allowed = resp.short_form_allowed; apc->tx_vp_offset = resp.tx_vport_offset; + + netdev_info(apc->ndev, "Configured vPort %llu PD %u DB %u\n", + apc->port_handle, protection_dom_id, doorbell_pg_id); out: + if (err) + mana_uncfg_vport(apc); + return err; } +EXPORT_SYMBOL_NS(mana_cfg_vport, NET_MANA); static int mana_cfg_vport_steering(struct mana_port_context *apc, enum TRI_STATE rx, @@ -729,16 +784,19 @@ static int mana_cfg_vport_steering(struct mana_port_context *apc, resp.hdr.status); err = -EPROTO; } + + netdev_info(ndev, "Configured steering vPort %llu entries %u\n", + apc->port_handle, num_entries); out: kfree(req); return err; } -static int mana_create_wq_obj(struct mana_port_context *apc, - mana_handle_t vport, - u32 wq_type, struct mana_obj_spec *wq_spec, - struct mana_obj_spec *cq_spec, - mana_handle_t *wq_obj) +int mana_create_wq_obj(struct mana_port_context *apc, + mana_handle_t vport, + u32 wq_type, struct mana_obj_spec *wq_spec, + struct mana_obj_spec *cq_spec, + mana_handle_t *wq_obj) { struct mana_create_wqobj_resp resp = {}; struct mana_create_wqobj_req req = {}; @@ -787,9 +845,10 @@ static int mana_create_wq_obj(struct mana_port_context *apc, out: return err; } +EXPORT_SYMBOL_NS(mana_create_wq_obj, NET_MANA); -static void mana_destroy_wq_obj(struct mana_port_context *apc, u32 wq_type, - mana_handle_t wq_obj) +void mana_destroy_wq_obj(struct mana_port_context *apc, u32 wq_type, + mana_handle_t wq_obj) { struct mana_destroy_wqobj_resp resp = {}; struct mana_destroy_wqobj_req req = {}; @@ -814,6 +873,7 @@ static void mana_destroy_wq_obj(struct mana_port_context *apc, u32 wq_type, netdev_err(ndev, "Failed to destroy WQ object: %d, 0x%x\n", err, resp.hdr.status); } +EXPORT_SYMBOL_NS(mana_destroy_wq_obj, NET_MANA); static void mana_destroy_eq(struct mana_context *ac) { @@ -1469,10 +1529,10 @@ static int mana_create_txq(struct mana_port_context *apc, memset(&wq_spec, 0, sizeof(wq_spec)); memset(&cq_spec, 0, sizeof(cq_spec)); - wq_spec.gdma_region = txq->gdma_sq->mem_info.gdma_region; + wq_spec.gdma_region = txq->gdma_sq->mem_info.dma_region_handle; wq_spec.queue_size = txq->gdma_sq->queue_size; - cq_spec.gdma_region = cq->gdma_cq->mem_info.gdma_region; + cq_spec.gdma_region = cq->gdma_cq->mem_info.dma_region_handle; cq_spec.queue_size = cq->gdma_cq->queue_size; cq_spec.modr_ctx_id = 0; cq_spec.attached_eq = cq->gdma_cq->cq.parent->id; @@ -1487,8 +1547,10 @@ static int mana_create_txq(struct mana_port_context *apc, txq->gdma_sq->id = wq_spec.queue_index; cq->gdma_cq->id = cq_spec.queue_index; - txq->gdma_sq->mem_info.gdma_region = GDMA_INVALID_DMA_REGION; - cq->gdma_cq->mem_info.gdma_region = GDMA_INVALID_DMA_REGION; + txq->gdma_sq->mem_info.dma_region_handle = + GDMA_INVALID_DMA_REGION; + cq->gdma_cq->mem_info.dma_region_handle = + GDMA_INVALID_DMA_REGION; txq->gdma_txq_id = txq->gdma_sq->id; @@ -1699,10 +1761,10 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc, memset(&wq_spec, 0, sizeof(wq_spec)); memset(&cq_spec, 0, sizeof(cq_spec)); - wq_spec.gdma_region = rxq->gdma_rq->mem_info.gdma_region; + wq_spec.gdma_region = rxq->gdma_rq->mem_info.dma_region_handle; wq_spec.queue_size = rxq->gdma_rq->queue_size; - cq_spec.gdma_region = cq->gdma_cq->mem_info.gdma_region; + cq_spec.gdma_region = cq->gdma_cq->mem_info.dma_region_handle; cq_spec.queue_size = cq->gdma_cq->queue_size; cq_spec.modr_ctx_id = 0; cq_spec.attached_eq = cq->gdma_cq->cq.parent->id; @@ -1715,8 +1777,8 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc, rxq->gdma_rq->id = wq_spec.queue_index; cq->gdma_cq->id = cq_spec.queue_index; - rxq->gdma_rq->mem_info.gdma_region = GDMA_INVALID_DMA_REGION; - cq->gdma_cq->mem_info.gdma_region = GDMA_INVALID_DMA_REGION; + rxq->gdma_rq->mem_info.dma_region_handle = GDMA_INVALID_DMA_REGION; + cq->gdma_cq->mem_info.dma_region_handle = GDMA_INVALID_DMA_REGION; rxq->gdma_id = rxq->gdma_rq->id; cq->gdma_id = cq->gdma_cq->id; @@ -1797,6 +1859,7 @@ static void mana_destroy_vport(struct mana_port_context *apc) } mana_destroy_txq(apc); + mana_uncfg_vport(apc); if (gd->gdma_context->is_pf) mana_pf_deregister_hw_vport(apc); @@ -2069,12 +2132,16 @@ static int mana_probe_port(struct mana_context *ac, int port_idx, apc->pf_filter_handle = INVALID_MANA_HANDLE; apc->port_idx = port_idx; + mutex_init(&apc->vport_mutex); + apc->vport_use_count = 0; + ndev->netdev_ops = &mana_devops; ndev->ethtool_ops = &mana_ethtool_ops; ndev->mtu = ETH_DATA_LEN; ndev->max_mtu = ndev->mtu; ndev->min_mtu = ndev->mtu; ndev->needed_headroom = MANA_HEADROOM; + ndev->dev_port = port_idx; SET_NETDEV_DEV(ndev, gc->dev); netif_carrier_off(ndev); @@ -2112,6 +2179,69 @@ free_net: return err; } +static void adev_release(struct device *dev) +{ + struct mana_adev *madev = container_of(dev, struct mana_adev, adev.dev); + + kfree(madev); +} + +static void remove_adev(struct gdma_dev *gd) +{ + struct auxiliary_device *adev = gd->adev; + int id = adev->id; + + auxiliary_device_delete(adev); + auxiliary_device_uninit(adev); + + mana_adev_idx_free(id); + gd->adev = NULL; +} + +static int add_adev(struct gdma_dev *gd) +{ + struct auxiliary_device *adev; + struct mana_adev *madev; + int ret; + + madev = kzalloc(sizeof(*madev), GFP_KERNEL); + if (!madev) + return -ENOMEM; + + adev = &madev->adev; + ret = mana_adev_idx_alloc(); + if (ret < 0) + goto idx_fail; + adev->id = ret; + + adev->name = "rdma"; + adev->dev.parent = gd->gdma_context->dev; + adev->dev.release = adev_release; + madev->mdev = gd; + + ret = auxiliary_device_init(adev); + if (ret) + goto init_fail; + + ret = auxiliary_device_add(adev); + if (ret) + goto add_fail; + + gd->adev = adev; + return 0; + +add_fail: + auxiliary_device_uninit(adev); + +init_fail: + mana_adev_idx_free(adev->id); + +idx_fail: + kfree(madev); + + return ret; +} + int mana_probe(struct gdma_dev *gd, bool resuming) { struct gdma_context *gc = gd->gdma_context; @@ -2179,6 +2309,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming) break; } } + + err = add_adev(gd); out: if (err) mana_remove(gd, false); @@ -2195,6 +2327,10 @@ void mana_remove(struct gdma_dev *gd, bool suspending) int err; int i; + /* adev currently doesn't support suspending, always remove it */ + if (gd->adev) + remove_adev(gd); + for (i = 0; i < ac->num_ports; i++) { ndev = ac->ports[i]; if (!ndev) { @@ -2227,7 +2363,6 @@ void mana_remove(struct gdma_dev *gd, bool suspending) } mana_destroy_eq(ac); - out: mana_gd_deregister_device(gd); |