summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/marvell/prestera
AgeCommit message (Collapse)AuthorFilesLines
2022-01-12net: marvell: prestera: Fix deinit sequence for routerYevhen Orlov4-2/+10
* Add missed call prestera_router_fini in prestera_switch_fini * Add prestera_router_hw_fini, which verify lists are empty Fixes: 69204174cc5c ("net: marvell: prestera: Add prestera router infra") Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Link: https://lore.kernel.org/r/20220111011129.5457-1-yevhen.orlov@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-12net: marvell: prestera: Refactor router functionsYevhen Orlov3-12/+14
* Reverse xmas tree variables order * User friendly messages on error paths * Refactor __prestera_inetaddr_event to use early return Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Link: https://lore.kernel.org/r/20220111011051.4941-1-yevhen.orlov@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-12net: marvell: prestera: Refactor get/put VR functionsYevhen Orlov2-17/+17
* Use refcount, instead of uint * Increment/decrement recount inside get/put * Fix error path in __prestera_vr_create. Remove unnecessary kfree. * Make __prestera_vr_destroy symmetric to "create" Fixes: bca5859bc6c6 ("net: marvell: prestera: add hardware router objects accounting") Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Link: https://lore.kernel.org/r/20220111011014.4418-1-yevhen.orlov@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-12net: marvell: prestera: Cleanup router structYevhen Orlov1-1/+0
Field "aborted" was added in 69204174cc5c ("net: marvell: prestera: Add prestera router infra"). It will not be used. So remove. Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Link: https://lore.kernel.org/r/20220111010826.3779-1-yevhen.orlov@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-30net: marvell: prestera: Implement initial inetaddr notifiersYevhen Orlov1-0/+40
Add inetaddr notifiers to support add/del IPv4 address on switchdev port. We create TRAP on first address, added on port and delete TRAP, when last address removed. Currently, driver supports only regular port to became routed. Other port type support will be added later Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30net: marvell: prestera: Register inetaddr stub notifiersYevhen Orlov3-1/+110
Initial implementation of notification handlers. For now this is just stub. So that we can move forward and add prestera_router_hw's objects manipulations. We support several addresses on interface. We just have nothing to do for second address, because rif is already enabled on this interface, after first one. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30net: marvell: prestera: add hardware router objects accountingYevhen Orlov4-1/+255
Add prestera_router_hw.c. This file contains functions, which track HW objects relations and links. This include implicity creation of objects, that needed by requested one and implicity removing of objects, which reference counter is became zero. We need this layer, because kernel callbacks not always mapped to creation of single HW object. So let it be two different layers - one for subscribing and parsing kernel structures, and another (prestera_router_hw.c) for HW objects relations tracking. There is two types of objects on router_hw layer: - Explicit objects (rif_entry) : created by higher layer. - Implicit objects (vr) : created on demand by explicit objects. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30net: marvell: prestera: Add prestera router infraYevhen Orlov4-1/+46
Add prestera_router.c, which contains code to subscribe/unsubscribe on kernel notifiers for router. This handle kernel notifications, parse structures to make key to manipulate prestera_router_hw's objects. Also prestera_router is container for router's objects database. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30net: marvell: prestera: Add router interface ABIYevhen Orlov3-0/+127
Add functions to enable routing on port, which is not in vlan. Also we can enable routing on vlan. prestera_hw_rif_create() take index of allocated virtual router. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30net: marvell: prestera: add virtual router ABIYevhen Orlov2-0/+46
Add functions and structures to allocate virtual router. prestera_hw_vr_create() return index of allocated VR so that we can move forward and also add another objects (e.g. router interface), which has link to VR. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-13/+22
include/net/sock.h commit 8f905c0e7354 ("inet: fully convert sk->sk_rx_dst to RCU rules") commit 43f51df41729 ("net: move early demux fields close to sk_refcnt") https://lore.kernel.org/all/20211222141641.0caa0ab3@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17net: marvell: prestera: fix incorrect structure accessYevhen Orlov1-7/+12
In line: upper = info->upper_dev; We access upper_dev field, which is related only for particular events (e.g. event == NETDEV_CHANGEUPPER). So, this line cause invalid memory access for another events, when ptr is not netdev_notifier_changeupper_info. The KASAN logs are as follows: [ 30.123165] BUG: KASAN: stack-out-of-bounds in prestera_netdev_port_event.constprop.0+0x68/0x538 [prestera] [ 30.133336] Read of size 8 at addr ffff80000cf772b0 by task udevd/778 [ 30.139866] [ 30.141398] CPU: 0 PID: 778 Comm: udevd Not tainted 5.16.0-rc3 #6 [ 30.147588] Hardware name: DNI AmazonGo1 A7040 board (DT) [ 30.153056] Call trace: [ 30.155547] dump_backtrace+0x0/0x2c0 [ 30.159320] show_stack+0x18/0x30 [ 30.162729] dump_stack_lvl+0x68/0x84 [ 30.166491] print_address_description.constprop.0+0x74/0x2b8 [ 30.172346] kasan_report+0x1e8/0x250 [ 30.176102] __asan_load8+0x98/0xe0 [ 30.179682] prestera_netdev_port_event.constprop.0+0x68/0x538 [prestera] [ 30.186847] prestera_netdev_event_handler+0x1b4/0x1c0 [prestera] [ 30.193313] raw_notifier_call_chain+0x74/0xa0 [ 30.197860] call_netdevice_notifiers_info+0x68/0xc0 [ 30.202924] register_netdevice+0x3cc/0x760 [ 30.207190] register_netdev+0x24/0x50 [ 30.211015] prestera_device_register+0x8a0/0xba0 [prestera] Fixes: 3d5048cc54bd ("net: marvell: prestera: move netdev topology validation to prestera_main") Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Link: https://lore.kernel.org/r/20211216171714.11341-1-yevhen.orlov@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17net: marvell: prestera: fix incorrect return of port_findYevhen Orlov1-6/+10
In case, when some ports is in list and we don't find requested - we return last iterator state and not return NULL as expected. Fixes: 501ef3066c89 ("net: marvell: prestera: Add driver for Prestera family ASIC devices") Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Link: https://lore.kernel.org/r/20211216170736.8851-1-yevhen.orlov@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16net: prestera: flower template supportVolodymyr Mytnyk6-3/+145
Add user template explicit support. At this moment, max TCAM rule size is utilized for all rules, doesn't matter which and how much flower matches are provided by user. It means that some of TCAM space is wasted, which impacts the number of filters that can be offloaded. Introducing the template, allows to have more HW offloaded filters by specifying the template explicitly. Example: tc qd add dev PORT clsact tc chain add dev PORT ingress protocol ip \ flower dst_ip 0.0.0.0/16 tc filter add dev PORT ingress protocol ip \ flower skip_sw dst_ip 1.2.3.4/16 action drop NOTE: chain 0 is the default chain id for "tc chain" & "tc filter" command, so it is omitted in the example above. This patch adds only template support for default chain 0 suppoerted by prestera driver at this moment. Chains are not supported yet, and will be added later. Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-06net: prestera: replace zero-length array with flexible-array memberJosé Expósito1-2/+2
One-element and zero-length arrays are deprecated and should be replaced with flexible-array members: https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays Replace zero-length array with flexible-array member and make use of the struct_size() helper. Link: https://github.com/KSPP/linux/issues/78 Signed-off-by: José Expósito <jose.exposito89@gmail.com> Reviewed-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Tested-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20211204171349.22776-1-jose.exposito89@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-03net: prestera: acl: fix return value check in prestera_acl_rule_entry_find()Yang Yingliang1-5/+2
rhashtable_lookup_fast() returns NULL pointer not ERR_PTR(). Return rhashtable_lookup_fast() directly to fix this. Fixes: 47327e198d42 ("net: prestera: acl: migrate to new vTCAM api") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30net: prestera: acl: add rule stats supportVolodymyr Mytnyk2-3/+48
Make flower to use counter API to get rule HW statistics. Co-developed-by: Serhiy Boiko <serhiy.boiko@marvell.com> Signed-off-by: Serhiy Boiko <serhiy.boiko@marvell.com> Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30net: prestera: add counter HW APIVolodymyr Mytnyk8-2/+682
Add counter API for getting HW statistics. - HW statistics gathered by this API are deleyed. - Batch of conters is supported. - acl stat is supported. Co-developed-by: Serhiy Boiko <serhiy.boiko@marvell.com> Signed-off-by: Serhiy Boiko <serhiy.boiko@marvell.com> Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30net: prestera: acl: migrate to new vTCAM apiVolodymyr Mytnyk9-662/+985
- Add new vTCAM HW API to configure HW ACLs. - Migrate acl to use new vTCAM HW API. - No counter support in this patch-set. Co-developed-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19net: marvell: prestera: fix double free issue on err pathVolodymyr Mytnyk1-4/+2
fix error path handling in prestera_bridge_port_join() that cases prestera driver to crash (see below). Trace: Internal error: Oops: 96000044 [#1] SMP Modules linked in: prestera_pci prestera uio_pdrv_genirq CPU: 1 PID: 881 Comm: ip Not tainted 5.15.0 #1 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : prestera_bridge_destroy+0x2c/0xb0 [prestera] lr : prestera_bridge_port_join+0x2cc/0x350 [prestera] sp : ffff800011a1b0f0 ... x2 : ffff000109ca6c80 x1 : dead000000000100 x0 : dead000000000122 Call trace: prestera_bridge_destroy+0x2c/0xb0 [prestera] prestera_bridge_port_join+0x2cc/0x350 [prestera] prestera_netdev_port_event.constprop.0+0x3c4/0x450 [prestera] prestera_netdev_event_handler+0xf4/0x110 [prestera] raw_notifier_call_chain+0x54/0x80 call_netdevice_notifiers_info+0x54/0xa0 __netdev_upper_dev_link+0x19c/0x380 Fixes: e1189d9a5fbe ("net: marvell: prestera: Add Switchdev driver implementation") Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19net: marvell: prestera: fix brige port operationVolodymyr Mytnyk1-1/+1
Return NOTIFY_DONE (dont't care) for switchdev notifications that prestera driver don't know how to handle them. With introduction of SWITCHDEV_BRPORT_[UN]OFFLOADED switchdev events, the driver rejects adding swport to bridge operation which is handled by prestera_bridge_port_join() func. The root cause of this is that prestera driver returns error (EOPNOTSUPP) in prestera_switchdev_blk_event() handler for unknown swdev events. This causes switchdev_bridge_port_offload() to fail when adding port to bridge in prestera_bridge_port_join(). Fixes: 957e2235e526 ("net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge") Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-11Merge tag 'net-5.16-rc1' of ↵Linus Torvalds4-67/+89
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and netfilter. Current release - regressions: - bpf: do not reject when the stack read size is different from the tracked scalar size - net: fix premature exit from NAPI state polling in napi_disable() - riscv, bpf: fix RV32 broken build, and silence RV64 warning Current release - new code bugs: - net: fix possible NULL deref in sock_reserve_memory - amt: fix error return code in amt_init(); fix stopping the workqueue - ax88796c: use the correct ioctl callback Previous releases - always broken: - bpf: stop caching subprog index in the bpf_pseudo_func insn - security: fixups for the security hooks in sctp - nfc: add necessary privilege flags in netlink layer, limit operations to admin only - vsock: prevent unnecessary refcnt inc for non-blocking connect - net/smc: fix sk_refcnt underflow on link down and fallback - nfnetlink_queue: fix OOB when mac header was cleared - can: j1939: ignore invalid messages per standard - bpf, sockmap: - fix race in ingress receive verdict with redirect to self - fix incorrect sk_skb data_end access when src_reg = dst_reg - strparser, and tls are reusing qdisc_skb_cb and colliding - ethtool: fix ethtool msg len calculation for pause stats - vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries to access an unregistering real_dev - udp6: make encap_rcv() bump the v6 not v4 stats - drv: prestera: add explicit padding to fix m68k build - drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge - drv: mvpp2: fix wrong SerDes reconfiguration order Misc & small latecomers: - ipvs: auto-load ipvs on genl access - mctp: sanity check the struct sockaddr_mctp padding fields - libfs: support RENAME_EXCHANGE in simple_rename() - avoid double accounting for pure zerocopy skbs" * tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (123 commits) selftests/net: udpgso_bench_rx: fix port argument net: wwan: iosm: fix compilation warning cxgb4: fix eeprom len when diagnostics not implemented net: fix premature exit from NAPI state polling in napi_disable() net/smc: fix sk_refcnt underflow on linkdown and fallback net/mlx5: Lag, fix a potential Oops with mlx5_lag_create_definer() gve: fix unmatched u64_stats_update_end() net: ethernet: lantiq_etop: Fix compilation error selftests: forwarding: Fix packet matching in mirroring selftests vsock: prevent unnecessary refcnt inc for nonblocking connect net: marvell: mvpp2: Fix wrong SerDes reconfiguration order net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory net: stmmac: allow a tc-taprio base-time of zero selftests: net: test_vxlan_under_vrf: fix HV connectivity test net: hns3: allow configure ETS bandwidth of all TCs net: hns3: remove check VF uc mac exist when set by PF net: hns3: fix some mac statistics is always 0 in device version V2 net: hns3: fix kernel crash when unload VF while it is being reset net: hns3: sync rx ring head in echo common pull net: hns3: fix pfc packet number incorrect after querying pfc parameters ...
2021-11-07net: marvell: prestera: fix hw structure laid outVolodymyr Mytnyk1-63/+68
The prestera FW v4.0 support commit has been merged accidentally w/o review comments addressed and waiting for the final patch set to be uploaded. So, fix the remaining comments related to structure laid out and build issues. Reported-by: kernel test robot <lkp@intel.com> Fixes: bb5dbf2cc64d ("net: marvell: prestera: add firmware v4.0 support") Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-06Merge tag 'pci-v5.16-changes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Conserve IRQs by setting up portdrv IRQs only when there are users (Jan Kiszka) - Rework and simplify _OSC negotiation for control of PCIe features (Joerg Roedel) - Remove struct pci_dev.driver pointer since it's redundant with the struct device.driver pointer (Uwe Kleine-König) Resource management: - Coalesce contiguous host bridge apertures from _CRS to accommodate BARs that cover more than one aperture (Kai-Heng Feng) Sysfs: - Check CAP_SYS_ADMIN before parsing user input (Krzysztof Wilczyński) - Return -EINVAL consistently from "store" functions (Krzysztof Wilczyński) - Use sysfs_emit() in endpoint "show" functions to avoid buffer overruns (Kunihiko Hayashi) PCIe native device hotplug: - Ignore Link Down/Up caused by resets during error recovery so endpoint drivers can remain bound to the device (Lukas Wunner) Virtualization: - Avoid bus resets on Atheros QCA6174, where they hang the device (Ingmar Klein) - Work around Pericom PI7C9X2G switch packet drop erratum by using store and forward mode instead of cut-through (Nathan Rossi) - Avoid trying to enable AtomicOps on VFs; the PF setting applies to all VFs (Selvin Xavier) MSI: - Document that /sys/bus/pci/devices/.../irq contains the legacy INTx interrupt or the IRQ of the first MSI (not MSI-X) vector (Barry Song) VPD: - Add pci_read_vpd_any() and pci_write_vpd_any() to access anywhere in the possible VPD space; use these to simplify the cxgb3 driver (Heiner Kallweit) Peer-to-peer DMA: - Add (not subtract) the bus offset when calculating DMA address (Wang Lu) ASPM: - Re-enable LTR at Downstream Ports so they don't report Unsupported Requests when reset or hot-added devices send LTR messages (Mingchuang Qiao) Apple PCIe controller driver: - Add driver for Apple M1 PCIe controller (Alyssa Rosenzweig, Marc Zyngier) Cadence PCIe controller driver: - Return success when probe succeeds instead of falling into error path (Li Chen) HiSilicon Kirin PCIe controller driver: - Reorganize PHY logic and add support for external PHY drivers (Mauro Carvalho Chehab) - Support PERST# GPIOs for HiKey970 external PEX 8606 bridge (Mauro Carvalho Chehab) - Add Kirin 970 support (Mauro Carvalho Chehab) - Make driver removable (Mauro Carvalho Chehab) Intel VMD host bridge driver: - If IOMMU supports interrupt remapping, leave VMD MSI-X remapping enabled (Adrian Huang) - Number each controller so we can tell them apart in /proc/interrupts (Chunguang Xu) - Avoid building on UML because VMD depends on x86 bare metal APIs (Johannes Berg) Marvell Aardvark PCIe controller driver: - Define macros for PCI_EXP_DEVCTL_PAYLOAD_* (Pali Rohár) - Set Max Payload Size to 512 bytes per Marvell spec (Pali Rohár) - Downgrade PIO Response Status messages to debug level (Marek Behún) - Preserve CRS SV (Config Request Retry Software Visibility) bit in emulated Root Control register (Pali Rohár) - Fix issue in configuring reference clock (Pali Rohár) - Don't clear status bits for masked interrupts (Pali Rohár) - Don't mask unused interrupts (Pali Rohár) - Avoid code repetition in advk_pcie_rd_conf() (Marek Behún) - Retry config accesses on CRS response (Pali Rohár) - Simplify emulated Root Capabilities initialization (Pali Rohár) - Fix several link training issues (Pali Rohár) - Fix link-up checking via LTSSM (Pali Rohár) - Fix reporting of Data Link Layer Link Active (Pali Rohár) - Fix emulation of W1C bits (Marek Behún) - Fix MSI domain .alloc() method to return zero on success (Marek Behún) - Read entire 16-bit MSI vector in MSI handler, not just low 8 bits (Marek Behún) - Clear Root Port I/O Space, Memory Space, and Bus Master Enable bits at startup; PCI core will set those as necessary (Pali Rohár) - When operating as a Root Port, set class code to "PCI Bridge" instead of the default "Mass Storage Controller" (Pali Rohár) - Add emulation for PCI_BRIDGE_CTL_BUS_RESET since aardvark doesn't implement this per spec (Pali Rohár) - Add emulation of option ROM BAR since aardvark doesn't implement this per spec (Pali Rohár) MediaTek MT7621 PCIe controller driver: - Add MediaTek MT7621 PCIe host controller driver and DT binding (Sergio Paracuellos) Qualcomm PCIe controller driver: - Add SC8180x compatible string (Bjorn Andersson) - Add endpoint controller driver and DT binding (Manivannan Sadhasivam) - Restructure to use of_device_get_match_data() (Prasad Malisetty) - Add SC7280-specific pcie_1_pipe_clk_src handling (Prasad Malisetty) Renesas R-Car PCIe controller driver: - Remove unnecessary includes (Geert Uytterhoeven) Rockchip DesignWare PCIe controller driver: - Add DT binding (Simon Xue) Socionext UniPhier Pro5 controller driver: - Serialize INTx masking/unmasking (Kunihiko Hayashi) Synopsys DesignWare PCIe controller driver: - Run dwc .host_init() method before registering MSI interrupt handler so we can deal with pending interrupts left by bootloader (Bjorn Andersson) - Clean up Kconfig dependencies (Andy Shevchenko) - Export symbols to allow more modular drivers (Luca Ceresoli) TI DRA7xx PCIe controller driver: - Allow host and endpoint drivers to be modules (Luca Ceresoli) - Enable external clock if present (Luca Ceresoli) TI J721E PCIe driver: - Disable PHY when probe fails after initializing it (Christophe JAILLET) MicroSemi Switchtec management driver: - Return error to application when command execution fails because an out-of-band reset has cleared the device BARs, Memory Space Enable, etc (Kelvin Cao) - Fix MRPC error status handling issue (Kelvin Cao) - Mask out other bits when reading of management VEP instance ID (Kelvin Cao) - Return EOPNOTSUPP instead of ENOTSUPP from sysfs show functions (Kelvin Cao) - Add check of event support (Logan Gunthorpe) Miscellaneous: - Remove unused pci_pool wrappers, which have been replaced by dma_pool (Cai Huoqing) - Use 'unsigned int' instead of bare 'unsigned' (Krzysztof Wilczyński) - Use kstrtobool() directly, sans strtobool() wrapper (Krzysztof Wilczyński) - Fix some sscanf(), sprintf() format mismatches (Krzysztof Wilczyński) - Update PCI subsystem information in MAINTAINERS (Krzysztof Wilczyński) - Correct some misspellings (Krzysztof Wilczyński)" * tag 'pci-v5.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (137 commits) PCI: Add ACS quirk for Pericom PI7C9X2G switches PCI: apple: Configure RID to SID mapper on device addition iommu/dart: Exclude MSI doorbell from PCIe device IOVA range PCI: apple: Implement MSI support PCI: apple: Add INTx and per-port interrupt support PCI: kirin: Allow removing the driver PCI: kirin: De-init the dwc driver PCI: kirin: Disable clkreq during poweroff sequence PCI: kirin: Move the power-off code to a common routine PCI: kirin: Add power_off support for Kirin 960 PHY PCI: kirin: Allow building it as a module PCI: kirin: Add MODULE_* macros PCI: kirin: Add Kirin 970 compatible PCI: kirin: Support PERST# GPIOs for HiKey970 external PEX 8606 bridge PCI: apple: Set up reference clocks when probing PCI: apple: Add initial hardware bring-up PCI: of: Allow matching of an interrupt-map local to a PCI device of/irq: Allow matching of an interrupt-map local to an interrupt controller irqdomain: Make of_phandle_args_to_fwspec() generally available PCI: Do not enable AtomicOps on VFs ...
2021-11-05net: marvell: prestera: fix patchwork build problemsVolodymyr Mytnyk4-5/+10
fix the remaining build issues reported by patchwork in firmware v4.0 support commit which has been already merged. Fix patchwork issues: - source inline - checkpatch Fixes: bb5dbf2cc64d ("net: marvell: prestera: add firmware v4.0 support") Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-02net: marvell: prestera: Add explicit paddingGeert Uytterhoeven1-0/+12
On m68k: In function ‘prestera_hw_build_tests’, inlined from ‘prestera_hw_switch_init’ at drivers/net/ethernet/marvell/prestera/prestera_hw.c:788:2: ././include/linux/compiler_types.h:335:38: error: call to ‘__compiletime_assert_345’ declared with attribute error: BUILD_BUG_ON failed: sizeof(struct prestera_msg_switch_attr_req) != 16 ... The driver assumes structure members are naturally aligned, but does not add explicit padding, thus breaking architectures where integral values are not always naturally aligned (e.g. on m68k, __alignof(int) is 2, not 4). Fixes: bb5dbf2cc64d5cfa ("net: marvell: prestera: add firmware v4.0 support") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211102082433.3820514-1-geert@linux-m68k.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-29net: marvell: prestera: add firmware v4.0 supportVolodymyr Mytnyk8-719/+951
Add firmware (FW) version 4.0 support for Marvell Prestera driver. Major changes have been made to new v4.0 FW ABI to add support of new features, introduce the stability of the FW ABI and ensure better forward compatibility for the future driver vesrions. Current v4.0 FW feature set support does not expect any changes to ABI, as it was defined and tested through long period of time. The ABI may be extended in case of new features, but it will not break the backward compatibility. ABI major changes done in v4.0: - L1 ABI, where MAC and PHY API configuration are split. - ACL has been split to low-level TCAM and Counters ABI to provide more HW ACL capabilities for future driver versions. To support backward support, the addition compatibility layer is required in the driver which will have two different codebase under "if FW-VER elif FW-VER else" conditions that will be removed in the future anyway, So, the idea was to break backward support and focus on more stable FW instead of supporting old version with very minimal and limited set of features/capabilities. Improve FW msg validation: * Use __le64, __le32, __le16 types in msg to/from FW to catch endian mismatch by sparse. * Use BUILD_BUG_ON for structures sent/recv to/from FW. Co-developed-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19ethernet: prestera: use eth_hw_addr_gen()Jakub Kicinski1-2/+5
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Vadym and Taras report that the current behavior of the driver is not exactly expected and it's better to add the port id in like other drivers do. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-12net: marvell: prestera: use dev_driver_string() instead of pci_dev->driver->nameUwe Kleine-König1-1/+1
Replace dev->driver_name() by dev_driver_string() for the corresponding struct device. This is a step toward removing pci_dev->driver. [bhelgaas: split to separate patch] Link: https://lore.kernel.org/r/20211004125935.2300113-8-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-10-02ethernet: use eth_hw_addr_set() instead of ether_addr_copy()Jakub Kicinski1-1/+1
Convert Ethernet from ether_addr_copy() to eth_hw_addr_set(): @@ expression dev, np; @@ - ether_addr_copy(dev->dev_addr, np) + eth_hw_addr_set(dev, np) Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27net/prestera: Split devlink and traps registrations to separate routinesLeon Romanovsky3-27/+14
Separate devlink registrations and traps registrations so devlink will be registered when driver is fully initialized. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-22devlink: Make devlink_register to be voidLeon Romanovsky1-5/+1
devlink_register() can't fail and always returns success, but all drivers are obligated to check returned status anyway. This adds a lot of boilerplate code to handle impossible flow. Make devlink_register() void and simplify the drivers that use that API call. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Simon Horman <simon.horman@corigine.com> Acked-by: Vladimir Oltean <olteanv@gmail.com> # dsa Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-2/+2
Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h 9e26680733d5 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp") 9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins") 099fdeda659d ("bnxt_en: Event handler for PPS events") kernel/bpf/helpers.c include/linux/bpf-cgroup.h a2baf4e8bb0f ("bpf: Fix potentially incorrect results with bpf_get_local_storage()") c7603cfa04e7 ("bpf: Add ambient BPF runtime context stored in current") drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c 5957cc557dc5 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray") 2d0b41a37679 ("net/mlx5: Refcount mlx5_irq with integer") MAINTAINERS 7b637cd52f02 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo") 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-10net: switchdev: zero-initialize struct switchdev_notifier_fdb_info emitted ↵Vladimir Oltean1-2/+2
by drivers towards the bridge The blamed commit added a new field to struct switchdev_notifier_fdb_info, but did not make sure that all call paths set it to something valid. For example, a switchdev driver may emit a SWITCHDEV_FDB_ADD_TO_BRIDGE notifier, and since the 'is_local' flag is not set, it contains junk from the stack, so the bridge might interpret those notifications as being for local FDB entries when that was not intended. To avoid that now and in the future, zero-initialize all switchdev_notifier_fdb_info structures created by drivers such that all newly added fields to not need to touch drivers again. Fixes: 2c4eca3ef716 ("net: bridge: switchdev: include local flag in FDB notifications") Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/20210810115024.1629983-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-09devlink: Set device as early as possibleLeon Romanovsky3-5/+6
All kernel devlink implementations call to devlink_alloc() during initialization routine for specific device which is used later as a parent device for devlink_register(). Such late device assignment causes to the situation which requires us to call to device_register() before setting other parameters, but that call opens devlink to the world and makes accessible for the netlink users. Any attempt to move devlink_register() to be the last call generates the following error due to access to the devlink->dev pointer. [ 8.758862] devlink_nl_param_fill+0x2e8/0xe50 [ 8.760305] devlink_param_notify+0x6d/0x180 [ 8.760435] __devlink_params_register+0x2f1/0x670 [ 8.760558] devlink_params_register+0x1e/0x20 The simple change of API to set devlink device in the devlink_alloc() instead of devlink_register() fixes all this above and ensures that prior to call to devlink_register() everything already set. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-0/+2
Build failure in drivers/net/wwan/mhi_wwan_mbim.c: add missing parameter (0, assuming we don't want buffer pre-alloc). Conflict in drivers/net/dsa/sja1105/sja1105_main.c between: 589918df9322 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") 0fac6aa098ed ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode") Follow the instructions from the commit message of the former commit - removed the if conditions. When looking at commit 589918df9322 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") note that the mask_iotag fields get removed by the following patch. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-04net/prestera: Fix devlink groups leakage in error flowLeon Romanovsky1-0/+2
Devlink trap group is registered but not released in error flow, add the missing devlink_trap_groups_unregister() call. Fixes: 0a9003f45e91 ("net: marvell: prestera: devlink: add traps/groups implementation") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23net: bridge: switchdev: allow the TX data plane forwarding to be offloadedTobias Waldekranz1-1/+1
Allow switchdevs to forward frames from the CPU in accordance with the bridge configuration in the same way as is done between bridge ports. This means that the bridge will only send a single skb towards one of the ports under the switchdev's control, and expects the driver to deliver the packet to all eligible ports in its domain. Primarily this improves the performance of multicast flows with multiple subscribers, as it allows the hardware to perform the frame replication. The basic flow between the driver and the bridge is as follows: - When joining a bridge port, the switchdev driver calls switchdev_bridge_port_offload() with tx_fwd_offload = true. - The bridge sends offloadable skbs to one of the ports under the switchdev's control using skb->offload_fwd_mark = true. - The switchdev driver checks the skb->offload_fwd_mark field and lets its FDB lookup select the destination port mask for this packet. v1->v2: - convert br_input_skb_cb::fwd_hwdoms to a plain unsigned long - introduce a static key "br_switchdev_fwd_offload_used" to minimize the impact of the newly introduced feature on all the setups which don't have hardware that can make use of it - introduce a check for nbp->flags & BR_FWD_OFFLOAD to optimize cache line access - reorder nbp_switchdev_frame_mark_accel() and br_handle_vlan() in __br_forward() - do not strip VLAN on egress if forwarding offload on VLAN-aware bridge is being used - propagate errors from .ndo_dfwd_add_station() if not EOPNOTSUPP v2->v3: - replace the solution based on .ndo_dfwd_add_station with a solution based on switchdev_bridge_port_offload - rename BR_FWD_OFFLOAD to BR_TX_FWD_OFFLOAD v3->v4: rebase v4->v5: - make sure the static key is decremented on bridge port unoffload - more function and variable renaming and comments for them: br_switchdev_fwd_offload_used to br_switchdev_tx_fwd_offload br_switchdev_accels_skb to br_switchdev_frame_uses_tx_fwd_offload nbp_switchdev_frame_mark_tx_fwd to nbp_switchdev_frame_mark_tx_fwd_to_hwdom nbp_switchdev_frame_mark_accel to nbp_switchdev_frame_mark_tx_fwd_offload fwd_accel to tx_fwd_offload Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-22net: bridge: move the switchdev object replay helpers to "push" modeVladimir Oltean1-3/+4
Starting with commit 4f2673b3a2b6 ("net: bridge: add helper to replay port and host-joined mdb entries"), DSA has introduced some bridge helpers that replay switchdev events (FDB/MDB/VLAN additions and deletions) that can be lost by the switchdev drivers in a variety of circumstances: - an IP multicast group was host-joined on the bridge itself before any switchdev port joined the bridge, leading to the host MDB entries missing in the hardware database. - during the bridge creation process, the MAC address of the bridge was added to the FDB as an entry pointing towards the bridge device itself, but with no switchdev ports being part of the bridge yet, this local FDB entry would remain unknown to the switchdev hardware database. - a VLAN/FDB/MDB was added to a bridge port that is a LAG interface, before any switchdev port joined that LAG, leading to the hardware database missing those entries. - a switchdev port left a LAG that is a bridge port, while the LAG remained part of the bridge, and all FDB/MDB/VLAN entries remained installed in the hardware database of the switchdev port. Also, since commit 0d2cfbd41c4a ("net: bridge: ignore switchdev events for LAG ports which didn't request replay"), DSA introduced a method, based on a const void *ctx, to ensure that two switchdev ports under the same LAG that is a bridge port do not see the same MDB/VLAN entry being replayed twice by the bridge, once for every bridge port that joins the LAG. With so many ordering corner cases being possible, it seems unreasonable to expect a switchdev driver writer to get it right from the first try. Therefore, now that DSA has experimented with the bridge replay helpers for a little bit, we can move the code to the bridge driver where it is more readily available to all switchdev drivers. To convert the switchdev object replay helpers from "pull mode" (where the driver asks for them) to a "push mode" (where the bridge offers them automatically), the biggest problem is that the bridge needs to be aware when a switchdev port joins and leaves, even when the switchdev is only indirectly a bridge port (for example when the bridge port is a LAG upper of the switchdev). Luckily, we already have a hook for that, in the form of the newly introduced switchdev_bridge_port_offload() and switchdev_bridge_port_unoffload() calls. These offer a natural place for hooking the object addition and deletion replays. Extend the above 2 functions with: - pointers to the switchdev atomic notifier (for FDB replays) and the blocking notifier (for MDB and VLAN replays). - the "const void *ctx" argument required for drivers to be able to disambiguate between which port is targeted, when multiple ports are lowers of the same LAG that is a bridge port. Most of the drivers pass NULL to this argument, except the ones that support LAG offload and have the proper context check already in place in the switchdev blocking notifier handler. Also unexport the replay helpers, since nobody except the bridge calls them directly now. Note that: (a) we abuse the terminology slightly, because FDB entries are not "switchdev objects", but we count them as objects nonetheless. With no direct way to prove it, I think they are not modeled as switchdev objects because those can only be installed by the bridge to the hardware (as opposed to FDB entries which can be propagated in the other direction too). This is merely an abuse of terms, FDB entries are replayed too, despite not being objects. (b) the bridge does not attempt to sync port attributes to newly joined ports, just the countable stuff (the objects). The reason for this is simple: no universal and symmetric way to sync and unsync them is known. For example, VLAN filtering: what to do on unsync, disable or leave it enabled? Similarly, STP state, ageing timer, etc etc. What a switchdev port does when it becomes standalone again is not really up to the bridge's competence, and the driver should deal with it. On the other hand, replaying deletions of switchdev objects can be seen a matter of cleanup and therefore be treated by the bridge, hence this patch. We make the replay helpers opt-in for drivers, because they might not bring immediate benefits for them: - nbp_vlan_init() is called _after_ netdev_master_upper_dev_link(), so br_vlan_replay() should not do anything for the new drivers on which we call it. The existing drivers where there was even a slight possibility for there to exist a VLAN on a bridge port before they join it are already guarded against this: mlxsw and prestera deny joining LAG interfaces that are members of a bridge. - br_fdb_replay() should now notify of local FDB entries, but I patched all drivers except DSA to ignore these new entries in commit 2c4eca3ef716 ("net: bridge: switchdev: include local flag in FDB notifications"). Driver authors can lift this restriction as they wish, and when they do, they can also opt into the FDB replay functionality. - br_mdb_replay() should fix a real issue which is described in commit 4f2673b3a2b6 ("net: bridge: add helper to replay port and host-joined mdb entries"). However most drivers do not offload the SWITCHDEV_OBJ_ID_HOST_MDB to see this issue: only cpsw and am65_cpsw offload this switchdev object, and I don't completely understand the way in which they offload this switchdev object anyway. So I'll leave it up to these drivers' respective maintainers to opt into br_mdb_replay(). So most of the drivers pass NULL notifier blocks for the replay helpers, except: - dpaa2-switch which was already acked/regression-tested with the helpers enabled (and there isn't much of a downside in having them) - ocelot which already had replay logic in "pull" mode - DSA which already had replay logic in "pull" mode An important observation is that the drivers which don't currently request bridge event replays don't even have the switchdev_bridge_port_{offload,unoffload} calls placed in proper places right now. This was done to avoid unnecessary rework for drivers which might never even add support for this. For driver writers who wish to add replay support, this can be used as a tentative placement guide: https://patchwork.kernel.org/project/netdevbpf/patch/20210720134655.892334-11-vladimir.oltean@nxp.com/ Cc: Vadym Kochan <vkochan@marvell.com> Cc: Taras Chornyi <tchornyi@marvell.com> Cc: Ioana Ciornei <ioana.ciornei@nxp.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Cc: Steen Hegelund <Steen.Hegelund@microchip.com> Cc: UNGLinuxDriver@microchip.com Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-22net: bridge: switchdev: let drivers inform which bridge ports are offloadedVladimir Oltean3-3/+14
On reception of an skb, the bridge checks if it was marked as 'already forwarded in hardware' (checks if skb->offload_fwd_mark == 1), and if it is, it assigns the source hardware domain of that skb based on the hardware domain of the ingress port. Then during forwarding, it enforces that the egress port must have a different hardware domain than the ingress one (this is done in nbp_switchdev_allowed_egress). Non-switchdev drivers don't report any physical switch id (neither through devlink nor .ndo_get_port_parent_id), therefore the bridge assigns them a hardware domain of 0, and packets coming from them will always have skb->offload_fwd_mark = 0. So there aren't any restrictions. Problems appear due to the fact that DSA would like to perform software fallback for bonding and team interfaces that the physical switch cannot offload. +-- br0 ---+ / / | \ / / | \ / | | bond0 / | | / \ swp0 swp1 swp2 swp3 swp4 There, it is desirable that the presence of swp3 and swp4 under a non-offloaded LAG does not preclude us from doing hardware bridging beteen swp0, swp1 and swp2. The bandwidth of the CPU is often times high enough that software bridging between {swp0,swp1,swp2} and bond0 is not impractical. But this creates an impossible paradox given the current way in which port hardware domains are assigned. When the driver receives a packet from swp0 (say, due to flooding), it must set skb->offload_fwd_mark to something. - If we set it to 0, then the bridge will forward it towards swp1, swp2 and bond0. But the switch has already forwarded it towards swp1 and swp2 (not to bond0, remember, that isn't offloaded, so as far as the switch is concerned, ports swp3 and swp4 are not looking up the FDB, and the entire bond0 is a destination that is strictly behind the CPU). But we don't want duplicated traffic towards swp1 and swp2, so it's not ok to set skb->offload_fwd_mark = 0. - If we set it to 1, then the bridge will not forward the skb towards the ports with the same switchdev mark, i.e. not to swp1, swp2 and bond0. Towards swp1 and swp2 that's ok, but towards bond0? It should have forwarded the skb there. So the real issue is that bond0 will be assigned the same hardware domain as {swp0,swp1,swp2}, because the function that assigns hardware domains to bridge ports, nbp_switchdev_add(), recurses through bond0's lower interfaces until it finds something that implements devlink (calls dev_get_port_parent_id with bool recurse = true). This is a problem because the fact that bond0 can be offloaded by swp3 and swp4 in our example is merely an assumption. A solution is to give the bridge explicit hints as to what hardware domain it should use for each port. Currently, the bridging offload is very 'silent': a driver registers a netdevice notifier, which is put on the netns's notifier chain, and which sniffs around for NETDEV_CHANGEUPPER events where the upper is a bridge, and the lower is an interface it knows about (one registered by this driver, normally). Then, from within that notifier, it does a bunch of stuff behind the bridge's back, without the bridge necessarily knowing that there's somebody offloading that port. It looks like this: ip link set swp0 master br0 | v br_add_if() calls netdev_master_upper_dev_link() | v call_netdevice_notifiers | v dsa_slave_netdevice_event | v oh, hey! it's for me! | v .port_bridge_join What we do to solve the conundrum is to be less silent, and change the switchdev drivers to present themselves to the bridge. Something like this: ip link set swp0 master br0 | v br_add_if() calls netdev_master_upper_dev_link() | v bridge: Aye! I'll use this call_netdevice_notifiers ^ ppid as the | | hardware domain for v | this port, and zero dsa_slave_netdevice_event | if I got nothing. | | v | oh, hey! it's for me! | | | v | .port_bridge_join | | | +------------------------+ switchdev_bridge_port_offload(swp0, swp0) Then stacked interfaces (like bond0 on top of swp3/swp4) would be treated differently in DSA, depending on whether we can or cannot offload them. The offload case: ip link set bond0 master br0 | v br_add_if() calls netdev_master_upper_dev_link() | v bridge: Aye! I'll use this call_netdevice_notifiers ^ ppid as the | | switchdev mark for v | bond0. dsa_slave_netdevice_event | Coincidentally (or not), | | bond0 and swp0, swp1, swp2 v | all have the same switchdev hmm, it's not quite for me, | mark now, since the ASIC but my driver has already | is able to forward towards called .port_lag_join | all these ports in hw. for it, because I have | a port with dp->lag_dev == bond0. | | | v | .port_bridge_join | for swp3 and swp4 | | | +------------------------+ switchdev_bridge_port_offload(bond0, swp3) switchdev_bridge_port_offload(bond0, swp4) And the non-offload case: ip link set bond0 master br0 | v br_add_if() calls netdev_master_upper_dev_link() | v bridge waiting: call_netdevice_notifiers ^ huh, switchdev_bridge_port_offload | | wasn't called, okay, I'll use a v | hwdom of zero for this one. dsa_slave_netdevice_event : Then packets received on swp0 will | : not be software-forwarded towards v : swp1, but they will towards bond0. it's not for me, but bond0 is an upper of swp3 and swp4, but their dp->lag_dev is NULL because they couldn't offload it. Basically we can draw the conclusion that the lowers of a bridge port can come and go, so depending on the configuration of lowers for a bridge port, it can dynamically toggle between offloaded and unoffloaded. Therefore, we need an equivalent switchdev_bridge_port_unoffload too. This patch changes the way any switchdev driver interacts with the bridge. From now on, everybody needs to call switchdev_bridge_port_offload and switchdev_bridge_port_unoffload, otherwise the bridge will treat the port as non-offloaded and allow software flooding to other ports from the same ASIC. Note that these functions lay the ground for a more complex handshake between switchdev drivers and the bridge in the future. For drivers that will request a replay of the switchdev objects when they offload and unoffload a bridge port (DSA, dpaa2-switch, ocelot), we place the call to switchdev_bridge_port_unoffload() strategically inside the NETDEV_PRECHANGEUPPER notifier's code path, and not inside NETDEV_CHANGEUPPER. This is because the switchdev object replay helpers need the netdev adjacency lists to be valid, and that is only true in NETDEV_PRECHANGEUPPER. Cc: Vadym Kochan <vkochan@marvell.com> Cc: Taras Chornyi <tchornyi@marvell.com> Cc: Ioana Ciornei <ioana.ciornei@nxp.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Cc: Steen Hegelund <Steen.Hegelund@microchip.com> Cc: UNGLinuxDriver@microchip.com Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch: regression Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com> # ocelot-switch Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: switchdev: add a context void pointer to struct switchdev_notifier_infoVladimir Oltean1-3/+3
In the case where the driver asks for a replay of a certain type of event (port object or attribute) for a bridge port that is a LAG, it may do so because this port has just joined the LAG. But there might already be other switchdev ports in that LAG, and it is preferable that those preexisting switchdev ports do not act upon the replayed event. The solution is to add a context to switchdev events, which is NULL most of the time (when the bridge layer initiates the call) but which can be set to a value controlled by the switchdev driver when a replay is requested. The driver can then check the context to figure out if all ports within the LAG should act upon the switchdev event, or just the ones that match the context. We have to modify all switchdev_handle_* helper functions as well as the prototypes in the drivers that use these helpers too, because these helpers hide the underlying struct switchdev_notifier_info from us and there is no way to retrieve the context otherwise. The context structure will be populated and used in later patches. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16net: marvell: prestera: Add matchall supportSerhiy Boiko10-1/+367
- Introduce matchall filter support - Add SPAN API to configure port mirroring. - Add tc mirror action. At this moment, only mirror (egress) action is supported. Example: tc filter ... action mirred egress mirror dev DEV Co-developed-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Signed-off-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16net: marvell: Implement TC flower offloadSerhiy Boiko11-2/+1404
Add ACL infrastructure for Prestera Switch ASICs family devices to offload cls_flower rules to be processed in the HW. ACL implementation is based on tc filter api. The flower classifier is supported to configure ACL rules/matches/action. Supported actions: - drop - trap - pass Supported dissector keys: - indev - src_mac - dst_mac - src_ip - dst_ip - ip_proto - src_port - dst_port - vlan_id - vlan_ethtype - icmp type/code Co-developed-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Signed-off-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14net: marvell: prestera: devlink: add traps with DROP actionOleksandr Mazur3-0/+137
Add traps that have init_action being set to DROP. Add 'trap_drop_counter_get' (devlink API) callback implementation, that is used to get number of packets that have been dropped by the HW (traps with action 'DROP'). Add new FW command CPU_CODE_COUNTERS_GET. Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14net: marvell: prestera: devlink: add traps/groups implementationOleksandr Mazur6-3/+452
Add devlink traps registration (with corresponding groups) for all the traffic types that driver traps to the CPU; prestera_rxtx: report each packet trapped to the CPU (RX) to the prestera_devlink; Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: marvell: prestera: add LAG supportSerhiy Boiko5-31/+531
The following features are supported: - LAG basic operations - create/delete LAG - add/remove a member to LAG - enable/disable member in LAG - LAG Bridge support - LAG VLAN support - LAG FDB support Limitations: - Only HASH lag tx type is supported - The Hash parameters are not configurable. They are applied during the LAG creation stage. - Enslaving a port to the LAG device that already has an upper device is not supported. Co-developed-by: Andrii Savka <andrii.savka@plvision.eu> Signed-off-by: Andrii Savka <andrii.savka@plvision.eu> Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Co-developed-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: marvell: prestera: do not propagate netdev events to prestera_switchdev.cVadym Kochan3-37/+19
Replace prestera_bridge_port_event(...) by prestera_bridge_port_join(...) and prestera_bridge_port_leave(). It simplifies the code by reading netdev event specific handling only once in prestera_main.c Signed-off-by: Vadym Kochan <vkochan@marvell.com> CC: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net: marvell: prestera: move netdev topology validation to prestera_mainVadym Kochan2-23/+26
Move handling of PRECHANGEUPPER event from prestera_switchdev to prestera_main which is responsible for basic netdev events handling and routing them to related module. Signed-off-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01net: marvell: prestera: try to load previous fw versionVadym Kochan1-22/+61
Lets try to load previous fw version in case the latest one is missing on existing system. Signed-off-by: Vadym Kochan <vkochan@marvell.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01net: marvell: prestera: bump supported firmware version to 3.0Vadym Kochan1-1/+1
New firmware version has some ABI and feature changes like: - LAG support - initial L3 support - changed events handling logic Signed-off-by: Vadym Kochan <vkochan@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>