summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2022-04-14rndis_host: enable the bogus MAC fixup for ZTE devices from cdc_etherLech Perczak2-0/+33
Certain ZTE modems, namely: MF823. MF831, MF910, built-in modem from MF286R, expose both CDC-ECM and RNDIS network interfaces. They have a trait of ignoring the locally-administered MAC address configured on the interface both in CDC-ECM and RNDIS part, and this leads to dropping of incoming traffic by the host. However, the workaround was only present in CDC-ECM, and MF286R explicitly requires it in RNDIS mode. Re-use the workaround in rndis_host as well, to fix operation of MF286R module, some versions of which expose only the RNDIS interface. Do so by introducing new flag, RNDIS_DRIVER_DATA_DST_MAC_FIXUP, and testing for it in rndis_rx_fixup. This is required, as RNDIS uses frame batching, and all of the packets inside the batch need the fixup. This might introduce a performance penalty, because test is done for every returned Ethernet frame. Apply the workaround to both "flavors" of RNDIS interfaces, as older ZTE modems, like MF823 found in the wild, report the USB_CLASS_COMM class interfaces, while MF286R reports USB_CLASS_WIRELESS_CONTROLLER. Suggested-by: Bjørn Mork <bjorn@mork.no> Cc: Kristian Evensen <kristian.evensen@gmail.com> Cc: Oliver Neukum <oliver@neukum.org> Signed-off-by: Lech Perczak <lech.perczak@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-14cdc_ether: export usbnet_cdc_zte_rx_fixupLech Perczak2-1/+3
Commit bfe9b9d2df66 ("cdc_ether: Improve ZTE MF823/831/910 handling") introduces a workaround for certain ZTE modems reporting invalid MAC addresses over CDC-ECM. The same issue was present on their RNDIS interface,which was fixed in commit a5a18bdf7453 ("rndis_host: Set valid random MAC on buggy devices"). However, internal modem of ZTE MF286R router, on its RNDIS interface, also exhibits a second issue fixed already in CDC-ECM, of the device not respecting configured random MAC address. In order to share the fixup for this with rndis_host driver, export the workaround function, which will be re-used in the following commit in rndis_host. Cc: Kristian Evensen <kristian.evensen@gmail.com> Cc: Bjørn Mork <bjorn@mork.no> Cc: Oliver Neukum <oliver@neukum.org> Signed-off-by: Lech Perczak <lech.perczak@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-13nfp: update nfp_X logging definitionsDylan Muller1-6/+20
Previously it was not possible to determine which code path was responsible for generating a certain message after a call to the nfp_X messaging definitions for cases of duplicate strings. We therefore modify nfp_err, nfp_warn, nfp_info, nfp_dbg and nfp_printk to print the corresponding file and line number where the nfp_X definition is used. Signed-off-by: Dylan Muller <dylan.muller@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13Merge branch 'ip-ingress-skb-reason'David S. Miller10-42/+113
Menglong Dong says: ==================== net: ip: add skb drop reasons to ip ingress In the series "net: use kfree_skb_reason() for ip/udp packet receive", skb drop reasons are added to the basic ingress path of IPv4. And in the series "net: use kfree_skb_reason() for ip/neighbour", the egress paths of IPv4 and IPv6 are handled. Related links: https://lore.kernel.org/netdev/20220205074739.543606-1-imagedong@tencent.com/ https://lore.kernel.org/netdev/20220226041831.2058437-1-imagedong@tencent.com/ Seems we still have a lot work to do with IP layer, including IPv6 basic ingress path, IPv4/IPv6 forwarding, IPv6 exthdrs, fragment and defrag, etc. In this series, skb drop reasons are added to the basic ingress path of IPv6 protocol and IPv4/IPv6 packet forwarding. Following functions, which are used for IPv6 packet receiving are handled: ip6_pkt_drop() ip6_rcv_core() ip6_protocol_deliver_rcu() And following functions that used for IPv6 TLV parse are handled: ip6_parse_tlv() ipv6_hop_ra() ipv6_hop_ioam() ipv6_hop_jumbo() ipv6_hop_calipso() ipv6_dest_hao() Besides, ip_forward() and ip6_forward(), which are used for IPv4/IPv6 forwarding, are also handled. And following new drop reasons are added: /* host unreachable, corresponding to IPSTATS_MIB_INADDRERRORS */ SKB_DROP_REASON_IP_INADDRERRORS /* network unreachable, corresponding to IPSTATS_MIB_INADDRERRORS */ SKB_DROP_REASON_IP_INNOROUTES /* packet size is too big, corresponding to * IPSTATS_MIB_INTOOBIGERRORS */ SKB_DROP_REASON_PKT_TOO_BIG In order to simply the definition and assignment for 'enum skb_drop_reason', some helper functions are introduced in the 1th patch. I'm not such if this is necessary, but it makes the code simpler. For example, we can replace the code: if (reason == SKB_DROP_REASON_NOT_SPECIFIED) reason = SKB_DROP_REASON_IP_INHDR; with: SKB_DR_OR(reason, IP_INHDR); In the 6th patch, the statistics for skb in ipv6_hop_jum() is removed, as I think it is redundant. There are two call chains for ipv6_hop_jumbo(). The first one is: ipv6_destopt_rcv() -> ip6_parse_tlv() -> ipv6_hop_jumbo() On this call chain, the drop statistics will be done in ipv6_destopt_rcv() with 'IPSTATS_MIB_INHDRERRORS' if ipv6_hop_jumbo() returns false. The second call chain is: ip6_rcv_core() -> ipv6_parse_hopopts() -> ip6_parse_tlv() And the drop statistics will also be done in ip6_rcv_core() with 'IPSTATS_MIB_INHDRERRORS' if ipv6_hop_jumbo() returns false. Therefore, the statistics in ipv6_hop_jumbo() is redundant, which means the drop is counted twice. The statistics in ipv6_hop_jumbo() is almost the same as the outside, except the 'IPSTATS_MIB_INTRUNCATEDPKTS', which seems that we have to ignore it. ====================================================================== Here is a basic test for IPv6 forwarding packet drop that monitored by 'dropwatch' tool: drop at: ip6_forward+0x81a/0xb70 (0xffffffff86c73f8a) origin: software input port ifindex: 7 timestamp: Wed Apr 13 11:51:06 2022 130010176 nsec protocol: 0x86dd length: 94 original length: 94 drop reason: IP_INADDRERRORS The origin cause of this case is that IPv6 doesn't allow to forward the packet with LOCAL-LINK saddr, and results the 'IP_INADDRERRORS' drop reason. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ipv6: add skb drop reasons to ip6_protocol_deliver_rcu()Menglong Dong1-4/+12
Replace kfree_skb() used in ip6_protocol_deliver_rcu() with kfree_skb_reason(). No new reasons are added. Some paths are ignored, as they are not common, such as encapsulation on non-final protocol. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ipv6: add skb drop reasons to ip6_rcv_core()Menglong Dong1-8/+16
Replace kfree_skb() used in ip6_rcv_core() with kfree_skb_reason(). No new drop reasons are added. Seems now we use 'SKB_DROP_REASON_IP_INHDR' for too many case during ipv6 header parse or check, just like what 'IPSTATS_MIB_INHDRERRORS' do. Will it be too general and hard to know what happened? Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ipv6: add skb drop reasons to TLV parseMenglong Dong1-12/+24
Replace kfree_skb() used in TLV encoded option header parsing with kfree_skb_reason(). Following functions are involved: ip6_parse_tlv() ipv6_hop_ra() ipv6_hop_ioam() ipv6_hop_jumbo() ipv6_hop_calipso() ipv6_dest_hao() Most skb drops during this process are regarded as 'InHdrErrors', as 'IPSTATS_MIB_INHDRERRORS' is used when ip6_parse_tlv() fails, which make we use 'SKB_DROP_REASON_IP_INHDR' correspondingly. However, 'IP_INHDR' is a relatively general reason. Therefore, we can use other reasons with higher priority in some cases. For example, 'SKB_DROP_REASON_UNHANDLED_PROTO' is used for unknown TLV options. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ipv6: remove redundant statistics in ipv6_hop_jumbo()Menglong Dong1-8/+1
There are two call chains for ipv6_hop_jumbo(). The first one is: ipv6_destopt_rcv() -> ip6_parse_tlv() -> ipv6_hop_jumbo() On this call chain, the drop statistics will be done in ipv6_destopt_rcv() with 'IPSTATS_MIB_INHDRERRORS' if ipv6_hop_jumbo() returns false. The second call chain is: ip6_rcv_core() -> ipv6_parse_hopopts() -> ip6_parse_tlv() And the drop statistics will also be done in ip6_rcv_core() with 'IPSTATS_MIB_INHDRERRORS' if ipv6_hop_jumbo() returns false. Therefore, the statistics in ipv6_hop_jumbo() is redundant, which means the drop is counted twice. The statistics in ipv6_hop_jumbo() is almost the same as the outside, except the 'IPSTATS_MIB_INTRUNCATEDPKTS', which seems that we have to ignore it. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: icmp: introduce function icmpv6_param_prob_reason()Menglong Dong2-5/+13
In order to add the skb drop reasons support to icmpv6_param_prob(), introduce the function icmpv6_param_prob_reason() and make icmpv6_param_prob() an inline call to it. This new function will be used in the following patches. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ip: add skb drop reasons to ip forwardingMenglong Dong4-6/+20
Replace kfree_skb() which is used in ip6_forward() and ip_forward() with kfree_skb_reason(). The new drop reason 'SKB_DROP_REASON_PKT_TOO_BIG' is introduced for the case that the length of the packet exceeds MTU and can't fragment. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ipv6: add skb drop reasons to ip6_pkt_drop()Menglong Dong1-1/+5
Replace kfree_skb() used in ip6_pkt_drop() with kfree_skb_reason(). No new reason is added. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ipv4: add skb drop reasons to ip_error()Menglong Dong3-1/+13
Eventually, I find out the handler function for inputting route lookup fail: ip_error(). The drop reasons we used in ip_error() are almost corresponding to IPSTATS_MIB_*, and following new reasons are introduced: SKB_DROP_REASON_IP_INADDRERRORS SKB_DROP_REASON_IP_INNOROUTES Isn't the name SKB_DROP_REASON_IP_HOSTUNREACH and SKB_DROP_REASON_IP_NETUNREACH more accurate? To make them corresponding to IPSTATS_MIB_*, we keep their name still. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13skb: add some helpers for skb drop reasonsMenglong Dong1-0/+12
In order to simply the definition and assignment for 'enum skb_drop_reason', introduce some helpers. SKB_DR() is used to define a variable of type 'enum skb_drop_reason' with the 'SKB_DROP_REASON_NOT_SPECIFIED' initial value. SKB_DR_SET() is used to set the value of the variable. Seems it is a little useless? But it makes the code shorter. SKB_DR_OR() is used to set the value of the variable if it is not set yet, which means its value is SKB_DROP_REASON_NOT_SPECIFIED. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13Merge branch 'octeon_ep-driver'David S. Miller21-0/+5633
Veerasenareddy Burru says: ==================== Add octeon_ep driver This driver implements networking functionality of Marvell's Octeon PCI Endpoint NIC. This driver support following devices: * Network controller: Cavium, Inc. Device b200 V4 -> V5: - Fix warnings reported by clang. - Address comments from community reviews. V3 -> V4: - Fix warnings and errors reported by "make W=1 C=1". V2 -> V3: - Fix warnings and errors reported by kernel test robot: "Reported-by: kernel test robot <lkp@intel.com>" V1 -> V2: - Address review comments on original patch series. - Divide PATCH 1/4 from the original series into 4 patches in v2 patch series: PATCH 1/7 to PATCH 4/7. - Fix clang build errors. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13octeon_ep: add ethtool support for Octeon PCI Endpoint NICVeerasenareddy Burru3-1/+465
Add support for the following ethtool commands: ethtool -i|--driver devname ethtool devname ethtool -s devname [speed N] [autoneg on|off] [advertise N] ethtool -S|--statistics devname Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: Satananda Burla <sburla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13octeon_ep: add Tx/Rx processing and interrupt supportVeerasenareddy Burru3-0/+904
Add support to enable MSI-x and register interrupts. Add support to process Tx and Rx traffic. Includes processing Tx completions and Rx refill. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: Satananda Burla <sburla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13octeon_ep: add support for ndo opsVeerasenareddy Burru3-1/+139
Add support for ndo ops to set MAC address, change MTU, get stats. Add control path support to set MAC address, change MTU, get stats, set speed, get and set link mode. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: Satananda Burla <sburla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13octeon_ep: add Tx/Rx ring resource setup and cleanupVeerasenareddy Burru2-0/+441
Implement Tx/Rx ring resource allocation and cleanup. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: Satananda Burla <sburla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13octeon_ep: Add mailbox for control commandsVeerasenareddy Burru3-10/+294
Add mailbox between host and NIC to send control commands from host to NIC and receive responses and notifications from NIC to host driver, like link status update. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: Satananda Burla <sburla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13octeon_ep: add hardware configuration APIsVeerasenareddy Burru3-7/+505
Implement hardware resource init and shutdown helper APIs. This includes hardware Tx/Rx queue init/enable/disable/reset, non queue interrupt handler that decodes non-queue interrupt type. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: Satananda Burla <sburla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13octeon_ep: Add driver framework and device initializationVeerasenareddy Burru20-0/+2904
Add driver framework and device setup and initialization for Octeon PCI Endpoint NIC. Add implementation to load module, initilaize, register network device, cleanup and unload module. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: Satananda Burla <sburla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13Merge branch 'br-flush-filtering'David S. Miller10-35/+269
Nikolay Aleksandrov says: ==================== net: bridge: add flush filtering support This patch-set adds support to specify filtering conditions for a bulk delete (flush) operation. This version uses a new nlmsghdr delete flag called NLM_F_BULK in combination with a new ndo_fdb_del_bulk op which is used to signal that the driver supports bulk deletes (that avoids pushing common mac address checks to ndo_fdb_del implementations and also has a different prototype and parsed attribute expectations, more info in patch 03). The new delete flag can be used for any RTM_DEL* type, implementations just need to be careful with older kernels which are doing non-strict attribute parses. A new rtnl flag (RTNL_FLAG_BULK_DEL_SUPPORTED) is used to show that the delete supports NLM_F_BULK. A proper error is returned if bulk delete is not supported. For old kernels I use the fact that mac address attribute (lladdr) is mandatory in the classic fdb del case, but it's not allowed if bulk deleting so older kernels will error out. Patch 01 and 02 are minor rtnetlink cleanups to make the code easier to read. They remove hardcoded values and use names instead. Patch 03 uses BIT() for rtnl flags. Patch 04 adds the new NLM_F_BULK delete request modifier, patch 05 adds the new bulk delete flag and checks for it if the delete requests have NLM_F_BULK set, it also warns if rtnl register is called with a non-delete kind and the bulk delete flag is set. Patch 06 adds the new ndo_fdb_del_bulk call. Patch 07 adds NLM_F_BULK support to rtnl_fdb_del, on such request strict parsing is used only for the supported attributes, and if the ndo is implemented it's called, the NTF_SELF/MASTER rules are the same as for the standard rtnl_fdb_del. Patch 08 implements bridge-specific minimal ndo_fdb_del_bulk call which uses the current br_fdb_flush to delete all entries. Patch 09 adds filtering support to the new bridge flush op which supports target ifindex (port or bridge), vlan id and flags/state mask. Patch 10 adds ndm state and flags mask attributes which will be used for filtering. Patch 11 converts ndm state/flags and their masks to bridge-private flags and fills them in the filter descriptor for matching. Finally patch 12 fills in the target ifindex (after validating it) and vlan id (already validated by rtnl_fdb_flush) for matching. Flush filtering is needed because user-space applications need a quick way to delete only a specific set of entries, e.g. mlag implementations need a way to flush only dynamic entries excluding externally learned ones or only externally learned ones without static entries etc. Also apps usually want to target only a specific vlan or port/vlan combination. The current 2 flush operations (per port and bridge-wide) are not extensible and cannot provide such filtering. I decided against embedding new attrs into the old flush attributes for multiple reasons - proper error handling on unsupported attributes, older kernels silently flushing all, need for a second mechanism to signal that the attribute should be parsed (e.g. using boolopts), special treatment for permanent entries. Examples: $ bridge fdb flush dev bridge vlan 100 static < flush all static entries on vlan 100 > $ bridge fdb flush dev bridge vlan 1 dynamic < flush all dynamic entries on vlan 1 > $ bridge fdb flush dev bridge port ens16 vlan 1 dynamic < flush all dynamic entries on port ens16 and vlan 1 > $ bridge fdb flush dev ens16 vlan 1 dynamic master < as above: flush all dynamic entries on port ens16 and vlan 1 > $ bridge fdb flush dev bridge nooffloaded nopermanent self < flush all non-offloaded and non-permanent entries > $ bridge fdb flush dev bridge static noextern_learn < flush all static entries which are not externally learned > $ bridge fdb flush dev bridge permanent < flush all permanent entries > $ bridge fdb flush dev bridge port bridge permanent < flush all permanent entries pointing to the bridge itself > Example of a flush call with unsupported netlink attribute (NDA_DST): $ bridge fdb flush dev bridge vlan 100 dynamic dst Error: Unsupported attribute. Example of a flush call on an older kernel: $ bridge fdb flush dev bridge dynamic Error: invalid address. Example of calling PF_UNSPEC RTM_DELNEIGH which doesn't support bulk delete with NLM_F_BULK set (ip neigh is changed to add the flag): $ ip n del 192.168.122.5 lladdr 00:11:22:33:44:55 dev ens3 Error: Bulk delete is not supported. Note that all flags have their negated version (static vs nostatic etc) and there are some tricky cases to handle like "static" which in flag terms means fdbs that have NUD_NOARP but *not* NUD_PERMANENT, so the mask matches on both but we need only NUD_NOARP to be set. That's because permanent entries have both set so we can't just match on NUD_NOARP. Also note that this flush operation doesn't treat permanent entries in a special way (fdb_delete vs fdb_delete_local), it will delete them regardless if any port is using them. We can extend the api with a flag to do that if needed in the future. Patch-sets (in order): - Initial bulk del infra and fdb flush filtering (this set) - iproute2 support - selftests v4: Add and check for rtnl del bulk supported flag when using NLM_F_BULK (new patch 05), patches 01 - 03 are also new minor cleanups to remove use of raw values and make code easier to read, don't rename br_fdb_flush in patch 08, set port ifindex as flush target if NDA_IFINDEX is missing and flush was called with port netdev and NTF_MASTER (patch 12). v3: Add NLM_F_BULK delete modifier and ndo_fdb_del_bulk callback, patches 01 - 03 and 06 are new. Patch 04 is changed to implement bulk_del instead of flush, patches 05, 07 and 08 are adjusted to use NDA_ attributes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: bridge: fdb: add support for flush filtering based on ifindex and vlanNikolay Aleksandrov1-1/+44
Add support for fdb flush filtering based on destination ifindex and vlan id. The ifindex must either match a port's device ifindex or the bridge's. The vlan support is trivial since it's already validated by rtnl_fdb_del, we just need to fill it in. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: bridge: fdb: add support for flush filtering based on ndm flags and stateNikolay Aleksandrov2-3/+60
Add support for fdb flush filtering based on ndm flags and state. NDM state and flags are mapped to bridge-specific flags and matched according to the specified masks. NTF_USE is used to represent added_by_user flag since it sets it on fdb add and we don't have a 1:1 mapping for it. Only allowed bits can be set, NTF_SELF and NTF_MASTER are ignored. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: add ndm flags and state mask attributesNikolay Aleksandrov2-0/+4
Add ndm flags/state masks which will be used for bulk delete filtering. All of these are used by the bridge and vxlan drivers. Also minimal attr policy validation is added, it is up to ndo_fdb_del_bulk implementers to further validate them. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: bridge: fdb: add support for fine-grained flushingNikolay Aleksandrov4-12/+54
Add the ability to specify exactly which fdbs to be flushed. They are described by a new structure - net_bridge_fdb_flush_desc. Currently it can match on port/bridge ifindex, vlan id and fdb flags. It is used to describe the existing dynamic fdb flush operation. Note that this flush operation doesn't treat permanent entries in a special way (fdb_delete vs fdb_delete_local), it will delete them regardless if any port is using them, so currently it can't directly replace deletes which need to handle that case, although we can extend it later for that too. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: bridge: fdb: add ndo_fdb_del_bulkNikolay Aleksandrov3-0/+27
Add a minimal ndo_fdb_del_bulk implementation which flushes all entries. Support for more fine-grained filtering will be added in the following patches. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: add NLM_F_BULK support to rtnl_fdb_delNikolay Aleksandrov1-19/+48
When NLM_F_BULK is specified in a fdb del message we need to handle it differently. First since this is a new call we can strictly validate the passed attributes, at first only ifindex and vlan are allowed as these will be the initially supported filter attributes, any other attribute is rejected. The mac address is no longer mandatory, but we use it to error out in older kernels because it cannot be specified with bulk request (the attribute is not allowed) and then we have to dispatch the call to ndo_fdb_del_bulk if the device supports it. The del bulk callback can do further validation of the attributes if necessary. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: add ndo_fdb_del_bulkNikolay Aleksandrov1-0/+9
Add a new netdev op called ndo_fdb_del_bulk, it will be later used for driver-specific bulk delete implementation dispatched from rtnetlink. The first user will be the bridge, we need it to signal to rtnetlink from the driver that we support bulk delete operation (NLM_F_BULK). Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: add bulk delete support flagNikolay Aleksandrov2-1/+10
Add a new rtnl flag (RTNL_FLAG_BULK_DEL_SUPPORTED) which is used to verify that the delete operation allows bulk object deletion. Also emit a warning if anyone tries to set it for non-delete kind. Suggested-by: David Ahern <dsahern@kernel.org> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: netlink: add NLM_F_BULK delete request modifierNikolay Aleksandrov1-0/+1
Add a new delete request modifier called NLM_F_BULK which, when supported, would cause the request to delete multiple objects. The flag is a convenient way to signal that a multiple delete operation is requested which can be gradually added to different delete requests. In order to make sure older kernels will error out if the operation is not supported instead of doing something unintended we have to break a required condition when implementing support for this flag, f.e. for neighbors we will omit the mandatory mac address attribute. Initially it will be used to add flush with filtering support for bridge fdbs, but it also opens the door to add similar support to others. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: use BIT for flag valuesNikolay Aleksandrov1-1/+1
Use BIT to define flag values. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: add helper to extract msg type's kindNikolay Aleksandrov2-1/+7
Add a helper which extracts the msg type's kind using the kind mask (0x3). Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: add msg kind namesNikolay Aleksandrov2-3/+10
Add rtnl kind names instead of using raw values. We'll need to check for DEL kind later to validate bulk flag support. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13Merge branch 'net-ti-storm-prevention-support'David S. Miller7-1/+472
Grygorii Strashko says: ==================== net: ethernet: ti: enable bc/mc storm prevention support This series first adds supports for the ALE feature to rate limit number ingress broadcast(BC)/multicast(MC) packets per/sec which main purpose is BC/MC storm prevention. And then enables corresponding support for ingress broadcast(BC)/multicast(MC) packets rate limiting for TI CPSW switchdev and AM65x/J221E CPSW_NUSS drivers by implementing HW offload for simple tc-flower with policer action with matches on dst_mac/mask: - ff:ff:ff:ff:ff:ff/ff:ff:ff:ff:ff:ff has to be used for BC packets rate limiting (exact match) - 01:00:00:00:00:00/01:00:00:00:00:00 fixed value has to be used for MC packets rate limiting The CPSW supports MC/BC packets rate limiting in packets/sec and affects all ingress MC/BC packets and serves as BC/MC storm prevention feature. Examples: - BC rate limit to 1000pps: tc qdisc add dev eth0 clsact tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \ action police pkts_rate 1000 pkts_burst 1 drop - MC rate limit to 20000pps: tc qdisc add dev eth0 clsact tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \ action police rate pkts_rate 20000 pkts_burst 1 drop pkts_burst - not used. The solution inspired patch from Vladimir Oltean [1]. Changes in v3: - comments applied - policer validation added Changes in v2: - switch to packet-per-second policing introduced by commit 2ffe0395288a ("net/sched: act_police: add support for packet-per-second policing") [2] v2: https://patchwork.kernel.org/project/netdevbpf/cover/20211101170122.19160-1-grygorii.strashko@ti.com/ v1: https://patchwork.kernel.org/project/netdevbpf/cover/20201114035654.32658-1-grygorii.strashko@ti.com/ [1] https://lore.kernel.org/patchwork/patch/1217254/ [2] https://patchwork.kernel.org/project/netdevbpf/cover/20210312140831.23346-1-simon.horman@netronome.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ethernet: ti: cpsw_new: enable bc/mc storm prevention supportGrygorii Strashko3-1/+216
This patch enables support for ingress broadcast(BC)/multicast(MC) packets rate limiting in TI CPSW switchdev driver (the corresponding ALE support was added in previous patch) by implementing HW offload for simple tc-flower with policer action with matches on dst_mac: - ff:ff:ff:ff:ff:ff/ff:ff:ff:ff:ff:ff has to be used for BC packets rate limiting (exact match) - 01:00:00:00:00:00/01:00:00:00:00:00 fixed value has to be used for MC packets rate limiting The CPSW supports MC/BC packets rate limiting in packets/sec and affects all ingress MC/BC packets and serves as BC/MC storm prevention feature. Examples: - BC rate limit to 1000pps: tc qdisc add dev eth0 clsact tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \ action police pkts_rate 1000 pkts_burst 1 drop - MC rate limit to 20000pps: tc qdisc add dev eth0 clsact tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \ action police rate pkts_rate 10000 pkts_burst 1 drop pkts_burst - not used. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ethernet: ti: am65-cpsw: enable bc/mc storm prevention supportGrygorii Strashko2-0/+188
This patch enables support for ingress broadcast(BC)/multicast(MC) packets rate limiting in TI AM65x CPSW driver (the corresponding ALE support was added in previous patch) by implementing HW offload for simple tc-flower with policer action with matches on dst_mac/mask: - ff:ff:ff:ff:ff:ff/ff:ff:ff:ff:ff:ff has to be used for BC packets rate limiting (exact match) - 01:00:00:00:00:00/01:00:00:00:00:00 fixed value has to be used for MC packets rate limiting The CPSW supports MC/BC packets rate limiting in packets/sec and affects all ingress MC/BC packets and serves as BC/MC storm prevention feature. Examples: - BC rate limit to 1000pps: tc qdisc add dev eth0 clsact tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \ action police pkts_rate 1000 pkts_burst 1 drop - MC rate limit to 20000pps: tc qdisc add dev eth0 clsact tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \ action police rate pkts_rate 20000 pkts_burst 1 drop pkts_burst - not used. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13drivers: net: cpsw: ale: add broadcast/multicast rate limit supportGrygorii Strashko2-0/+68
The CPSW ALE supports feature to rate limit number ingress broadcast(BC)/multicast(MC) packets per/sec which main purpose is BC/MC storm prevention. The ALE BC/MC packet rate limit configuration consist of two parts: - global ALE_CONTROL.ENABLE_RATE_LIMIT bit 0 which enables rate limiting globally ALE_PRESCALE.PRESCALE specifies rate limiting interval - per-port ALE_PORTCTLx.BCASTMCAST/_LIMIT specifies number of BC/MC packets allowed per rate limiting interval. When port.BCASTMCAST/_LIMIT is 0 rate limiting is disabled for Port. When BC/MC packet rate limiting is enabled the number of allowed packets per/sec is defined as: number_of_packets/sec = (Fclk / ALE_PRESCALE) * port.BCASTMCAST/_LIMIT Hence, the ALE_PRESCALE configuration is common for all ports the 1ms interval is selected and configured during ALE initialization while port.BCAST/MCAST_LIMIT are configured per-port. This allows to achieve: - min number_of_packets = 1000 when port.BCAST/MCAST_LIMIT = 1 - max number_of_packets = 1000 * 255 = 255000 when port.BCAST/MCAST_LIMIT = 0xFF The ALE_CONTROL.ENABLE_RATE_LIMIT can also be enabled once during ALE initialization as rate limiting enabled by non zero port.BCASTMCAST/_LIMIT values. This patch implements above logic in ALE and adds new ALE APIs cpsw_ale_rx_ratelimit_bc(); cpsw_ale_rx_ratelimit_mc(); Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: phylink: remove phylink_helper_basex_speed()Russell King (Oracle)2-34/+0
As there are now no users of phylink_helper_basex_speed(), we can remove this obsolete functionality. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ethernet: mtk_eth_soc: use after free in __mtk_ppe_check_skb()Dan Carpenter1-1/+2
The __mtk_foe_entry_clear() function frees "entry" so we have to use the _safe() version of hlist_for_each_entry() to prevent a use after free. Fixes: 33fc42de3327 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ethernet: ti: am65-cpsw-nuss: using pm_runtime_resume_and_get instead ↵Minghao Chi1-22/+11
of pm_runtime_get_sync Using pm_runtime_resume_and_get is more appropriate for simplifing code Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13NFC: NULL out the dev->rfkill to prevent UAFLin Ma1-0/+1
Commit 3e3b5dfcd16a ("NFC: reorder the logic in nfc_{un,}register_device") assumes the device_is_registered() in function nfc_dev_up() will help to check when the rfkill is unregistered. However, this check only take effect when device_del(&dev->dev) is done in nfc_unregister_device(). Hence, the rfkill object is still possible be dereferenced. The crash trace in latest kernel (5.18-rc2): [ 68.760105] ================================================================== [ 68.760330] BUG: KASAN: use-after-free in __lock_acquire+0x3ec1/0x6750 [ 68.760756] Read of size 8 at addr ffff888009c93018 by task fuzz/313 [ 68.760756] [ 68.760756] CPU: 0 PID: 313 Comm: fuzz Not tainted 5.18.0-rc2 #4 [ 68.760756] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 68.760756] Call Trace: [ 68.760756] <TASK> [ 68.760756] dump_stack_lvl+0x57/0x7d [ 68.760756] print_report.cold+0x5e/0x5db [ 68.760756] ? __lock_acquire+0x3ec1/0x6750 [ 68.760756] kasan_report+0xbe/0x1c0 [ 68.760756] ? __lock_acquire+0x3ec1/0x6750 [ 68.760756] __lock_acquire+0x3ec1/0x6750 [ 68.760756] ? lockdep_hardirqs_on_prepare+0x410/0x410 [ 68.760756] ? register_lock_class+0x18d0/0x18d0 [ 68.760756] lock_acquire+0x1ac/0x4f0 [ 68.760756] ? rfkill_blocked+0xe/0x60 [ 68.760756] ? lockdep_hardirqs_on_prepare+0x410/0x410 [ 68.760756] ? mutex_lock_io_nested+0x12c0/0x12c0 [ 68.760756] ? nla_get_range_signed+0x540/0x540 [ 68.760756] ? _raw_spin_lock_irqsave+0x4e/0x50 [ 68.760756] _raw_spin_lock_irqsave+0x39/0x50 [ 68.760756] ? rfkill_blocked+0xe/0x60 [ 68.760756] rfkill_blocked+0xe/0x60 [ 68.760756] nfc_dev_up+0x84/0x260 [ 68.760756] nfc_genl_dev_up+0x90/0xe0 [ 68.760756] genl_family_rcv_msg_doit+0x1f4/0x2f0 [ 68.760756] ? genl_family_rcv_msg_attrs_parse.constprop.0+0x230/0x230 [ 68.760756] ? security_capable+0x51/0x90 [ 68.760756] genl_rcv_msg+0x280/0x500 [ 68.760756] ? genl_get_cmd+0x3c0/0x3c0 [ 68.760756] ? lock_acquire+0x1ac/0x4f0 [ 68.760756] ? nfc_genl_dev_down+0xe0/0xe0 [ 68.760756] ? lockdep_hardirqs_on_prepare+0x410/0x410 [ 68.760756] netlink_rcv_skb+0x11b/0x340 [ 68.760756] ? genl_get_cmd+0x3c0/0x3c0 [ 68.760756] ? netlink_ack+0x9c0/0x9c0 [ 68.760756] ? netlink_deliver_tap+0x136/0xb00 [ 68.760756] genl_rcv+0x1f/0x30 [ 68.760756] netlink_unicast+0x430/0x710 [ 68.760756] ? memset+0x20/0x40 [ 68.760756] ? netlink_attachskb+0x740/0x740 [ 68.760756] ? __build_skb_around+0x1f4/0x2a0 [ 68.760756] netlink_sendmsg+0x75d/0xc00 [ 68.760756] ? netlink_unicast+0x710/0x710 [ 68.760756] ? netlink_unicast+0x710/0x710 [ 68.760756] sock_sendmsg+0xdf/0x110 [ 68.760756] __sys_sendto+0x19e/0x270 [ 68.760756] ? __ia32_sys_getpeername+0xa0/0xa0 [ 68.760756] ? fd_install+0x178/0x4c0 [ 68.760756] ? fd_install+0x195/0x4c0 [ 68.760756] ? kernel_fpu_begin_mask+0x1c0/0x1c0 [ 68.760756] __x64_sys_sendto+0xd8/0x1b0 [ 68.760756] ? lockdep_hardirqs_on+0xbf/0x130 [ 68.760756] ? syscall_enter_from_user_mode+0x1d/0x50 [ 68.760756] do_syscall_64+0x3b/0x90 [ 68.760756] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 68.760756] RIP: 0033:0x7f67fb50e6b3 ... [ 68.760756] RSP: 002b:00007f67fa91fe90 EFLAGS: 00000293 ORIG_RAX: 000000000000002c [ 68.760756] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f67fb50e6b3 [ 68.760756] RDX: 000000000000001c RSI: 0000559354603090 RDI: 0000000000000003 [ 68.760756] RBP: 00007f67fa91ff00 R08: 00007f67fa91fedc R09: 000000000000000c [ 68.760756] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffe824d496e [ 68.760756] R13: 00007ffe824d496f R14: 00007f67fa120000 R15: 0000000000000003 [ 68.760756] </TASK> [ 68.760756] [ 68.760756] Allocated by task 279: [ 68.760756] kasan_save_stack+0x1e/0x40 [ 68.760756] __kasan_kmalloc+0x81/0xa0 [ 68.760756] rfkill_alloc+0x7f/0x280 [ 68.760756] nfc_register_device+0xa3/0x1a0 [ 68.760756] nci_register_device+0x77a/0xad0 [ 68.760756] nfcmrvl_nci_register_dev+0x20b/0x2c0 [ 68.760756] nfcmrvl_nci_uart_open+0xf2/0x1dd [ 68.760756] nci_uart_tty_ioctl+0x2c3/0x4a0 [ 68.760756] tty_ioctl+0x764/0x1310 [ 68.760756] __x64_sys_ioctl+0x122/0x190 [ 68.760756] do_syscall_64+0x3b/0x90 [ 68.760756] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 68.760756] [ 68.760756] Freed by task 314: [ 68.760756] kasan_save_stack+0x1e/0x40 [ 68.760756] kasan_set_track+0x21/0x30 [ 68.760756] kasan_set_free_info+0x20/0x30 [ 68.760756] __kasan_slab_free+0x108/0x170 [ 68.760756] kfree+0xb0/0x330 [ 68.760756] device_release+0x96/0x200 [ 68.760756] kobject_put+0xf9/0x1d0 [ 68.760756] nfc_unregister_device+0x77/0x190 [ 68.760756] nfcmrvl_nci_unregister_dev+0x88/0xd0 [ 68.760756] nci_uart_tty_close+0xdf/0x180 [ 68.760756] tty_ldisc_kill+0x73/0x110 [ 68.760756] tty_ldisc_hangup+0x281/0x5b0 [ 68.760756] __tty_hangup.part.0+0x431/0x890 [ 68.760756] tty_release+0x3a8/0xc80 [ 68.760756] __fput+0x1f0/0x8c0 [ 68.760756] task_work_run+0xc9/0x170 [ 68.760756] exit_to_user_mode_prepare+0x194/0x1a0 [ 68.760756] syscall_exit_to_user_mode+0x19/0x50 [ 68.760756] do_syscall_64+0x48/0x90 [ 68.760756] entry_SYSCALL_64_after_hwframe+0x44/0xae This patch just add the null out of dev->rfkill to make sure such dereference cannot happen. This is safe since the device_lock() already protect the check/write from data race. Fixes: 3e3b5dfcd16a ("NFC: reorder the logic in nfc_{un,}register_device") Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13ipv6: exthdrs: use swap() instead of open coding itGuo Zhengkui1-4/+1
Address the following coccicheck warning: net/ipv6/exthdrs.c:620:44-45: WARNING opportunity for swap() by using swap() for the swapping of variable values and drop the tmp (`addr`) variable that is not needed any more. Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13selftests: net: fib_rule_tests: add support to select a test to runAlaa Mohamed1-1/+11
Add boilerplate test loop in test to run all tests in fib_rule_tests.sh Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ethernet: mtk_eth_soc: use standard property for cci-control-portLorenzo Bianconi2-2/+2
Rely on standard cci-control-port property to identify CCI port reference. Update mt7622 dts binding. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13Merge branch '40GbE' of ↵David S. Miller10-30/+88
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2022-04-12 This series contains updates to i40e and ice drivers. Joe Damato adds TSO support for MPLS packets on i40e and ice drivers. He also adds tracking and reporting of tx_stopped statistic for i40e. Nabil S. Alramli adds reporting of tx_restart to ethtool for i40e. Mateusz adds new device id support for i40e. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13Merge branch 'tls-rx-refactor-part-3'David S. Miller1-71/+60
Jakub Kicinski says: ==================== tls: rx: random refactoring part 3 TLS Rx refactoring. Part 3 of 3. This set is mostly around rx_list and async processing. The last two patches are minor optimizations. A couple of features to follow. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13tls: rx: only copy IV from the packet for TLS 1.2Jakub Kicinski1-10/+10
TLS 1.3 and ChaChaPoly don't carry IV in the packet. The code before this change would copy out iv_size worth of whatever followed the TLS header in the packet and then for TLS 1.3 | ChaCha overwrite that with the sequence number. Waste of cycles especially with TLS 1.2 being close to dead and TLS 1.3 being the common case. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13tls: rx: use MAX_IV_SIZE for allocationsJakub Kicinski1-1/+1
IVs are 8 or 16 bytes, no point reading out the exact value for quantities this small. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13tls: rx: use async as an in-out argumentJakub Kicinski1-15/+16
Propagating EINPROGRESS thru multiple layers of functions is error prone. Use darg->async as an in/out argument, like we use darg->zc today. On input it tells the code if async is allowed, on output if it took place. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>