summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2022-11-03net: devlink: track netdev with devlink_port assignedJiri Pirko3-9/+99
Currently, ethernet drivers are using devlink_port_type_eth_set() and devlink_port_type_clear() to set devlink port type and link to related netdev. Instead of calling them directly, let the driver use SET_NETDEV_DEVLINK_PORT macro to assign devlink_port pointer and let devlink to track it. Note the devlink port pointer is static during the time netdevice is registered. In devlink code, use per-namespace netdev notifier to track the netdevices with devlink_port assigned and change the internal devlink_port type and related type pointer accordingly. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03net: devlink: take RTNL in port_fill() function only if it is not heldJiri Pirko1-15/+31
Follow-up patch is going to introduce a netdevice notifier event processing which is called with RTNL mutex held. Processing of this will eventually lead to call to port_notity() and port_fill() which currently takes RTNL mutex internally. So as a temporary solution, propagate a bool indicating if the mutex is already held. This will go away in one of the follow-up patches. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03net: devlink: move port_type_netdev_checks() call to __devlink_port_type_set()Jiri Pirko1-30/+33
As __devlink_port_type_set() is going to be called directly from netdevice notifier event handle in one of the follow-up patches, move the port_type_netdev_checks() call there. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03net: devlink: move port_type_warn_schedule() call to __devlink_port_type_set()Jiri Pirko1-2/+5
As __devlink_port_type_set() is going to be called directly from netdevice notifier event handle in one of the follow-up patches, move the port_type_warn_schedule() call there. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03net: devlink: convert devlink port type-specific pointers to unionJiri Pirko2-7/+23
Instead of storing type_dev as a void pointer, convert it to union and use it to store either struct net_device or struct ib_device pointer. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03Merge branch 'bridge-add-mac-authentication-bypass-mab-support'Jakub Kicinski10-6/+241
Ido Schimmel says: ==================== bridge: Add MAC Authentication Bypass (MAB) support Patch #1 adds MAB support in the bridge driver. See the commit message for motivation, design choices and implementation details. Patch #2 adds corresponding test cases. Follow-up patchsets will add offload support in mlxsw and mv88e6xxx. ==================== Link: https://lore.kernel.org/r/20221101193922.2125323-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03selftests: forwarding: Add MAC Authentication Bypass (MAB) test casesHans J. Schultz2-1/+162
Add four test cases to verify MAB functionality: * Verify that a locked FDB entry can be generated by the bridge, preventing a host from communicating via the bridge. Test that user space can clear the "locked" flag by replacing the entry, thereby authenticating the host and allowing it to communicate via the bridge. * Test that an entry cannot roam to a locked port, but that it can roam to an unlocked port. * Test that MAB can only be enabled on a port that is both locked and has learning enabled. * Test that locked FDB entries are flushed from a port when MAB is disabled. Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03bridge: Add MAC Authentication Bypass (MAB) supportHans J. Schultz8-5/+79
Hosts that support 802.1X authentication are able to authenticate themselves by exchanging EAPOL frames with an authenticator (Ethernet bridge, in this case) and an authentication server. Access to the network is only granted by the authenticator to successfully authenticated hosts. The above is implemented in the bridge using the "locked" bridge port option. When enabled, link-local frames (e.g., EAPOL) can be locally received by the bridge, but all other frames are dropped unless the host is authenticated. That is, unless the user space control plane installed an FDB entry according to which the source address of the frame is located behind the locked ingress port. The entry can be dynamic, in which case learning needs to be enabled so that the entry will be refreshed by incoming traffic. There are deployments in which not all the devices connected to the authenticator (the bridge) support 802.1X. Such devices can include printers and cameras. One option to support such deployments is to unlock the bridge ports connecting these devices, but a slightly more secure option is to use MAB. When MAB is enabled, the MAC address of the connected device is used as the user name and password for the authentication. For MAB to work, the user space control plane needs to be notified about MAC addresses that are trying to gain access so that they will be compared against an allow list. This can be implemented via the regular learning process with the sole difference that learned FDB entries are installed with a new "locked" flag indicating that the entry cannot be used to authenticate the device. The flag cannot be set by user space, but user space can clear the flag by replacing the entry, thereby authenticating the device. Locked FDB entries implement the following semantics with regards to roaming, aging and forwarding: 1. Roaming: Locked FDB entries can roam to unlocked (authorized) ports, in which case the "locked" flag is cleared. FDB entries cannot roam to locked ports regardless of MAB being enabled or not. Therefore, locked FDB entries are only created if an FDB entry with the given {MAC, VID} does not already exist. This behavior prevents unauthenticated devices from disrupting traffic destined to already authenticated devices. 2. Aging: Locked FDB entries age and refresh by incoming traffic like regular entries. 3. Forwarding: Locked FDB entries forward traffic like regular entries. If user space detects an unauthorized MAC behind a locked port and wishes to prevent traffic with this MAC DA from reaching the host, it can do so using tc or a different mechanism. Enable the above behavior using a new bridge port option called "mab". It can only be enabled on a bridge port that is both locked and has learning enabled. Locked FDB entries are flushed from the port once MAB is disabled. A new option is added because there are pure 802.1X deployments that are not interested in notifications about locked FDB entries. Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski369-1627/+3233
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03Merge tag 'net-6.1-rc4' of ↵Linus Torvalds50-190/+401
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth and netfilter. Current release - regressions: - net: several zerocopy flags fixes - netfilter: fix possible memory leak in nf_nat_init() - openvswitch: add missing .resv_start_op Previous releases - regressions: - neigh: fix null-ptr-deref in neigh_table_clear() - sched: fix use after free in red_enqueue() - dsa: fall back to default tagger if we can't load the one from DT - bluetooth: fix use-after-free in l2cap_conn_del() Previous releases - always broken: - netfilter: netlink notifier might race to release objects - nfc: fix potential memory leak of skb - bluetooth: fix use-after-free caused by l2cap_reassemble_sdu - bluetooth: use skb_put to set length - eth: tun: fix bugs for oversize packet when napi frags enabled - eth: lan966x: fixes for when MTU is changed - eth: dwmac-loongson: fix invalid mdio_node" * tag 'net-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits) vsock: fix possible infinite sleep in vsock_connectible_wait_data() vsock: remove the unused 'wait' in vsock_connectible_recvmsg() ipv6: fix WARNING in ip6_route_net_exit_late() bridge: Fix flushing of dynamic FDB entries net, neigh: Fix null-ptr-deref in neigh_table_clear() net/smc: Fix possible leaked pernet namespace in smc_init() stmmac: dwmac-loongson: fix invalid mdio_node ibmvnic: Free rwi on reset success net: mdio: fix undefined behavior in bit shift for __mdiobus_register Bluetooth: L2CAP: Fix attempting to access uninitialized memory Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm Bluetooth: L2CAP: Fix accepting connection request for invalid SPSM Bluetooth: hci_conn: Fix not restoring ISO buffer count on disconnect Bluetooth: L2CAP: Fix memory leak in vhci_write Bluetooth: L2CAP: fix use-after-free in l2cap_conn_del() Bluetooth: virtio_bt: Use skb_put to set length Bluetooth: hci_conn: Fix CIS connection dst_type handling Bluetooth: L2CAP: Fix use-after-free caused by l2cap_reassemble_sdu netfilter: ipset: enforce documented limit to prevent allocating huge memory isdn: mISDN: netjet: fix wrong check of device registration ...
2022-11-03Merge tag 'powerpc-6.1-4' of ↵Linus Torvalds5-5/+27
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix an endian thinko in the asm-generic compat_arg_u64() which led to syscall arguments being swapped for some compat syscalls. - Fix syscall wrapper handling of syscalls with 64-bit arguments on 32-bit kernels, which led to syscall arguments being misplaced. - A build fix for amdgpu on Book3E with AltiVec disabled. Thanks to Andreas Schwab, Christian Zigotzky, and Arnd Bergmann. * tag 'powerpc-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/32: Select ARCH_SPLIT_ARG64 powerpc/32: fix syscall wrappers with 64-bit arguments asm-generic: compat: fix compat_arg_u64() and compat_arg_u64_dual() powerpc/64e: Fix amdgpu build on Book3E w/o AltiVec
2022-11-03Merge branch 'add-new-pcp-and-apptrust-attributes-to-dcbnl'Paolo Abeni11-8/+744
Daniel Machon says: ==================== Add new PCP and APPTRUST attributes to dcbnl This patch series adds new extension attributes to dcbnl, to support PCP prioritization (and thereby hw offloadable pcp-based queue classification) and per-selector trust and trust order. Additionally, the microchip sparx5 driver has been dcb-enabled to make use of the new attributes to offload PCP, DSCP and Default prio to the switch, and implement trust order of selectors. For pre-RFC discussion see: https://lore.kernel.org/netdev/Yv9VO1DYAxNduw6A@DEN-LT-70577/ For RFC series see: https://lore.kernel.org/netdev/20220915095757.2861822-1-daniel.machon@microchip.com/ In summary: there currently exist no convenient way to offload per-port PCP-based queue classification to hardware. The DCB subsystem offers different ways to prioritize through its APP table, but lacks an option for PCP. Similarly, there is no way to indicate the notion of trust for APP table selectors. This patch series addresses both topics. PCP based queue classification: - 8021Q standardizes the Priority Code Point table (see 6.9.3 of IEEE Std 802.1Q-2018). This patch series makes it possible, to offload the PCP classification to said table. The new PCP selector is not a standard part of the APP managed object, therefore it is encapsulated in a new non-std extension attribute. Selector trust: - ASIC's often has the notion of trust DSCP and trust PCP. The new attribute makes it possible to specify a trust order of app selectors, which drivers can then react on. DCB-enable sparx5 driver: - Now supports offloading of DSCP, PCP and default priority. Only one mapping of protocol:priority is allowed. Consecutive mappings of the same protocol to some new priority, will overwrite the previous. This is to keep a consistent view of the app table and the hardware. - Now supports dscp and pcp trust, by use of the introduced dcbnl_set/getapptrust ops. Sparx5 supports trust orders: [], [dscp], [pcp] and [dscp, pcp]. For now, only DSCP and PCP selectors are supported by the driver, everything else is bounced. Patch #1 introduces a new PCP selector to the APP object, which makes it possible to encode PCP and DEI in the app triplet and offload it to the PCP table of the ASIC. Patch #2 Introduces the new extension attributes DCB_ATTR_DCB_APP_TRUST_TABLE and DCB_ATTR_DCB_APP_TRUST. Trusted selectors are passed in the nested DCB_ATTR_DCB_APP_TRUST_TABLE attribute, and assembled into an array of selectors: u8 selectors[256]; where lower indexes has higher precedence. In the array, selectors are stored consecutively, starting from index zero. With a maximum number of 256 unique selectors, the list has the same maximum size. Patch #3 Sets up the dcbnl ops hook, and adds support for offloading pcp app entries, to the PCP table of the switch. Patch #4 Makes use of the dcbnl_set/getapptrust ops, to set a per-port trust order. Patch #5 Adds support for offloading dscp app entries to the DSCP table of the switch. Patch #6 Adds support for offloading default prio app entries to the switch. ==================== Link: https://lore.kernel.org/r/20221101094834.2726202-1-daniel.machon@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03net: microchip: sparx5: add support for offloading default prioDaniel Machon3-0/+40
Add support for offloading default prio {ETHERTYPE, 0, prio}. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03net: microchip: sparx5: add support for offloading dscp tableDaniel Machon3-3/+116
Add support for offloading dscp app entries. Dscp values are global for all ports on the sparx5 switch. Therefore, we replicate each dscp app entry per-port. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03net: microchip: sparx5: add support for apptrustDaniel Machon3-2/+126
Make use of set/getapptrust() to implement per-selector trust and trust order. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03net: microchip: sparx5: add support for offloading pcp tableDaniel Machon8-2/+327
Add new registers and functions to support offload of pcp app entries. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03net: dcb: add new apptrust attributeDaniel Machon3-2/+80
Add new apptrust extension attributes to the 8021Qaz APP managed object. Two new attributes, DCB_ATTR_DCB_APP_TRUST_TABLE and DCB_ATTR_DCB_APP_TRUST, has been added. Trusted selectors are passed in the nested attribute DCB_ATTR_DCB_APP_TRUST, in order of precedence. The new attributes are meant to allow drivers, whose hw supports the notion of trust, to be able to set whether a particular app selector is trusted - and in which order. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03net: dcb: add new pcp selector to app objectDaniel Machon2-4/+60
Add new PCP selector for the 8021Qaz APP managed object. As the PCP selector is not part of the 8021Qaz standard, a new non-std extension attribute DCB_ATTR_DCB_APP has been introduced. Also two helper functions to translate between selector and app attribute type has been added. The new selector has been given a value of 255, to minimize the risk of future overlap of std- and non-std attributes. The new DCB_ATTR_DCB_APP is sent alongside the ieee std attribute in the app table. This means that the dcb_app struct can now both contain std- and non-std app attributes. Currently there is no overlap between the selector values of the two attributes. The purpose of adding the PCP selector, is to be able to offload PCP-based queue classification to the 8021Q Priority Code Point table, see 6.9.3 of IEEE Std 802.1Q-2018. PCP and DEI is encoded in the protocol field as 8*dei+pcp, so that a mapping of PCP 2 and DEI 1 to priority 3 is encoded as {255, 10, 3}. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03net: mana: Assign interrupts to CPUs based on NUMA nodesSaurabh Sengar2-3/+28
In large VMs with multiple NUMA nodes, network performance is usually best if network interrupts are all assigned to the same virtual NUMA node. This patch assigns online CPU according to a numa aware policy, local cpus are returned first, followed by non-local ones, then it wraps around. Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://lore.kernel.org/r/1667282761-11547-1-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03Merge branch 'vsock-remove-an-unused-variable-and-fix-infinite-sleep'Paolo Abeni1-3/+4
Dexuan Cui says: ==================== vsock: remove an unused variable and fix infinite sleep Patch 1 removes the unused 'wait' variable. Patch 2 fixes an infinite sleep issue reported by a hv_sock user. ==================== Link: https://lore.kernel.org/r/20221101021706.26152-1-decui@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03vsock: fix possible infinite sleep in vsock_connectible_wait_data()Dexuan Cui1-1/+4
Currently vsock_connectible_has_data() may miss a wakeup operation between vsock_connectible_has_data() == 0 and the prepare_to_wait(). Fix the race by adding the process to the wait queue before checking vsock_connectible_has_data(). Fixes: b3f7fd54881b ("af_vsock: separate wait data loop") Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reported-by: Frédéric Dalleau <frederic.dalleau@docker.com> Tested-by: Frédéric Dalleau <frederic.dalleau@docker.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03vsock: remove the unused 'wait' in vsock_connectible_recvmsg()Dexuan Cui1-2/+0
Remove the unused variable introduced by 19c1b90e1979. Fixes: 19c1b90e1979 ("af_vsock: separate receive data loop") Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03net: fec: add initial XDP supportShenwei Wang2-2/+226
This patch adds the initial XDP support to Freescale driver. It supports XDP_PASS, XDP_DROP and XDP_REDIRECT actions. Upcoming patches will add support for XDP_TX and Zero Copy features. As the patch is rather large, the part of codes to collect the statistics is separated and will prepare a dedicated patch for that part. I just tested with the application of xdpsock. -- Native here means running command of "xdpsock -i eth0" -- SKB-Mode means running command of "xdpsock -S -i eth0" The following are the testing result relating to XDP mode: root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 sock0@eth0:0 rxdrop xdp-drv pps pkts 1.00 rx 371347 2717794 tx 0 0 root@imx8qxpc0mek:~/bpf# ./xdpsock -S -i eth0 sock0@eth0:0 rxdrop xdp-skb pps pkts 1.00 rx 202229 404528 tx 0 0 root@imx8qxpc0mek:~/bpf# ./xdp2 eth0 proto 0: 496708 pkt/s proto 0: 505469 pkt/s proto 0: 505283 pkt/s proto 0: 505443 pkt/s proto 0: 505465 pkt/s root@imx8qxpc0mek:~/bpf# ./xdp2 -S eth0 proto 0: 0 pkt/s proto 17: 118778 pkt/s proto 17: 118989 pkt/s proto 0: 1 pkt/s proto 17: 118987 pkt/s proto 0: 0 pkt/s proto 17: 118943 pkt/s proto 17: 118976 pkt/s proto 0: 1 pkt/s proto 17: 119006 pkt/s proto 0: 0 pkt/s proto 17: 119071 pkt/s proto 17: 119092 pkt/s Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20221031185350.2045675-1-shenwei.wang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03net: tun: bump the link speed from 10Mbps to 10GbpsIlya Maximets1-1/+1
The 10Mbps link speed was set in 2004 when the ethtool interface was initially added to the tun driver. It might have been a good assumption 18 years ago, but CPUs and network stack came a long way since then. Other virtual ports typically report much higher speeds. For example, veth reports 10Gbps since its introduction in 2007. Some userspace applications rely on the current link speed in certain situations. For example, Open vSwitch is using link speed as an upper bound for QoS configuration if user didn't specify the maximum rate. Advertised 10Mbps doesn't match reality in a modern world, so users have to always manually override the value with something more sensible to avoid configuration issues, e.g. limiting the traffic too much. This also creates additional confusion among users. Bump the advertised speed to at least match the veth. Alternative might be to explicitly report UNKNOWN and let the user decide on a right value for them. And it is indeed "the right way" of fixing the problem. However, that may cause issues with bonding or with some userspace applications that may rely on speed value to be reported (even though they should not). Just changing the speed value should be a safer option. Users can still override the speed with ethtool, if necessary. RFC discussion is linked below. Link: https://lore.kernel.org/lkml/20221021114921.3705550-1-i.maximets@ovn.org/ Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-July/051958.html Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://lore.kernel.org/r/20221031173953.614577-1-i.maximets@ovn.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-02ipv6: fix WARNING in ip6_route_net_exit_late()Zhengchao Shao1-4/+10
During the initialization of ip6_route_net_init_late(), if file ipv6_route or rt6_stats fails to be created, the initialization is successful by default. Therefore, the ipv6_route or rt6_stats file doesn't be found during the remove in ip6_route_net_exit_late(). It will cause WRNING. The following is the stack information: name 'rt6_stats' WARNING: CPU: 0 PID: 9 at fs/proc/generic.c:712 remove_proc_entry+0x389/0x460 Modules linked in: Workqueue: netns cleanup_net RIP: 0010:remove_proc_entry+0x389/0x460 PKRU: 55555554 Call Trace: <TASK> ops_exit_list+0xb0/0x170 cleanup_net+0x4ea/0xb00 process_one_work+0x9bf/0x1710 worker_thread+0x665/0x1080 kthread+0x2e4/0x3a0 ret_from_fork+0x1f/0x30 </TASK> Fixes: cdb1876192db ("[NETNS][IPV6] route6 - create route6 proc files for the namespace") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221102020610.351330-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02bridge: Fix flushing of dynamic FDB entriesIdo Schimmel2-2/+2
The following commands should result in all the dynamic FDB entries being flushed, but instead all the non-local (non-permanent) entries are flushed: # bridge fdb add 00:aa:bb:cc:dd:ee dev dummy1 master static # bridge fdb add 00:11:22:33:44:55 dev dummy1 master dynamic # ip link set dev br0 type bridge fdb_flush # bridge fdb show brport dummy1 00:00:00:00:00:01 master br0 permanent 33:33:00:00:00:01 self permanent 01:00:5e:00:00:01 self permanent This is because br_fdb_flush() works with FDB flags and not the corresponding enumerator values. Fix by passing the FDB flag instead. After the fix: # bridge fdb add 00:aa:bb:cc:dd:ee dev dummy1 master static # bridge fdb add 00:11:22:33:44:55 dev dummy1 master dynamic # ip link set dev br0 type bridge fdb_flush # bridge fdb show brport dummy1 00:aa:bb:cc:dd:ee master br0 static 00:00:00:00:00:01 master br0 permanent 33:33:00:00:00:01 self permanent 01:00:5e:00:00:01 self permanent Fixes: 1f78ee14eeac ("net: bridge: fdb: add support for fine-grained flushing") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20221101185753.2120691-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02Merge branch 'rocker-two-small-changes'Jakub Kicinski1-7/+8
Ido Schimmel says: ==================== rocker: Two small changes Patch #1 avoids allocating and scheduling a work item when it is not doing any work. Patch #2 aligns rocker with other switchdev drivers to explicitly mark FDB entries as offloaded. Needed for upcoming MAB offload [1]. [1] https://lore.kernel.org/netdev/20221025100024.1287157-1-idosch@nvidia.com/ ==================== Link: https://lore.kernel.org/r/20221101123936.1900453-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02rocker: Explicitly mark learned FDB entries as offloadedIdo Schimmel1-0/+1
Currently, FDB entries that are notified to the bridge driver via 'SWITCHDEV_FDB_ADD_TO_BRIDGE' are always marked as offloaded by the bridge. With MAB enabled, this will no longer be universally true. Device drivers will report locked FDB entries to the bridge to let it know that the corresponding hosts required authorization, but it does not mean that these entries are necessarily programmed in the underlying hardware. We would like to solve it by having the bridge driver determine the offload indication based of the 'offloaded' bit in the FDB notification [1]. Prepare for that change by having rocker explicitly mark learned FDB entries as offloaded. This is consistent with all the other switchdev drivers. [1] https://lore.kernel.org/netdev/20221025100024.1287157-4-idosch@nvidia.com/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02rocker: Avoid unnecessary scheduling of work itemIdo Schimmel1-7/+7
The work item function ofdpa_port_fdb_learn_work() does not do anything when 'OFDPA_OP_FLAG_LEARNED' is not set in the work item's flags. Therefore, do not allocate and do not schedule the work item when the flag is not set. Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02net, neigh: Fix null-ptr-deref in neigh_table_clear()Chen Zhongjin1-1/+1
When IPv6 module gets initialized but hits an error in the middle, kenel panic with: KASAN: null-ptr-deref in range [0x0000000000000598-0x000000000000059f] CPU: 1 PID: 361 Comm: insmod Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:__neigh_ifdown.isra.0+0x24b/0x370 RSP: 0018:ffff888012677908 EFLAGS: 00000202 ... Call Trace: <TASK> neigh_table_clear+0x94/0x2d0 ndisc_cleanup+0x27/0x40 [ipv6] inet6_init+0x21c/0x2cb [ipv6] do_one_initcall+0xd3/0x4d0 do_init_module+0x1ae/0x670 ... Kernel panic - not syncing: Fatal exception When ipv6 initialization fails, it will try to cleanup and calls: neigh_table_clear() neigh_ifdown(tbl, NULL) pneigh_queue_purge(&tbl->proxy_queue, dev_net(dev == NULL)) # dev_net(NULL) triggers null-ptr-deref. Fix it by passing NULL to pneigh_queue_purge() in neigh_ifdown() if dev is NULL, to make kernel not panic immediately. Fixes: 66ba215cb513 ("neigh: fix possible DoS due to net iface start/stop loop") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Link: https://lore.kernel.org/r/20221101121552.21890-1-chenzhongjin@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02net/smc: Fix possible leaked pernet namespace in smc_init()Chen Zhongjin1-2/+4
In smc_init(), register_pernet_subsys(&smc_net_stat_ops) is called without any error handling. If it fails, registering of &smc_net_ops won't be reverted. And if smc_nl_init() fails, &smc_net_stat_ops itself won't be reverted. This leaves wild ops in subsystem linkedlist and when another module tries to call register_pernet_operations() it triggers page fault: BUG: unable to handle page fault for address: fffffbfff81b964c RIP: 0010:register_pernet_operations+0x1b9/0x5f0 Call Trace: <TASK> register_pernet_subsys+0x29/0x40 ebtables_init+0x58/0x1000 [ebtables] ... Fixes: 194730a9beb5 ("net/smc: Make SMC statistics network namespace aware") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Link: https://lore.kernel.org/r/20221101093722.127223-1-chenzhongjin@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02sfc (gcc13): synchronize ef100_enqueue_skb()'s return typeJiri Slaby (SUSE)1-1/+2
ef100_enqueue_skb() generates a valid warning with gcc-13: drivers/net/ethernet/sfc/ef100_tx.c:370:5: error: conflicting types for 'ef100_enqueue_skb' due to enum/integer mismatch; have 'int(struct efx_tx_queue *, struct sk_buff *)' drivers/net/ethernet/sfc/ef100_tx.h:25:13: note: previous declaration of 'ef100_enqueue_skb' with type 'netdev_tx_t(struct efx_tx_queue *, struct sk_buff *)' I.e. the type of the ef100_enqueue_skb()'s return value in the declaration is int, while the definition spells enum netdev_tx_t. Synchronize them to the latter. Cc: Martin Liska <mliska@suse.cz> Cc: Edward Cree <ecree.xilinx@gmail.com> Cc: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20221031114440.10461-1-jirislaby@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02bonding (gcc13): synchronize bond_{a,t}lb_xmit() typesJiri Slaby (SUSE)1-2/+2
Both bond_alb_xmit() and bond_tlb_xmit() produce a valid warning with gcc-13: drivers/net/bonding/bond_alb.c:1409:13: error: conflicting types for 'bond_tlb_xmit' due to enum/integer mismatch; have 'netdev_tx_t(struct sk_buff *, struct net_device *)' ... include/net/bond_alb.h:160:5: note: previous declaration of 'bond_tlb_xmit' with type 'int(struct sk_buff *, struct net_device *)' drivers/net/bonding/bond_alb.c:1523:13: error: conflicting types for 'bond_alb_xmit' due to enum/integer mismatch; have 'netdev_tx_t(struct sk_buff *, struct net_device *)' ... include/net/bond_alb.h:159:5: note: previous declaration of 'bond_alb_xmit' with type 'int(struct sk_buff *, struct net_device *)' I.e. the return type of the declaration is int, while the definitions spell netdev_tx_t. Synchronize both of them to the latter. Cc: Martin Liska <mliska@suse.cz> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20221031114409.10417-1-jirislaby@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02qed (gcc13): use u16 for fid to be big enoughJiri Slaby (SUSE)1-1/+2
gcc 13 correctly reports overflow in qed_grc_dump_addr_range(): In file included from drivers/net/ethernet/qlogic/qed/qed.h:23, from drivers/net/ethernet/qlogic/qed/qed_debug.c:10: drivers/net/ethernet/qlogic/qed/qed_debug.c: In function 'qed_grc_dump_addr_range': include/linux/qed/qed_if.h:1217:9: error: overflow in conversion from 'int' to 'u8' {aka 'unsigned char'} changes value from '(int)vf_id << 8 | 128' to '128' [-Werror=overflow] We do: u8 fid; ... fid = vf_id << 8 | 128; Since fid is 16bit (and the stored value above too), fid should be u16, not u8. Fix that. Cc: Martin Liska <mliska@suse.cz> Cc: Ariel Elior <aelior@marvell.com> Cc: Manish Chopra <manishc@marvell.com> Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20221031114354.10398-1-jirislaby@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02net: broadcom: bcm4908_enet: report queued and transmitted bytesRafał Miłecki1-0/+4
This allows BQL to operate avoiding buffer bloat and reducing latency. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Link: https://lore.kernel.org/r/20221031104856.32388-1-zajec5@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02stmmac: dwmac-loongson: fix invalid mdio_nodeLiu Peibao1-5/+2
In current code "plat->mdio_node" is always NULL, the mdio support is lost as there is no "mdio_bus_data". The original driver could work as the "mdio" variable is never set to false, which is described in commit <b0e03950dd71> ("stmmac: dwmac-loongson: fix uninitialized variable ......"). And after this commit merged, the "mdio" variable is always false, causing the mdio supoort logic lost. Fixes: 30bba69d7db4 ("stmmac: pci: Add dwmac support for Loongson") Signed-off-by: Liu Peibao <liupeibao@loongson.cn> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20221101060218.16453-1-liupeibao@loongson.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02ibmvnic: Free rwi on reset successNick Child1-8/+8
Free the rwi structure in the event that the last rwi in the list processed successfully. The logic in commit 4f408e1fa6e1 ("ibmvnic: retry reset if there are no other resets") introduces an issue that results in a 32 byte memory leak whenever the last rwi in the list gets processed. Fixes: 4f408e1fa6e1 ("ibmvnic: retry reset if there are no other resets") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Link: https://lore.kernel.org/r/20221031150642.13356-1-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02net: mdio: fix undefined behavior in bit shift for __mdiobus_registerGaosheng Cui1-1/+1
Shifting signed 32-bit value by 31 bits is undefined, so changing significant bit to unsigned. The UBSAN warning calltrace like below: UBSAN: shift-out-of-bounds in drivers/net/phy/mdio_bus.c:586:27 left shift of 1 by 31 places cannot be represented in type 'int' Call Trace: <TASK> dump_stack_lvl+0x7d/0xa5 dump_stack+0x15/0x1b ubsan_epilogue+0xe/0x4e __ubsan_handle_shift_out_of_bounds+0x1e7/0x20c __mdiobus_register+0x49d/0x4e0 fixed_mdio_bus_init+0xd8/0x12d do_one_initcall+0x76/0x430 kernel_init_freeable+0x3b3/0x422 kernel_init+0x24/0x1e0 ret_from_fork+0x1f/0x30 </TASK> Fixes: 4fd5f812c23c ("phylib: allow incremental scanning of an mii bus") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20221031132645.168421-1-cuigaosheng1@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02octeontx2-af: Allow mkex profile without DMAC and add L2M/L2B header ↵Suman Ghosh3-18/+70
extraction support 1. It is possible to have custom mkex profiles which do not extract DMAC at all into the key. Hence allow mkex profiles which do not have DMAC to be loaded into MCAM hardware. This patch also adds debugging prints needed to identify profiles with wrong configuration. 2. If a mkex profile set "l2l3mb" field for Rx interface, then Rx multicast and broadcast entry should be configured. Signed-off-by: Suman Ghosh <sumang@marvell.com> Link: https://lore.kernel.org/r/20221031090856.1404303-1-sumang@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski5-37/+52
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net 1) netlink socket notifier might win race to release objects that are already pending to be released via commit release path, reported by syzbot. 2) No need to postpone flow rule release to commit release path, this triggered the syzbot report, complementary fix to previous patch. 3) Use explicit signed chars in IPVS to unbreak arm, from Jason A. Donenfeld. 4) Missing check for proc entry creation failure in IPVS, from Zhengchao Shao. 5) Incorrect error path handling when BPF NAT fails to register, from Chen Zhongjin. 6) Prevent huge memory allocation in ipset hash types, from Jozsef Kadlecsik. Except the incorrect BPF NAT error path which is broken in 6.1-rc, anything else has been broken for several releases. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: ipset: enforce documented limit to prevent allocating huge memory netfilter: nf_nat: Fix possible memory leak in nf_nat_init() ipvs: fix WARNING in ip_vs_app_net_cleanup() ipvs: fix WARNING in __ip_vs_cleanup_batch() ipvs: use explicitly signed chars netfilter: nf_tables: release flow rule object from commit path netfilter: nf_tables: netlink notifier might race to release objects ==================== Link: https://lore.kernel.org/r/20221102184659.2502-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02Merge tag 'for-net-2022-10-02' of ↵Jakub Kicinski4-22/+98
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth 2022-11-02 - Fix memory leak in hci_vhci driver - Fix handling of skb on virtio_bt driver - Fix accepting connection for invalid L2CAP PSM - Fix attemting to access uninitialized memory - Fix use-after-free in l2cap_reassemble_sdu - Fix use-after-free in l2cap_conn_del - Fix handling of destination address type for CIS - Fix not restoring ISO buffer count on disconnect * tag 'for-net-2022-10-02' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: L2CAP: Fix attempting to access uninitialized memory Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm Bluetooth: L2CAP: Fix accepting connection request for invalid SPSM Bluetooth: hci_conn: Fix not restoring ISO buffer count on disconnect Bluetooth: L2CAP: Fix memory leak in vhci_write Bluetooth: L2CAP: fix use-after-free in l2cap_conn_del() Bluetooth: virtio_bt: Use skb_put to set length Bluetooth: hci_conn: Fix CIS connection dst_type handling Bluetooth: L2CAP: Fix use-after-free caused by l2cap_reassemble_sdu ==================== Link: https://lore.kernel.org/r/20221102235927.3324891-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02Bluetooth: L2CAP: Fix attempting to access uninitialized memoryLuiz Augusto von Dentz1-1/+2
On l2cap_parse_conf_req the variable efs is only initialized if remote_efs has been set. CVE: CVE-2022-42895 CC: stable@vger.kernel.org Reported-by: Tamás Koczka <poprdi@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: Tedd Ho-Jeong An <tedd.an@intel.com>
2022-11-02Bluetooth: L2CAP: Fix l2cap_global_chan_by_psmLuiz Augusto von Dentz1-1/+1
l2cap_global_chan_by_psm shall not return fixed channels as they are not meant to be connected by (S)PSM. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: Tedd Ho-Jeong An <tedd.an@intel.com>
2022-11-02Bluetooth: L2CAP: Fix accepting connection request for invalid SPSMLuiz Augusto von Dentz1-0/+25
The Bluetooth spec states that the valid range for SPSM is from 0x0001-0x00ff so it is invalid to accept values outside of this range: BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 3, Part A page 1059: Table 4.15: L2CAP_LE_CREDIT_BASED_CONNECTION_REQ SPSM ranges CVE: CVE-2022-42896 CC: stable@vger.kernel.org Reported-by: Tamás Koczka <poprdi@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: Tedd Ho-Jeong An <tedd.an@intel.com>
2022-11-02Bluetooth: hci_conn: Fix not restoring ISO buffer count on disconnectLuiz Augusto von Dentz1-0/+11
When disconnecting an ISO link the controller may not generate HCI_EV_NUM_COMP_PKTS for unacked packets which needs to be restored in hci_conn_del otherwise the host would assume they are still in use and would not be able to use all the buffers available. Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Tested-by: Frédéric Danis <frederic.danis@collabora.com>
2022-11-02Bluetooth: L2CAP: Fix memory leak in vhci_writeHawkins Jiawei1-4/+3
Syzkaller reports a memory leak as follows: ==================================== BUG: memory leak unreferenced object 0xffff88810d81ac00 (size 240): [...] hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff838733d9>] __alloc_skb+0x1f9/0x270 net/core/skbuff.c:418 [<ffffffff833f742f>] alloc_skb include/linux/skbuff.h:1257 [inline] [<ffffffff833f742f>] bt_skb_alloc include/net/bluetooth/bluetooth.h:469 [inline] [<ffffffff833f742f>] vhci_get_user drivers/bluetooth/hci_vhci.c:391 [inline] [<ffffffff833f742f>] vhci_write+0x5f/0x230 drivers/bluetooth/hci_vhci.c:511 [<ffffffff815e398d>] call_write_iter include/linux/fs.h:2192 [inline] [<ffffffff815e398d>] new_sync_write fs/read_write.c:491 [inline] [<ffffffff815e398d>] vfs_write+0x42d/0x540 fs/read_write.c:578 [<ffffffff815e3cdd>] ksys_write+0x9d/0x160 fs/read_write.c:631 [<ffffffff845e0645>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<ffffffff845e0645>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 [<ffffffff84600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd ==================================== HCI core will uses hci_rx_work() to process frame, which is queued to the hdev->rx_q tail in hci_recv_frame() by HCI driver. Yet the problem is that, HCI core may not free the skb after handling ACL data packets. To be more specific, when start fragment does not contain the L2CAP length, HCI core just copies skb into conn->rx_skb and finishes frame process in l2cap_recv_acldata(), without freeing the skb, which triggers the above memory leak. This patch solves it by releasing the relative skb, after processing the above case in l2cap_recv_acldata(). Fixes: 4d7ea8ee90e4 ("Bluetooth: L2CAP: Fix handling fragmented length") Link: https://lore.kernel.org/all/0000000000000d0b1905e6aaef64@google.com/ Reported-and-tested-by: syzbot+8f819e36e01022991cfa@syzkaller.appspotmail.com Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-11-02Bluetooth: L2CAP: fix use-after-free in l2cap_conn_del()Zhengchao Shao1-0/+1
When l2cap_recv_frame() is invoked to receive data, and the cid is L2CAP_CID_A2MP, if the channel does not exist, it will create a channel. However, after a channel is created, the hold operation of the channel is not performed. In this case, the value of channel reference counting is 1. As a result, after hci_error_reset() is triggered, l2cap_conn_del() invokes the close hook function of A2MP to release the channel. Then l2cap_chan_unlock(chan) will trigger UAF issue. The process is as follows: Receive data: l2cap_data_channel() a2mp_channel_create() --->channel ref is 2 l2cap_chan_put() --->channel ref is 1 Triger event: hci_error_reset() hci_dev_do_close() ... l2cap_disconn_cfm() l2cap_conn_del() l2cap_chan_hold() --->channel ref is 2 l2cap_chan_del() --->channel ref is 1 a2mp_chan_close_cb() --->channel ref is 0, release channel l2cap_chan_unlock() --->UAF of channel The detailed Call Trace is as follows: BUG: KASAN: use-after-free in __mutex_unlock_slowpath+0xa6/0x5e0 Read of size 8 at addr ffff8880160664b8 by task kworker/u11:1/7593 Workqueue: hci0 hci_error_reset Call Trace: <TASK> dump_stack_lvl+0xcd/0x134 print_report.cold+0x2ba/0x719 kasan_report+0xb1/0x1e0 kasan_check_range+0x140/0x190 __mutex_unlock_slowpath+0xa6/0x5e0 l2cap_conn_del+0x404/0x7b0 l2cap_disconn_cfm+0x8c/0xc0 hci_conn_hash_flush+0x11f/0x260 hci_dev_close_sync+0x5f5/0x11f0 hci_dev_do_close+0x2d/0x70 hci_error_reset+0x9e/0x140 process_one_work+0x98a/0x1620 worker_thread+0x665/0x1080 kthread+0x2e4/0x3a0 ret_from_fork+0x1f/0x30 </TASK> Allocated by task 7593: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0xa9/0xd0 l2cap_chan_create+0x40/0x930 amp_mgr_create+0x96/0x990 a2mp_channel_create+0x7d/0x150 l2cap_recv_frame+0x51b8/0x9a70 l2cap_recv_acldata+0xaa3/0xc00 hci_rx_work+0x702/0x1220 process_one_work+0x98a/0x1620 worker_thread+0x665/0x1080 kthread+0x2e4/0x3a0 ret_from_fork+0x1f/0x30 Freed by task 7593: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_set_free_info+0x20/0x30 ____kasan_slab_free+0x167/0x1c0 slab_free_freelist_hook+0x89/0x1c0 kfree+0xe2/0x580 l2cap_chan_put+0x22a/0x2d0 l2cap_conn_del+0x3fc/0x7b0 l2cap_disconn_cfm+0x8c/0xc0 hci_conn_hash_flush+0x11f/0x260 hci_dev_close_sync+0x5f5/0x11f0 hci_dev_do_close+0x2d/0x70 hci_error_reset+0x9e/0x140 process_one_work+0x98a/0x1620 worker_thread+0x665/0x1080 kthread+0x2e4/0x3a0 ret_from_fork+0x1f/0x30 Last potentially related work creation: kasan_save_stack+0x1e/0x40 __kasan_record_aux_stack+0xbe/0xd0 call_rcu+0x99/0x740 netlink_release+0xe6a/0x1cf0 __sock_release+0xcd/0x280 sock_close+0x18/0x20 __fput+0x27c/0xa90 task_work_run+0xdd/0x1a0 exit_to_user_mode_prepare+0x23c/0x250 syscall_exit_to_user_mode+0x19/0x50 do_syscall_64+0x42/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Second to last potentially related work creation: kasan_save_stack+0x1e/0x40 __kasan_record_aux_stack+0xbe/0xd0 call_rcu+0x99/0x740 netlink_release+0xe6a/0x1cf0 __sock_release+0xcd/0x280 sock_close+0x18/0x20 __fput+0x27c/0xa90 task_work_run+0xdd/0x1a0 exit_to_user_mode_prepare+0x23c/0x250 syscall_exit_to_user_mode+0x19/0x50 do_syscall_64+0x42/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: d0be8347c623 ("Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-11-02Bluetooth: virtio_bt: Use skb_put to set lengthSoenke Huster1-1/+1
By using skb_put we ensure that skb->tail is set correctly. Currently, skb->tail is always zero, which leads to errors, such as the following page fault in rfcomm_recv_frame: BUG: unable to handle page fault for address: ffffed1021de29ff #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page RIP: 0010:rfcomm_run+0x831/0x4040 (net/bluetooth/rfcomm/core.c:1751) Fixes: afd2daa26c7a ("Bluetooth: Add support for virtio transport driver") Signed-off-by: Soenke Huster <soenke.huster@eknoes.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-11-02Bluetooth: hci_conn: Fix CIS connection dst_type handlingPauli Virtanen2-8/+13
hci_connect_cis and iso_connect_cis call hci_bind_cis inconsistently with dst_type being either ISO socket address type or the HCI type, but these values cannot be mixed like this. Fix this by using only the HCI type. CIS connection dst_type was also not initialized in hci_bind_cis, even though it is used in hci_conn_hash_lookup_cis to find existing connections. Set the value in hci_bind_cis, so that existing CIS connections are found e.g. when doing deferred socket connections, also when dst_type is not 0 (ADDR_LE_DEV_PUBLIC). Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-11-02Bluetooth: L2CAP: Fix use-after-free caused by l2cap_reassemble_sduMaxim Mikityanskiy1-7/+41
Fix the race condition between the following two flows that run in parallel: 1. l2cap_reassemble_sdu -> chan->ops->recv (l2cap_sock_recv_cb) -> __sock_queue_rcv_skb. 2. bt_sock_recvmsg -> skb_recv_datagram, skb_free_datagram. An SKB can be queued by the first flow and immediately dequeued and freed by the second flow, therefore the callers of l2cap_reassemble_sdu can't use the SKB after that function returns. However, some places continue accessing struct l2cap_ctrl that resides in the SKB's CB for a short time after l2cap_reassemble_sdu returns, leading to a use-after-free condition (the stack trace is below, line numbers for kernel 5.19.8). Fix it by keeping a local copy of struct l2cap_ctrl. BUG: KASAN: use-after-free in l2cap_rx_state_recv (net/bluetooth/l2cap_core.c:6906) bluetooth Read of size 1 at addr ffff88812025f2f0 by task kworker/u17:3/43169 Workqueue: hci0 hci_rx_work [bluetooth] Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) print_report.cold (mm/kasan/report.c:314 mm/kasan/report.c:429) ? l2cap_rx_state_recv (net/bluetooth/l2cap_core.c:6906) bluetooth kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:493) ? l2cap_rx_state_recv (net/bluetooth/l2cap_core.c:6906) bluetooth l2cap_rx_state_recv (net/bluetooth/l2cap_core.c:6906) bluetooth l2cap_rx (net/bluetooth/l2cap_core.c:7236 net/bluetooth/l2cap_core.c:7271) bluetooth ret_from_fork (arch/x86/entry/entry_64.S:306) </TASK> Allocated by task 43169: kasan_save_stack (mm/kasan/common.c:39) __kasan_slab_alloc (mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:469) kmem_cache_alloc_node (mm/slab.h:750 mm/slub.c:3243 mm/slub.c:3293) __alloc_skb (net/core/skbuff.c:414) l2cap_recv_frag (./include/net/bluetooth/bluetooth.h:425 net/bluetooth/l2cap_core.c:8329) bluetooth l2cap_recv_acldata (net/bluetooth/l2cap_core.c:8442) bluetooth hci_rx_work (net/bluetooth/hci_core.c:3642 net/bluetooth/hci_core.c:3832) bluetooth process_one_work (kernel/workqueue.c:2289) worker_thread (./include/linux/list.h:292 kernel/workqueue.c:2437) kthread (kernel/kthread.c:376) ret_from_fork (arch/x86/entry/entry_64.S:306) Freed by task 27920: kasan_save_stack (mm/kasan/common.c:39) kasan_set_track (mm/kasan/common.c:45) kasan_set_free_info (mm/kasan/generic.c:372) ____kasan_slab_free (mm/kasan/common.c:368 mm/kasan/common.c:328) slab_free_freelist_hook (mm/slub.c:1780) kmem_cache_free (mm/slub.c:3536 mm/slub.c:3553) skb_free_datagram (./include/net/sock.h:1578 ./include/net/sock.h:1639 net/core/datagram.c:323) bt_sock_recvmsg (net/bluetooth/af_bluetooth.c:295) bluetooth l2cap_sock_recvmsg (net/bluetooth/l2cap_sock.c:1212) bluetooth sock_read_iter (net/socket.c:1087) new_sync_read (./include/linux/fs.h:2052 fs/read_write.c:401) vfs_read (fs/read_write.c:482) ksys_read (fs/read_write.c:620) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) Link: https://lore.kernel.org/linux-bluetooth/CAKErNvoqga1WcmoR3-0875esY6TVWFQDandbVZncSiuGPBQXLA@mail.gmail.com/T/#u Fixes: d2a7ac5d5d3a ("Bluetooth: Add the ERTM receive state machine") Fixes: 4b51dae96731 ("Bluetooth: Add streaming mode receive and incoming packet classifier") Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>