summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2021-06-22sfc: explain that "attached" VFs only refer to XenÍñigo Huguet2-4/+6
During SRIOV disabling it is checked wether any VF is currently attached to a guest, using pci_vfs_assigned function. However, this check only works with VFs attached with Xen, not with vfio/KVM. Added comments clarifying this point. Also, replaced manual check of PCI_DEV_FLAGS_ASSIGNED flag and used the helper function pci_is_dev_assigned instead. Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22sfc: error code if SRIOV cannot be disabledÍñigo Huguet1-4/+11
If SRIOV cannot be disabled during device removal or module unloading, return error code so it can be logged properly in the calling function. Note that this can only happen if any VF is currently attached to a guest using Xen, but not with vfio/KVM. Despite that in that case the VFs won't work properly with PF removed and/or the module unloaded, I have let it as is because I don't know what side effects may have changing it, and also it seems to be the same that other drivers are doing in this situation. In the case of being called during SRIOV reconfiguration, the behavior hasn't changed because the function is called with force=false. Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22sfc: avoid double pci_remove of VFsÍñigo Huguet1-9/+1
If pci_remove was called for a PF with VFs, the removal of the VFs was called twice from efx_ef10_sriov_fini: one directly with pci_driver->remove and another implicit by calling pci_disable_sriov, which also perform the VFs remove. This was leading to crashing the kernel on the second attempt. Given that pci_disable_sriov already calls to pci remove function, get rid of the direct call to pci_driver->remove from the driver. 2 different ways to trigger the bug: - Create one or more VFs, then attach the PF to a virtual machine (at least with qemu/KVM) - Create one or more VFs, then remove the PF with: echo 1 > /sys/bus/pci/devices/PF_PCI_ID/remove Removing sfc module does not trigger the error, at least for me, because it removes the VF first, and then the PF. Example of a log with the error: list_del corruption, ffff967fd20a8ad0->next is LIST_POISON1 (dead000000000100) ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:47! [...trimmed...] RIP: 0010:__list_del_entry_valid.cold.1+0x12/0x4c [...trimmed...] Call Trace: efx_dissociate+0x1f/0x140 [sfc] efx_pci_remove+0x27/0x150 [sfc] pci_device_remove+0x3b/0xc0 device_release_driver_internal+0x103/0x1f0 pci_stop_bus_device+0x69/0x90 pci_stop_and_remove_bus_device+0xe/0x20 pci_iov_remove_virtfn+0xba/0x120 sriov_disable+0x2f/0xe0 efx_ef10_pci_sriov_disable+0x52/0x80 [sfc] ? pcie_aer_is_native+0x12/0x40 efx_ef10_sriov_fini+0x72/0x110 [sfc] efx_pci_remove+0x62/0x150 [sfc] pci_device_remove+0x3b/0xc0 device_release_driver_internal+0x103/0x1f0 unbind_store+0xf6/0x130 kernfs_fop_write+0x116/0x190 vfs_write+0xa5/0x1a0 ksys_write+0x4f/0xb0 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22vxlan: add missing rcu_read_lock() in neigh_reduce()Eric Dumazet1-0/+2
syzbot complained in neigh_reduce(), because rcu_read_lock_bh() is treated differently than rcu_read_lock() WARNING: suspicious RCU usage 5.13.0-rc6-syzkaller #0 Not tainted ----------------------------- include/net/addrconf.h:313 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by kworker/0:0/5: #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline] #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline] #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline] #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:617 [inline] #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline] #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x871/0x1600 kernel/workqueue.c:2247 #1: ffffc90000ca7da8 ((work_completion)(&port->wq)){+.+.}-{0:0}, at: process_one_work+0x8a5/0x1600 kernel/workqueue.c:2251 #2: ffffffff8bf795c0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1da/0x3130 net/core/dev.c:4180 stack backtrace: CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.13.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events ipvlan_process_multicast Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 __in6_dev_get include/net/addrconf.h:313 [inline] __in6_dev_get include/net/addrconf.h:311 [inline] neigh_reduce drivers/net/vxlan.c:2167 [inline] vxlan_xmit+0x34d5/0x4c30 drivers/net/vxlan.c:2919 __netdev_start_xmit include/linux/netdevice.h:4944 [inline] netdev_start_xmit include/linux/netdevice.h:4958 [inline] xmit_one net/core/dev.c:3654 [inline] dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3670 __dev_queue_xmit+0x2133/0x3130 net/core/dev.c:4246 ipvlan_process_multicast+0xa99/0xd70 drivers/net/ipvlan/ipvlan_core.c:287 process_one_work+0x98d/0x1600 kernel/workqueue.c:2276 worker_thread+0x64c/0x1120 kernel/workqueue.c:2422 kthread+0x3b1/0x4a0 kernel/kthread.c:313 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 Fixes: f564f45c4518 ("vxlan: add ipv6 proxy support") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22bpf: Fix regression on BPF_OBJ_GET with non-O_RDWR flagsMaciej Żenczykowski1-1/+1
This reverts commit d37300ed1821 ("bpf: program: Refuse non-O_RDWR flags in BPF_OBJ_GET"). It breaks Android userspace which expects to be able to fetch programs with just read permissions. See: https://cs.android.com/android/platform/superproject/+/master:frameworks/libs/net/common/native/bpf_syscall_wrappers/include/BpfSyscallWrappers.h;drc=7005c764be23d31fa1d69e826b4a2f6689a8c81e;l=124 Side-note: another option to fix it would be to extend bpf_prog_new_fd() and to pass in used file mode flags in the same way as we do for maps via bpf_map_new_fd(). Meaning, they'd end up in anon_inode_getfd() and thus would be retained for prog fd operations with bpf() syscall. Right now these flags are not checked with progs since they are immutable for their lifetime (as opposed to maps which can be updated from user space). In future this could potentially change with new features, but at that point it's still fine to do the bpf_prog_new_fd() extension when needed. For a simple stable fix, a revert is less churn. Fixes: d37300ed1821 ("bpf: program: Refuse non-O_RDWR flags in BPF_OBJ_GET") Signed-off-by: Maciej Żenczykowski <maze@google.com> [ Daniel: added side-note to commit message ] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Acked-by: Greg Kroah-Hartman <gregkh@google.com> Link: https://lore.kernel.org/bpf/20210618105526.265003-1-zenczykowski@gmail.com
2021-06-22netfilter: nf_tables: do not allow to delete table with owner by handlePablo Neira Ayuso1-3/+9
nft_table_lookup_byhandle() also needs to validate the netlink PortID owner when deleting a table by handle. Fixes: 6001a930ce03 ("netfilter: nftables: introduce table ownership") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-22netfilter: nf_tables: skip netlink portID validation if zeroPablo Neira Ayuso1-1/+1
nft_table_lookup() allows us to obtain the table object by the name and the family. The netlink portID validation needs to be skipped for the dump path, since the ownership only applies to commands to update the given table. Skip validation if the specified netlink PortID is zero when calling nft_table_lookup(). Fixes: 6001a930ce03 ("netfilter: nftables: introduce table ownership") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-22xfrm: Fix xfrm offload fallback fail caseAyush Sawal1-0/+1
In case of xfrm offload, if xdo_dev_state_add() of driver returns -EOPNOTSUPP, xfrm offload fallback is failed. In xfrm state_add() both xso->dev and xso->real_dev are initialized to dev and when err(-EOPNOTSUPP) is returned only xso->dev is set to null. So in this scenario the condition in func validate_xmit_xfrm(), if ((x->xso.dev != dev) && (x->xso.real_dev == dev)) return skb; returns true, due to which skb is returned without calling esp_xmit() below which has fallback code. Hence the CRYPTO_FALLBACK is failing. So fixing this with by keeping x->xso.real_dev as NULL when err is returned in func xfrm_dev_state_add(). Fixes: bdfd2d1fa79a ("bonding/xfrm: use real_dev instead of slave_dev") Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-06-21pkt_sched: sch_qfq: fix qfq_change_class() error pathEric Dumazet1-5/+3
If qfq_change_class() is unable to allocate memory for qfq_aggregate, it frees the class that has been inserted in the class hash table, but does not unhash it. Defer the insertion after the problematic allocation. BUG: KASAN: use-after-free in hlist_add_head include/linux/list.h:884 [inline] BUG: KASAN: use-after-free in qdisc_class_hash_insert+0x200/0x210 net/sched/sch_api.c:731 Write of size 8 at addr ffff88814a534f10 by task syz-executor.4/31478 CPU: 0 PID: 31478 Comm: syz-executor.4 Not tainted 5.13.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:233 __kasan_report mm/kasan/report.c:419 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:436 hlist_add_head include/linux/list.h:884 [inline] qdisc_class_hash_insert+0x200/0x210 net/sched/sch_api.c:731 qfq_change_class+0x96c/0x1990 net/sched/sch_qfq.c:489 tc_ctl_tclass+0x514/0xe50 net/sched/sch_api.c:2113 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5564 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504 netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1929 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x4665d9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fdc7b5f0188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000000000056bf80 RCX: 00000000004665d9 RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003 RBP: 00007fdc7b5f01d0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002 R13: 00007ffcf7310b3f R14: 00007fdc7b5f0300 R15: 0000000000022000 Allocated by task 31445: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38 kasan_set_track mm/kasan/common.c:46 [inline] set_alloc_info mm/kasan/common.c:428 [inline] ____kasan_kmalloc mm/kasan/common.c:507 [inline] ____kasan_kmalloc mm/kasan/common.c:466 [inline] __kasan_kmalloc+0x9b/0xd0 mm/kasan/common.c:516 kmalloc include/linux/slab.h:556 [inline] kzalloc include/linux/slab.h:686 [inline] qfq_change_class+0x705/0x1990 net/sched/sch_qfq.c:464 tc_ctl_tclass+0x514/0xe50 net/sched/sch_api.c:2113 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5564 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504 netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1929 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae Freed by task 31445: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:357 ____kasan_slab_free mm/kasan/common.c:360 [inline] ____kasan_slab_free mm/kasan/common.c:325 [inline] __kasan_slab_free+0xfb/0x130 mm/kasan/common.c:368 kasan_slab_free include/linux/kasan.h:212 [inline] slab_free_hook mm/slub.c:1583 [inline] slab_free_freelist_hook+0xdf/0x240 mm/slub.c:1608 slab_free mm/slub.c:3168 [inline] kfree+0xe5/0x7f0 mm/slub.c:4212 qfq_change_class+0x10fb/0x1990 net/sched/sch_qfq.c:518 tc_ctl_tclass+0x514/0xe50 net/sched/sch_api.c:2113 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5564 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504 netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1929 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff88814a534f00 which belongs to the cache kmalloc-128 of size 128 The buggy address is located 16 bytes inside of 128-byte region [ffff88814a534f00, ffff88814a534f80) The buggy address belongs to the page: page:ffffea0005294d00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14a534 flags: 0x57ff00000000200(slab|node=1|zone=2|lastcpupid=0x7ff) raw: 057ff00000000200 ffffea00004fee00 0000000600000006 ffff8880110418c0 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 29797, ts 604817765317, free_ts 604810151744 prep_new_page mm/page_alloc.c:2358 [inline] get_page_from_freelist+0x1033/0x2b60 mm/page_alloc.c:3994 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5200 alloc_pages+0x18c/0x2a0 mm/mempolicy.c:2272 alloc_slab_page mm/slub.c:1646 [inline] allocate_slab+0x2c5/0x4c0 mm/slub.c:1786 new_slab mm/slub.c:1849 [inline] new_slab_objects mm/slub.c:2595 [inline] ___slab_alloc+0x4a1/0x810 mm/slub.c:2758 __slab_alloc.constprop.0+0xa7/0xf0 mm/slub.c:2798 slab_alloc_node mm/slub.c:2880 [inline] slab_alloc mm/slub.c:2922 [inline] __kmalloc+0x315/0x330 mm/slub.c:4050 kmalloc include/linux/slab.h:561 [inline] kzalloc include/linux/slab.h:686 [inline] __register_sysctl_table+0x112/0x1090 fs/proc/proc_sysctl.c:1318 mpls_dev_sysctl_register+0x1b7/0x2d0 net/mpls/af_mpls.c:1421 mpls_add_dev net/mpls/af_mpls.c:1472 [inline] mpls_dev_notify+0x214/0x8b0 net/mpls/af_mpls.c:1588 notifier_call_chain+0xb5/0x200 kernel/notifier.c:83 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2121 call_netdevice_notifiers_extack net/core/dev.c:2133 [inline] call_netdevice_notifiers net/core/dev.c:2147 [inline] register_netdevice+0x106b/0x1500 net/core/dev.c:10312 veth_newlink+0x585/0xac0 drivers/net/veth.c:1547 __rtnl_newlink+0x1062/0x1710 net/core/rtnetlink.c:3452 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3500 page last free stack trace: reset_page_owner include/linux/page_owner.h:24 [inline] free_pages_prepare mm/page_alloc.c:1298 [inline] free_pcp_prepare+0x223/0x300 mm/page_alloc.c:1342 free_unref_page_prepare mm/page_alloc.c:3250 [inline] free_unref_page+0x12/0x1d0 mm/page_alloc.c:3298 __vunmap+0x783/0xb60 mm/vmalloc.c:2566 free_work+0x58/0x70 mm/vmalloc.c:80 process_one_work+0x98d/0x1600 kernel/workqueue.c:2276 worker_thread+0x64c/0x1120 kernel/workqueue.c:2422 kthread+0x3b1/0x4a0 kernel/kthread.c:313 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 Memory state around the buggy address: ffff88814a534e00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88814a534e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88814a534f00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88814a534f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88814a535000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Fixes: 462dbc9101acd ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21net: dsa: mv88e6xxx: Fix adding vlan 0Eldar Gasanov1-3/+3
8021q module adds vlan 0 to all interfaces when it starts. When 8021q module is loaded it isn't possible to create bond with mv88e6xxx interfaces, bonding module dipslay error "Couldn't add bond vlan ids", because it tries to add vlan 0 to slave interfaces. There is unexpected behavior in the switch. When a PVID is assigned to a port the switch changes VID to PVID in ingress frames with VID 0 on the port. Expected that the switch doesn't assign PVID to tagged frames with VID 0. But there isn't a way to change this behavior in the switch. Fixes: 57e661aae6a8 ("net: dsa: mv88e6xxx: Link aggregation support") Signed-off-by: Eldar Gasanov <eldargasanov2@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21vsock: notify server to shutdown when client has pending signalLongpeng(Mike)1-1/+1
The client's sk_state will be set to TCP_ESTABLISHED if the server replay the client's connect request. However, if the client has pending signal, its sk_state will be set to TCP_CLOSE without notify the server, so the server will hold the corrupt connection. client server 1. sk_state=TCP_SYN_SENT | 2. call ->connect() | 3. wait reply | | 4. sk_state=TCP_ESTABLISHED | 5. insert to connected list | 6. reply to the client 7. sk_state=TCP_ESTABLISHED | 8. insert to connected list | 9. *signal pending* <--------------------- the user kill client 10. sk_state=TCP_CLOSE | client is exiting... | 11. call ->release() | virtio_transport_close if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) return true; *return at here, the server cannot notice the connection is corrupt* So the client should notify the peer in this case. Cc: David S. Miller <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jorgen Hansen <jhansen@vmware.com> Cc: Norbert Slusarek <nslusarek@gmx.net> Cc: Andra Paraschiv <andraprs@amazon.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: David Brazdil <dbrazdil@google.com> Cc: Alexander Popov <alex.popov@linux.com> Suggested-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lkml.org/lkml/2021/5/17/418 Signed-off-by: lixianming <lixianming5@huawei.com> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21net: mana: Fix a memory leak in an error handling path in 'mana_create_txq()'Christophe JAILLET1-2/+4
If this test fails we must free some resources as in all the other error handling paths of this function. Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21Merge branch 'nnicstar-fixes'David S. Miller1-11/+13
Zheyu Ma says: ==================== atm: nicstar: fix two bugs about error handling Zheyu Ma (2): atm: nicstar: use 'dma_free_coherent' instead of 'kfree' atm: nicstar: register the interrupt handler in the right place ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21atm: nicstar: register the interrupt handler in the right placeZheyu Ma1-9/+9
Because the error handling is sequential, the application of resources should be carried out in the order of error handling, so the operation of registering the interrupt handler should be put in front, so as not to free the unregistered interrupt handler during error handling. This log reveals it: [ 3.438724] Trying to free already-free IRQ 23 [ 3.439060] WARNING: CPU: 5 PID: 1 at kernel/irq/manage.c:1825 free_irq+0xfb/0x480 [ 3.440039] Modules linked in: [ 3.440257] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.12.4-g70e7f0549188-dirty #142 [ 3.440793] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 3.441561] RIP: 0010:free_irq+0xfb/0x480 [ 3.441845] Code: 6e 08 74 6f 4d 89 f4 e8 c3 78 09 00 4d 8b 74 24 18 4d 85 f6 75 e3 e8 b4 78 09 00 8b 75 c8 48 c7 c7 a0 ac d5 85 e8 95 d7 f5 ff <0f> 0b 48 8b 75 c0 4c 89 ff e8 87 c5 90 03 48 8b 43 40 4c 8b a0 80 [ 3.443121] RSP: 0000:ffffc90000017b50 EFLAGS: 00010086 [ 3.443483] RAX: 0000000000000000 RBX: ffff888107c6f000 RCX: 0000000000000000 [ 3.443972] RDX: 0000000000000000 RSI: ffffffff8123f301 RDI: 00000000ffffffff [ 3.444462] RBP: ffffc90000017b90 R08: 0000000000000001 R09: 0000000000000003 [ 3.444950] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [ 3.444994] R13: ffff888107dc0000 R14: ffff888104f6bf00 R15: ffff888107c6f0a8 [ 3.444994] FS: 0000000000000000(0000) GS:ffff88817bd40000(0000) knlGS:0000000000000000 [ 3.444994] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.444994] CR2: 0000000000000000 CR3: 000000000642e000 CR4: 00000000000006e0 [ 3.444994] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3.444994] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3.444994] Call Trace: [ 3.444994] ns_init_card_error+0x18e/0x250 [ 3.444994] nicstar_init_one+0x10d2/0x1130 [ 3.444994] local_pci_probe+0x4a/0xb0 [ 3.444994] pci_device_probe+0x126/0x1d0 [ 3.444994] ? pci_device_remove+0x100/0x100 [ 3.444994] really_probe+0x27e/0x650 [ 3.444994] driver_probe_device+0x84/0x1d0 [ 3.444994] ? mutex_lock_nested+0x16/0x20 [ 3.444994] device_driver_attach+0x63/0x70 [ 3.444994] __driver_attach+0x117/0x1a0 [ 3.444994] ? device_driver_attach+0x70/0x70 [ 3.444994] bus_for_each_dev+0xb6/0x110 [ 3.444994] ? rdinit_setup+0x40/0x40 [ 3.444994] driver_attach+0x22/0x30 [ 3.444994] bus_add_driver+0x1e6/0x2a0 [ 3.444994] driver_register+0xa4/0x180 [ 3.444994] __pci_register_driver+0x77/0x80 [ 3.444994] ? uPD98402_module_init+0xd/0xd [ 3.444994] nicstar_init+0x1f/0x75 [ 3.444994] do_one_initcall+0x7a/0x3d0 [ 3.444994] ? rdinit_setup+0x40/0x40 [ 3.444994] ? rcu_read_lock_sched_held+0x4a/0x70 [ 3.444994] kernel_init_freeable+0x2a7/0x2f9 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] kernel_init+0x13/0x180 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] ret_from_fork+0x1f/0x30 [ 3.444994] Kernel panic - not syncing: panic_on_warn set ... [ 3.444994] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.12.4-g70e7f0549188-dirty #142 [ 3.444994] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 3.444994] Call Trace: [ 3.444994] dump_stack+0xba/0xf5 [ 3.444994] ? free_irq+0xfb/0x480 [ 3.444994] panic+0x155/0x3ed [ 3.444994] ? __warn+0xed/0x150 [ 3.444994] ? free_irq+0xfb/0x480 [ 3.444994] __warn+0x103/0x150 [ 3.444994] ? free_irq+0xfb/0x480 [ 3.444994] report_bug+0x119/0x1c0 [ 3.444994] handle_bug+0x3b/0x80 [ 3.444994] exc_invalid_op+0x18/0x70 [ 3.444994] asm_exc_invalid_op+0x12/0x20 [ 3.444994] RIP: 0010:free_irq+0xfb/0x480 [ 3.444994] Code: 6e 08 74 6f 4d 89 f4 e8 c3 78 09 00 4d 8b 74 24 18 4d 85 f6 75 e3 e8 b4 78 09 00 8b 75 c8 48 c7 c7 a0 ac d5 85 e8 95 d7 f5 ff <0f> 0b 48 8b 75 c0 4c 89 ff e8 87 c5 90 03 48 8b 43 40 4c 8b a0 80 [ 3.444994] RSP: 0000:ffffc90000017b50 EFLAGS: 00010086 [ 3.444994] RAX: 0000000000000000 RBX: ffff888107c6f000 RCX: 0000000000000000 [ 3.444994] RDX: 0000000000000000 RSI: ffffffff8123f301 RDI: 00000000ffffffff [ 3.444994] RBP: ffffc90000017b90 R08: 0000000000000001 R09: 0000000000000003 [ 3.444994] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [ 3.444994] R13: ffff888107dc0000 R14: ffff888104f6bf00 R15: ffff888107c6f0a8 [ 3.444994] ? vprintk_func+0x71/0x110 [ 3.444994] ns_init_card_error+0x18e/0x250 [ 3.444994] nicstar_init_one+0x10d2/0x1130 [ 3.444994] local_pci_probe+0x4a/0xb0 [ 3.444994] pci_device_probe+0x126/0x1d0 [ 3.444994] ? pci_device_remove+0x100/0x100 [ 3.444994] really_probe+0x27e/0x650 [ 3.444994] driver_probe_device+0x84/0x1d0 [ 3.444994] ? mutex_lock_nested+0x16/0x20 [ 3.444994] device_driver_attach+0x63/0x70 [ 3.444994] __driver_attach+0x117/0x1a0 [ 3.444994] ? device_driver_attach+0x70/0x70 [ 3.444994] bus_for_each_dev+0xb6/0x110 [ 3.444994] ? rdinit_setup+0x40/0x40 [ 3.444994] driver_attach+0x22/0x30 [ 3.444994] bus_add_driver+0x1e6/0x2a0 [ 3.444994] driver_register+0xa4/0x180 [ 3.444994] __pci_register_driver+0x77/0x80 [ 3.444994] ? uPD98402_module_init+0xd/0xd [ 3.444994] nicstar_init+0x1f/0x75 [ 3.444994] do_one_initcall+0x7a/0x3d0 [ 3.444994] ? rdinit_setup+0x40/0x40 [ 3.444994] ? rcu_read_lock_sched_held+0x4a/0x70 [ 3.444994] kernel_init_freeable+0x2a7/0x2f9 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] kernel_init+0x13/0x180 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] ret_from_fork+0x1f/0x30 [ 3.444994] Dumping ftrace buffer: [ 3.444994] (ftrace buffer empty) [ 3.444994] Kernel Offset: disabled [ 3.444994] Rebooting in 1 seconds.. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21atm: nicstar: use 'dma_free_coherent' instead of 'kfree'Zheyu Ma1-2/+4
When 'nicstar_init_one' fails, 'ns_init_card_error' will be executed for error handling, but the correct memory free function should be used, otherwise it will cause an error. Since 'card->rsq.org' and 'card->tsq.org' are allocated using 'dma_alloc_coherent' function, they should be freed using 'dma_free_coherent'. Fix this by using 'dma_free_coherent' instead of 'kfree' This log reveals it: [ 3.440294] kernel BUG at mm/slub.c:4206! [ 3.441059] invalid opcode: 0000 [#1] PREEMPT SMP PTI [ 3.441430] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.12.4-g70e7f0549188-dirty #141 [ 3.441986] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 3.442780] RIP: 0010:kfree+0x26a/0x300 [ 3.443065] Code: e8 3a c3 b9 ff e9 d6 fd ff ff 49 8b 45 00 31 db a9 00 00 01 00 75 4d 49 8b 45 00 a9 00 00 01 00 75 0a 49 8b 45 08 a8 01 75 02 <0f> 0b 89 d9 b8 00 10 00 00 be 06 00 00 00 48 d3 e0 f7 d8 48 63 d0 [ 3.443396] RSP: 0000:ffffc90000017b70 EFLAGS: 00010246 [ 3.443396] RAX: dead000000000100 RBX: 0000000000000000 RCX: 0000000000000000 [ 3.443396] RDX: 0000000000000000 RSI: ffffffff85d3df94 RDI: ffffffff85df38e6 [ 3.443396] RBP: ffffc90000017b90 R08: 0000000000000001 R09: 0000000000000001 [ 3.443396] R10: 0000000000000000 R11: 0000000000000001 R12: ffff888107dc0000 [ 3.443396] R13: ffffea00001f0100 R14: ffff888101a8bf00 R15: ffff888107dc0160 [ 3.443396] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 3.443396] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.443396] CR2: 0000000000000000 CR3: 000000000642e000 CR4: 00000000000006e0 [ 3.443396] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3.443396] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3.443396] Call Trace: [ 3.443396] ns_init_card_error+0x12c/0x220 [ 3.443396] nicstar_init_one+0x10d2/0x1130 [ 3.443396] local_pci_probe+0x4a/0xb0 [ 3.443396] pci_device_probe+0x126/0x1d0 [ 3.443396] ? pci_device_remove+0x100/0x100 [ 3.443396] really_probe+0x27e/0x650 [ 3.443396] driver_probe_device+0x84/0x1d0 [ 3.443396] ? mutex_lock_nested+0x16/0x20 [ 3.443396] device_driver_attach+0x63/0x70 [ 3.443396] __driver_attach+0x117/0x1a0 [ 3.443396] ? device_driver_attach+0x70/0x70 [ 3.443396] bus_for_each_dev+0xb6/0x110 [ 3.443396] ? rdinit_setup+0x40/0x40 [ 3.443396] driver_attach+0x22/0x30 [ 3.443396] bus_add_driver+0x1e6/0x2a0 [ 3.443396] driver_register+0xa4/0x180 [ 3.443396] __pci_register_driver+0x77/0x80 [ 3.443396] ? uPD98402_module_init+0xd/0xd [ 3.443396] nicstar_init+0x1f/0x75 [ 3.443396] do_one_initcall+0x7a/0x3d0 [ 3.443396] ? rdinit_setup+0x40/0x40 [ 3.443396] ? rcu_read_lock_sched_held+0x4a/0x70 [ 3.443396] kernel_init_freeable+0x2a7/0x2f9 [ 3.443396] ? rest_init+0x2c0/0x2c0 [ 3.443396] kernel_init+0x13/0x180 [ 3.443396] ? rest_init+0x2c0/0x2c0 [ 3.443396] ? rest_init+0x2c0/0x2c0 [ 3.443396] ret_from_fork+0x1f/0x30 [ 3.443396] Modules linked in: [ 3.443396] Dumping ftrace buffer: [ 3.443396] (ftrace buffer empty) [ 3.458593] ---[ end trace 3c6f8f0d8ef59bcd ]--- [ 3.458922] RIP: 0010:kfree+0x26a/0x300 [ 3.459198] Code: e8 3a c3 b9 ff e9 d6 fd ff ff 49 8b 45 00 31 db a9 00 00 01 00 75 4d 49 8b 45 00 a9 00 00 01 00 75 0a 49 8b 45 08 a8 01 75 02 <0f> 0b 89 d9 b8 00 10 00 00 be 06 00 00 00 48 d3 e0 f7 d8 48 63 d0 [ 3.460499] RSP: 0000:ffffc90000017b70 EFLAGS: 00010246 [ 3.460870] RAX: dead000000000100 RBX: 0000000000000000 RCX: 0000000000000000 [ 3.461371] RDX: 0000000000000000 RSI: ffffffff85d3df94 RDI: ffffffff85df38e6 [ 3.461873] RBP: ffffc90000017b90 R08: 0000000000000001 R09: 0000000000000001 [ 3.462372] R10: 0000000000000000 R11: 0000000000000001 R12: ffff888107dc0000 [ 3.462871] R13: ffffea00001f0100 R14: ffff888101a8bf00 R15: ffff888107dc0160 [ 3.463368] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 3.463949] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.464356] CR2: 0000000000000000 CR3: 000000000642e000 CR4: 00000000000006e0 [ 3.464856] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3.465356] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3.465860] Kernel panic - not syncing: Fatal exception [ 3.466370] Dumping ftrace buffer: [ 3.466616] (ftrace buffer empty) [ 3.466871] Kernel Offset: disabled [ 3.467122] Rebooting in 1 seconds.. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21Merge branch 'mptcp-sdeq-fixes'David S. Miller3-30/+24
Mat Martineau says: ==================== mptcp: 32-bit sequence number improvements MPTCP-level sequence numbers are 64 bits, but RFC 8684 allows use of 32-bit sequence numbers in the DSS option to save header space. Those 32-bit numbers are the least significant bits of the full 64-bit sequence number, so the receiver must infer the correct upper 32 bits. These two patches improve the logic for determining the full 64-bit sequence numbers when the 32-bit truncated version has wrapped around. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21mptcp: fix 32 bit DSN expansionPaolo Abeni1-16/+1
The current implementation of 32 bit DSN expansion is buggy. After the previous patch, we can simply reuse the newly introduced helper to do the expansion safely. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/120 Fixes: 648ef4b88673 ("mptcp: Implement MPTCP receive path") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21mptcp: fix bad handling of 32 bit ack wrap-aroundPaolo Abeni2-14/+23
When receiving 32 bits DSS ack from the peer, the MPTCP need to expand them to 64 bits value. The current code is buggy WRT detecting 32 bits ack wrap-around: when the wrap-around happens the current unsigned 32 bit ack value is lower than the previous one. Additionally check for possible reverse wrap and make the helper visible, so that we could re-use it for the next patch. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/204 Fixes: cc9d25669866 ("mptcp: update per unacked sequence on pkt reception") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21netfilter: nf_tables_offload: check FLOW_DISSECTOR_KEY_BASIC in VLAN ↵Pablo Neira Ayuso1-10/+7
transfer logic The VLAN transfer logic should actually check for FLOW_DISSECTOR_KEY_BASIC, not FLOW_DISSECTOR_KEY_CONTROL. Moreover, do not fallback to case 2) .n_proto is set to 802.1q or 802.1ad, if FLOW_DISSECTOR_KEY_BASIC is unset. Fixes: 783003f3bb8a ("netfilter: nftables_offload: special ethertype handling for VLAN") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-21netfilter: nf_tables: memleak in hw offload abort pathPablo Neira Ayuso2-37/+31
Release flow from the abort path, this is easy to reproduce since b72920f6e4a9 ("netfilter: nftables: counter hardware offload support"). If the preparation phase fails, then the abort path is exercised without releasing the flow rule object. unreferenced object 0xffff8881f0fa7700 (size 128): comm "nft", pid 1335, jiffies 4294931120 (age 4163.740s) hex dump (first 32 bytes): 08 e4 de 13 82 88 ff ff 98 e4 de 13 82 88 ff ff ................ 48 e4 de 13 82 88 ff ff 01 00 00 00 00 00 00 00 H............... backtrace: [<00000000634547e7>] flow_rule_alloc+0x26/0x80 [<00000000c8426156>] nft_flow_rule_create+0xc9/0x3f0 [nf_tables] [<0000000075ff8e46>] nf_tables_newrule+0xc79/0x10a0 [nf_tables] [<00000000ba65e40e>] nfnetlink_rcv_batch+0xaac/0xf90 [nfnetlink] [<00000000505c614a>] nfnetlink_rcv+0x1bb/0x1f0 [nfnetlink] [<00000000eb78e1fe>] netlink_unicast+0x34b/0x480 [<00000000a8f72c94>] netlink_sendmsg+0x3af/0x690 [<000000009cb1ddf4>] sock_sendmsg+0x96/0xa0 [<0000000039d06e44>] ____sys_sendmsg+0x3fe/0x440 [<00000000137e82ca>] ___sys_sendmsg+0xd8/0x140 [<000000000c6bf6a6>] __sys_sendmsg+0xb3/0x130 [<0000000043bd6268>] do_syscall_64+0x40/0xb0 [<00000000afdebc2d>] entry_SYSCALL_64_after_hwframe+0x44/0xae Remove flow rule release from the offload commit path, otherwise error from the offload commit phase might trigger a double-free due to the execution of the abort_offload -> abort. After this patch, the abort path takes care of releasing the flow rule. This fix also needs to move the nft_flow_rule_create() call before the transaction object is added otherwise the abort path might find a NULL pointer to the flow rule object for the NFT_CHAIN_HW_OFFLOAD case. While at it, rename BASIC-like goto tags to slightly more meaningful names rather than adding a new "err3" tag. Fixes: 63b48c73ff56 ("netfilter: nf_tables_offload: undo updates if transaction fails") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-21tls: prevent oversized sendfile() hangs by ignoring MSG_MOREJakub Kicinski1-1/+1
We got multiple reports that multi_chunk_sendfile test case from tls selftest fails. This was sort of expected, as the original fix was never applied (see it in the first Link:). The test in question uses sendfile() with count larger than the size of the underlying file. This will make splice set MSG_MORE on all sendpage calls, meaning TLS will never close and flush the last partial record. Eric seem to have addressed a similar problem in commit 35f9c09fe9c7 ("tcp: tcp_sendpages() should call tcp_push() once") by introducing MSG_SENDPAGE_NOTLAST. Unlike MSG_MORE MSG_SENDPAGE_NOTLAST is not set on the last call of a "pipefull" of data (PIPE_DEF_BUFFERS == 16, so every 16 pages or whenever we run out of data). Having a break every 16 pages should be fine, TLS can pack exactly 4 pages into a record, so for aligned reads there should be no difference, unaligned may see one extra record per sendpage(). Sticking to TCP semantics seems preferable to modifying splice, but we can revisit it if real life scenarios show a regression. Reported-by: Vadim Fedorenko <vfedorenko@novek.ru> Reported-by: Seth Forshee <seth.forshee@canonical.com> Link: https://lore.kernel.org/netdev/1591392508-14592-1-git-send-email-pooja.trivedi@stackpath.com/ Fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21Merge tag 'linux-can-fixes-for-5.13-20210619' of ↵David S. Miller6-5/+22
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2021-06-19 this is a pull request of 5 patches for net/master. The first patch is by Thadeu Lima de Souza Cascardo and fixes a potential use-after-free in the CAN broadcast manager socket, by delaying the release of struct bcm_op after synchronize_rcu(). Oliver Hartkopp's patch fixes a similar potential user-after-free in the CAN gateway socket by synchronizing RCU operations before removing gw job entry. Another patch by Oliver Hartkopp fixes a potential use-after-free in the ISOTP socket by omitting unintended hrtimer restarts on socket release. Oleksij Rempel's patch for the j1939 socket fixes a potential use-after-free by setting the SOCK_RCU_FREE flag on the socket. The last patch is by Pavel Skripkin and fixes a use-after-free in the ems_usb CAN driver. All patches are intended for stable and have stable@v.k.o on Cc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21Merge tag 'wireless-drivers-2021-06-19' of ↵David S. Miller1-3/+10
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for v5.13 Only one important fix for an mwifiex regression. mwifiex * fix deadlock during rmmod or firmware reset, regression from cfg80211 RTNL changes in v5.12-rc1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21hv_netvsc: Set needed_headroom according to VFHaiyang Zhang1-0/+5
Set needed_headroom according to VF if VF needs a bigger headroom. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21net/netif_receive_skb_core: Use migrate_disable()Sebastian Andrzej Siewior1-2/+2
The preempt disable around do_xdp_generic() has been introduced in commit bbbe211c295ff ("net: rcu lock and preempt disable missing around generic xdp") For BPF it is enough to use migrate_disable() and the code was updated as it can be seen in commit 3c58482a382ba ("bpf: Provide bpf_prog_run_pin_on_cpu() helper") This is a leftover which was not converted. Use migrate_disable() before invoking do_xdp_generic(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21net: sched: add barrier to ensure correct ordering for lockless qdiscYunsheng Lin1-0/+12
The spin_trylock() was assumed to contain the implicit barrier needed to ensure the correct ordering between STATE_MISSED setting/clearing and STATE_MISSED checking in commit a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc"). But it turns out that spin_trylock() only has load-acquire semantic, for strongly-ordered system(like x86), the compiler barrier implicitly contained in spin_trylock() seems enough to ensure the correct ordering. But for weakly-orderly system (like arm64), the store-release semantic is needed to ensure the correct ordering as clear_bit() and test_bit() is store operation, see queued_spin_lock(). So add the explicit barrier to ensure the correct ordering for the above case. Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21vrf: do not push non-ND strict packets with a source LLA through packet taps ↵Antoine Tenart1-7/+7
again Non-ND strict packets with a source LLA go through the packet taps again, while non-ND strict packets with other source addresses do not, and we can see a clone of those packets on the vrf interface (we should not). This is due to a series of changes: Commit 6f12fa775530[1] made non-ND strict packets not being pushed again in the packet taps. This changed with commit 205704c618af[2] for those packets having a source LLA, as they need a lookup with the orig_iif. The issue now is those packets do not skip the 'vrf_ip6_rcv' function to the end (as the ones without a source LLA) and go through the check to call packet taps again. This check was changed by commit 6f12fa775530[1] and do not exclude non-strict packets anymore. Packets matching 'need_strict && !is_ndisc && is_ll_src' are now being sent through the packet taps again. This can be seen by dumping packets on the vrf interface. Fix this by having the same code path for all non-ND strict packets and selectively lookup with the orig_iif for those with a source LLA. This has the effect to revert to the pre-205704c618af[2] condition, which should also be easier to maintain. [1] 6f12fa775530 ("vrf: mark skb for multicast or link-local as enslaved to VRF") [2] 205704c618af ("vrf: packets with lladdr src needs dst at input with orig_iif when needs strict") Fixes: 205704c618af ("vrf: packets with lladdr src needs dst at input with orig_iif when needs strict") Cc: Stephen Suryaputra <ssuryaextr@gmail.com> Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21skmsg: Increase sk->sk_drops when dropping packetsCong Wang1-8/+14
It is hard to observe packet drops without increasing relevant drop counters, here we should increase sk->sk_drops which is a protocol-independent counter. Fortunately psock is always associated with a struct sock, we can just use psock->sk. Suggested-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210615021342.7416-9-xiyou.wangcong@gmail.com
2021-06-21skmsg: Pass source psock to sk_psock_skb_redirect()Cong Wang1-5/+6
sk_psock_skb_redirect() only takes skb as a parameter, we will need to know where this skb is from, so just pass the source psock to this function as a new parameter. This patch prepares for the next one. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210615021342.7416-8-xiyou.wangcong@gmail.com
2021-06-21skmsg: Teach sk_psock_verdict_apply() to return errorsCong Wang1-9/+14
Currently sk_psock_verdict_apply() is void, but it handles some error conditions too. Its caller is impossible to learn whether it succeeds or fails, especially sk_psock_verdict_recv(). Make it return int to indicate error cases and propagate errors to callers properly. Fixes: ef5659280eb1 ("bpf, sockmap: Allow skipping sk_skb parser program") Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210615021342.7416-7-xiyou.wangcong@gmail.com
2021-06-21skmsg: Fix a memory leak in sk_psock_verdict_apply()Cong Wang1-0/+5
If the dest psock does not set SK_PSOCK_TX_ENABLED, the skb can't be queued anywhere so must be dropped. This one is found during code review. Fixes: 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()") Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210615021342.7416-6-xiyou.wangcong@gmail.com
2021-06-21skmsg: Clear skb redirect pointer before dropping itCong Wang1-0/+2
When we drop skb inside sk_psock_skb_redirect(), we have to clear its skb->_sk_redir pointer too, otherwise kfree_skb() would misinterpret it as a valid skb->_skb_refdst and dst_release() would eventually complain. Fixes: e3526bb92a20 ("skmsg: Move sk_redir from TCP_SKB_CB to skb") Reported-by: Jiang Wang <jiang.wang@bytedance.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210615021342.7416-5-xiyou.wangcong@gmail.com
2021-06-21udp: Fix a memory leak in udp_read_sock()Cong Wang1-0/+2
sk_psock_verdict_recv() clones the skb and uses the clone afterward, so udp_read_sock() should free the skb after using it, regardless of error or not. This fixes a real kmemleak. Fixes: d7f571188ecf ("udp: Implement ->read_sock() for sockmap") Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210615021342.7416-4-xiyou.wangcong@gmail.com
2021-06-21selftests/bpf: Retry for EAGAIN in udp_redir_to_connected()Cong Wang1-1/+6
We use non-blocking sockets for testing sockmap redirections, and got some random EAGAIN errors from UDP tests. There is no guarantee the packet would be immediately available to receive as soon as it is sent out, even on the local host. For UDP, this is especially true because it does not lock the sock during BH (unlike the TCP path). This is probably why we only saw this error in UDP cases. No matter how hard we try to make the queue empty check accurate, it is always possible for recvmsg() to beat ->sk_data_ready(). Therefore, we should just retry in case of EAGAIN. Fixes: d6378af615275 ("selftests/bpf: Add a test case for udp sockmap") Reported-by: Jiang Wang <jiang.wang@bytedance.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210615021342.7416-3-xiyou.wangcong@gmail.com
2021-06-21skmsg: Improve udp_bpf_recvmsg() accuracyCong Wang4-31/+65
I tried to reuse sk_msg_wait_data() for different protocols, but it turns out it can not be simply reused. For example, UDP actually uses two queues to receive skb: udp_sk(sk)->reader_queue and sk->sk_receive_queue. So we have to check both of them to know whether we have received any packet. Also, UDP does not lock the sock during BH Rx path, it makes no sense for its ->recvmsg() to lock the sock. It is always possible for ->recvmsg() to be called before packets actually arrive in the receive queue, we just use best effort to make it accurate here. Fixes: 1f5be6b3b063 ("udp: Implement udp_bpf_recvmsg() for sockmap") Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210615021342.7416-2-xiyou.wangcong@gmail.com
2021-06-19net: can: ems_usb: fix use-after-free in ems_usb_disconnect()Pavel Skripkin1-1/+2
In ems_usb_disconnect() dev pointer, which is netdev private data, is used after free_candev() call: | if (dev) { | unregister_netdev(dev->netdev); | free_candev(dev->netdev); | | unlink_all_urbs(dev); | | usb_free_urb(dev->intr_urb); | | kfree(dev->intr_in_buffer); | kfree(dev->tx_msg_buffer); | } Fix it by simply moving free_candev() at the end of the block. Fail log: | BUG: KASAN: use-after-free in ems_usb_disconnect | Read of size 8 at addr ffff88804e041008 by task kworker/1:2/2895 | | CPU: 1 PID: 2895 Comm: kworker/1:2 Not tainted 5.13.0-rc5+ #164 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.4 | Workqueue: usb_hub_wq hub_event | Call Trace: | dump_stack (lib/dump_stack.c:122) | print_address_description.constprop.0.cold (mm/kasan/report.c:234) | kasan_report.cold (mm/kasan/report.c:420 mm/kasan/report.c:436) | ems_usb_disconnect (drivers/net/can/usb/ems_usb.c:683 drivers/net/can/usb/ems_usb.c:1058) Fixes: 702171adeed3 ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface") Link: https://lore.kernel.org/r/20210617185130.5834-1-paskripkin@gmail.com Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-19can: j1939: j1939_sk_init(): set SOCK_RCU_FREE to call sk_destruct() after ↵Oleksij Rempel2-0/+7
RCU is done Set SOCK_RCU_FREE to let RCU to call sk_destruct() on completion. Without this patch, we will run in to j1939_can_recv() after priv was freed by j1939_sk_release()->j1939_sk_sock_destruct() Fixes: 25fe97cb7620 ("can: j1939: move j1939_priv_put() into sk_destruct callback") Link: https://lore.kernel.org/r/20210617130623.12705-1-o.rempel@pengutronix.de Cc: linux-stable <stable@vger.kernel.org> Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Reported-by: syzbot+bdf710cfc41c186fdff3@syzkaller.appspotmail.com Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-19can: isotp: isotp_release(): omit unintended hrtimer restart on socket releaseOliver Hartkopp1-3/+4
When closing the isotp socket, the potentially running hrtimers are canceled before removing the subscription for CAN identifiers via can_rx_unregister(). This may lead to an unintended (re)start of a hrtimer in isotp_rcv_cf() and isotp_rcv_fc() in the case that a CAN frame is received by isotp_rcv() while the subscription removal is processed. However, isotp_rcv() is called under RCU protection, so after calling can_rx_unregister, we may call synchronize_rcu in order to wait for any RCU read-side critical sections to finish. This prevents the reception of CAN frames after hrtimer_cancel() and therefore the unintended (re)start of the hrtimers. Link: https://lore.kernel.org/r/20210618173713.2296-1-socketcan@hartkopp.net Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-19can: gw: synchronize rcu operations before removing gw job entryOliver Hartkopp1-0/+3
can_can_gw_rcv() is called under RCU protection, so after calling can_rx_unregister(), we have to call synchronize_rcu in order to wait for any RCU read-side critical sections to finish before removing the kmem_cache entry with the referenced gw job entry. Link: https://lore.kernel.org/r/20210618173645.2238-1-socketcan@hartkopp.net Fixes: c1aabdf379bc ("can-gw: add netlink based CAN routing") Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-19can: bcm: delay release of struct bcm_op after synchronize_rcu()Thadeu Lima de Souza Cascardo1-1/+6
can_rx_register() callbacks may be called concurrently to the call to can_rx_unregister(). The callbacks and callback data, though, are protected by RCU and the struct sock reference count. So the callback data is really attached to the life of sk, meaning that it should be released on sk_destruct. However, bcm_remove_op() calls tasklet_kill(), and RCU callbacks may be called under RCU softirq, so that cannot be used on kernels before the introduction of HRTIMER_MODE_SOFT. However, bcm_rx_handler() is called under RCU protection, so after calling can_rx_unregister(), we may call synchronize_rcu() in order to wait for any RCU read-side critical sections to finish. That is, bcm_rx_handler() won't be called anymore for those ops. So, we only free them, after we do that synchronize_rcu(). Fixes: ffd980f976e7 ("[CAN]: Add broadcast manager (bcm) protocol") Link: https://lore.kernel.org/r/20210619161813.2098382-1-cascardo@canonical.com Cc: linux-stable <stable@vger.kernel.org> Reported-by: syzbot+0f7e7e5e2f4f40fa89c0@syzkaller.appspotmail.com Reported-by: Norbert Slusarek <nslusarek@gmx.net> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-19Merge branch 'ezchip-fixes'David S. Miller1-4/+3
Pavel Skripkin says: ==================== net: ethernat: ezchip: bug fixing and code improvments While manual code reviewing, I found some error in ezchip driver. Two of them looks very dangerous: 1. use-after-free in nps_enet_remove Accessing netdev private data after free_netdev() 2. wrong error handling of platform_get_irq() It can cause passing negative irq to request_irq() Also, in 2nd patch I removed redundant check to increase execution speed and make code more straightforward. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-19net: ethernet: ezchip: fix error handlingPavel Skripkin1-1/+1
As documented at drivers/base/platform.c for platform_get_irq: * Gets an IRQ for a platform device and prints an error message if finding the * IRQ fails. Device drivers should check the return value for errors so as to * not pass a negative integer value to the request_irq() APIs. So, the driver should check that platform_get_irq() return value is _negative_, not that it's equal to zero, because -ENXIO (return value from request_irq() if irq was not found) will pass this check and it leads to passing negative irq to request_irq() Fixes: 0dd077093636 ("NET: Add ezchip ethernet driver") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-19net: ethernet: ezchip: remove redundant checkPavel Skripkin1-2/+1
err varibale will be set everytime, when code gets into this path. This check will just slowdown the execution and that's all. Fixes: 0dd077093636 ("NET: Add ezchip ethernet driver") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-19net: ethernet: ezchip: fix UAF in nps_enet_removePavel Skripkin1-1/+1
priv is netdev private data, but it is used after free_netdev(). It can cause use-after-free when accessing priv pointer. So, fix it by moving free_netdev() after netif_napi_del() call. Fixes: 0dd077093636 ("NET: Add ezchip ethernet driver") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-19net: ethernet: aeroflex: fix UAF in greth_of_removePavel Skripkin1-1/+2
static int greth_of_remove(struct platform_device *of_dev) { ... struct greth_private *greth = netdev_priv(ndev); ... unregister_netdev(ndev); free_netdev(ndev); of_iounmap(&of_dev->resource[0], greth->regs, resource_size(&of_dev->resource[0])); ... } greth is netdev private data, but it is used after free_netdev(). It can cause use-after-free when accessing greth pointer. So, fix it by moving free_netdev() after of_iounmap() call. Fixes: d4c41139df6e ("net: Add Aeroflex Gaisler 10/100/1G Ethernet MAC driver") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18Merge tag 'net-5.13-rc7' of ↵Linus Torvalds145-589/+1513
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.13-rc7, including fixes from wireless, bpf, bluetooth, netfilter and can. Current release - regressions: - mlxsw: spectrum_qdisc: Pass handle, not band number to find_class() to fix modifying offloaded qdiscs - lantiq: net: fix duplicated skb in rx descriptor ring - rtnetlink: fix regression in bridge VLAN configuration, empty info is not an error, bot-generated "fix" was not needed - libbpf: s/rx/tx/ typo on umem->rx_ring_setup_done to fix umem creation Current release - new code bugs: - ethtool: fix NULL pointer dereference during module EEPROM dump via the new netlink API - mlx5e: don't update netdev RQs with PTP-RQ, the special purpose queue should not be visible to the stack - mlx5e: select special PTP queue only for SKBTX_HW_TSTAMP skbs - mlx5e: verify dev is present in get devlink port ndo, avoid a panic Previous releases - regressions: - neighbour: allow NUD_NOARP entries to be force GCed - further fixes for fallout from reorg of WiFi locking (staging: rtl8723bs, mac80211, cfg80211) - skbuff: fix incorrect msg_zerocopy copy notifications - mac80211: fix NULL ptr deref for injected rate info - Revert "net/mlx5: Arm only EQs with EQEs" it may cause missed IRQs Previous releases - always broken: - bpf: more speculative execution fixes - netfilter: nft_fib_ipv6: skip ipv6 packets from any to link-local - udp: fix race between close() and udp_abort() resulting in a panic - fix out of bounds when parsing TCP options before packets are validated (in netfilter: synproxy, tc: sch_cake and mptcp) - mptcp: improve operation under memory pressure, add missing wake-ups - mptcp: fix double-lock/soft lookup in subflow_error_report() - bridge: fix races (null pointer deref and UAF) in vlan tunnel egress - ena: fix DMA mapping function issues in XDP - rds: fix memory leak in rds_recvmsg Misc: - vrf: allow larger MTUs - icmp: don't send out ICMP messages with a source address of 0.0.0.0 - cdc_ncm: switch to eth%d interface naming" * tag 'net-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (139 commits) net: ethernet: fix potential use-after-free in ec_bhf_remove selftests/net: Add icmp.sh for testing ICMP dummy address responses icmp: don't send out ICMP messages with a source address of 0.0.0.0 net: ll_temac: Avoid ndo_start_xmit returning NETDEV_TX_BUSY net: ll_temac: Fix TX BD buffer overwrite net: ll_temac: Add memory-barriers for TX BD access net: ll_temac: Make sure to free skb when it is completely used MAINTAINERS: add Guvenc as SMC maintainer bnxt_en: Call bnxt_ethtool_free() in bnxt_init_one() error path bnxt_en: Fix TQM fastpath ring backing store computation bnxt_en: Rediscover PHY capabilities after firmware reset cxgb4: fix wrong shift. mac80211: handle various extensible elements correctly mac80211: reset profile_periodicity/ema_ap cfg80211: avoid double free of PMSR request cfg80211: make certificate generation more robust mac80211: minstrel_ht: fix sample time check net: qed: Fix memcpy() overflow of qed_dcbx_params() net: cdc_eem: fix tx fixup skb leak net: hamradio: fix memory leak in mkiss_close ...
2021-06-18Merge tag 'for-5.13-rc6-tag' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One more fix, for a space accounting bug in zoned mode. It happens when a block group is switched back rw->ro and unusable bytes (due to zoned constraints) are subtracted twice. It has user visible effects so I consider it important enough for late -rc inclusion and backport to stable" * tag 'for-5.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zoned: fix negative space_info->bytes_readonly
2021-06-18Merge tag 'pci-v5.13-fixes-2' of ↵Linus Torvalds7-131/+306
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Clear 64-bit flag for host bridge windows below 4GB to fix a resource allocation regression added in -rc1 (Punit Agrawal) - Fix tegra194 MCFG quirk build regressions added in -rc1 (Jon Hunter) - Avoid secondary bus resets on TI KeyStone C667X devices (Antti Järvinen) - Avoid secondary bus resets on some NVIDIA GPUs (Shanker Donthineni) - Work around FLR erratum on Huawei Intelligent NIC VF (Chiqijun) - Avoid broken ATS on AMD Navi14 GPU (Evan Quan) - Trust Broadcom BCM57414 NIC to isolate functions even though it doesn't advertise ACS support (Sriharsha Basavapatna) - Work around AMD RS690 BIOSes that don't configure DMA above 4GB (Mikel Rychliski) - Fix panic during PIO transfer on Aardvark controller (Pali Rohár) * tag 'pci-v5.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: aardvark: Fix kernel panic during PIO transfer PCI: Add AMD RS690 quirk to enable 64-bit DMA PCI: Add ACS quirk for Broadcom BCM57414 NIC PCI: Mark AMD Navi14 GPU ATS as broken PCI: Work around Huawei Intelligent NIC VF FLR erratum PCI: Mark some NVIDIA GPUs to avoid bus reset PCI: Mark TI C667X to avoid bus reset PCI: tegra194: Fix MCFG quirk build regressions PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB
2021-06-18afs: Re-enable freezing once a page fault is interruptedMatthew Wilcox (Oracle)1-5/+8
If a task is killed during a page fault, it does not currently call sb_end_pagefault(), which means that the filesystem cannot be frozen at any time thereafter. This may be reported by lockdep like this: ==================================== WARNING: fsstress/10757 still has locks held! 5.13.0-rc4-build4+ #91 Not tainted ------------------------------------ 1 lock held by fsstress/10757: #0: ffff888104eac530 ( sb_pagefaults as filesystem freezing is modelled as a lock. Fix this by removing all the direct returns from within the function, and using 'ret' to indicate whether we were interrupted or successful. Fixes: 1cf7a1518aef ("afs: Implement shared-writeable mmap") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20210616154900.1958373-1-willy@infradead.org/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-18net: ethernet: fix potential use-after-free in ec_bhf_removePavel Skripkin1-1/+3
static void ec_bhf_remove(struct pci_dev *dev) { ... struct ec_bhf_priv *priv = netdev_priv(net_dev); unregister_netdev(net_dev); free_netdev(net_dev); pci_iounmap(dev, priv->dma_io); pci_iounmap(dev, priv->io); ... } priv is netdev private data, but it is used after free_netdev(). It can cause use-after-free when accessing priv pointer. So, fix it by moving free_netdev() after pci_iounmap() calls. Fixes: 6af55ff52b02 ("Driver for Beckhoff CX5020 EtherCAT master module.") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>