Age | Commit message (Collapse) | Author | Files | Lines |
|
Some drivers clear the 'ethtool_link_ksettings' struct in their
get_link_ksettings() callback, before populating it with actual values.
Such drivers will set the new 'link_mode' field to zero, resulting in
user space receiving wrong link mode information given that zero is a
valid value for the field.
Another problem is that some drivers (notably tun) can report random
values in the 'link_mode' field. This can result in a general protection
fault when the field is used as an index to the 'link_mode_params' array
[1].
This happens because such drivers implement their set_link_ksettings()
callback by simply overwriting their private copy of
'ethtool_link_ksettings' struct with the one they get from the stack,
which is not always properly initialized.
Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings'
and instead have drivers call ethtool_params_from_link_mode() with the
current link mode. The function will derive the link parameters (e.g.,
speed) from the link mode and fill them in the 'ethtool_link_ksettings'
struct.
v3:
* Remove link_mode parameter and derive the link parameters in
the driver instead of passing link_mode parameter to ethtool
and derive it there.
v2:
* Introduce 'cap_link_mode_supported' instead of adding a
validity field to 'ethtool_link_ksettings' struct.
[1]
general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN
KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967]
CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446
Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03
+38 d0 7c 08 84 d2 0f 85 b9
RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000
RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960
RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f
R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000
R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210
FS: 0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37
ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586
ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656
ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620
dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842
dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440
sock_do_ioctl+0x148/0x2d0 net/socket.c:1060
sock_ioctl+0x477/0x6a0 net/socket.c:1177
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2021-04-06
This series provides some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix remaining issues with kdoc in the ethtool headers.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a note on expected handling of reserved fields,
and references to all kdocs. This fixes a bunch
of kdoc warnings.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extended link state structures and enums use kdoc headers
but then do not describe any of the members.
Convert to normal comments.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add reserved mapping to cover all the register in order to avoid setting
arbitrary values to newer FW which implements the reserved fields.
Fixes: 50b4a3c23646 ("net/mlx5: PPTB and PBMC register firmware command support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add reserved mapping to cover all the register in order to avoid
setting arbitrary values to newer FW which implements the reserved
fields.
Fixes: a58837f52d43 ("net/mlx5e: Expose FEC feilds and related capability bit")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The cited commit wrongly placed log_max_flow_counter field of
mlx5_ifc_flow_table_prop_layout_bits, align it to the HW spec intended
placement.
Fixes: 16f1c5bb3ed7 ("net/mlx5: Check device capability for maximum flow counters")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought a ~10% performance drop.
The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.
It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.
This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.
Fix is to pull only the ethernet header in page_to_skb()
Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.
This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.
Many thanks to Xuan Zhuo for his report, and his tests/help.
Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2021-04-01
The following pull-request contains BPF updates for your *net* tree.
We've added 11 non-merge commits during the last 8 day(s) which contain
a total of 10 files changed, 151 insertions(+), 26 deletions(-).
The main changes are:
1) xsk creation fixes, from Ciara.
2) bpf_get_task_stack fix, from Dave.
3) trampoline in modules fix, from Jiri.
4) bpf_obj_get fix for links and progs, from Lorenz.
5) struct_ops progs must be gpl compatible fix, from Toke.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit f211ac154577ec9ccf07c15f18a6abf0d9bdb4ab.
We had similar attempt in the past, and we reverted it.
History:
64a146513f8f12ba204b7bf5cb7e9505594ead42 [NET]: Revert incorrect accept queue backlog changes.
8488df894d05d6fa41c2bd298c335f944bb0e401 [NET]: Fix bugs in "Whether sock accept queue is full" checking
I am adding a fat comment so that future attempts will
be much harder.
Fixes: f211ac154577 ("net: correct sk_acceptq_is_full()")
Cc: iuyacan <yacanliu@163.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2021-03-31
1) Fix ipv4 pmtu checks for xfrm anf vti interfaces.
From Eyal Birger.
2) There are situations where the socket passed to
xfrm_output_resume() is not the same as the one
attached to the skb. Use the socket passed to
xfrm_output_resume() to avoid lookup failures
when xfrm is used with VRFs.
From Evan Nimmo.
3) Make the xfrm_state_hash_generation sequence counter per
network namespace because but its write serialization
lock is also per network namespace. Write protection
is insufficient otherwise.
From Ahmed S. Darwish.
4) Fixup sctp featue flags when used with esp offload.
From Xin Long.
5) xfrm BEET mode doesn't support fragments for inner packets.
This is a limitation of the protocol, so no fix possible.
Warn at least to notify the user about that situation.
From Xin Long.
6) Fix NULL pointer dereference on policy lookup when
namespaces are uses in combination with esp offload.
7) Fix incorrect transformation on esp offload when
packets get segmented at layer 3.
8) Fix some user triggered usages of WARN_ONCE in
the xfrm compat layer.
From Dmitry Safonov.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 924a9bc362a5 ("net: check if protocol extracted by virtio_net_hdr_set_proto is correct")
added a call to dev_parse_header_protocol() but mac_header is not yet set.
This means that eth_hdr() reads complete garbage, and syzbot complained about it [1]
This patch resets mac_header earlier, to get more coverage about this change.
Audit of virtio_net_hdr_to_skb() callers shows that this change should be safe.
[1]
BUG: KASAN: use-after-free in eth_header_parse_protocol+0xdc/0xe0 net/ethernet/eth.c:282
Read of size 2 at addr ffff888017a6200b by task syz-executor313/8409
CPU: 1 PID: 8409 Comm: syz-executor313 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x141/0x1d7 lib/dump_stack.c:120
print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
__kasan_report mm/kasan/report.c:399 [inline]
kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
eth_header_parse_protocol+0xdc/0xe0 net/ethernet/eth.c:282
dev_parse_header_protocol include/linux/netdevice.h:3177 [inline]
virtio_net_hdr_to_skb.constprop.0+0x99d/0xcd0 include/linux/virtio_net.h:83
packet_snd net/packet/af_packet.c:2994 [inline]
packet_sendmsg+0x2325/0x52b0 net/packet/af_packet.c:3031
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
sock_no_sendpage+0xf3/0x130 net/core/sock.c:2860
kernel_sendpage.part.0+0x1ab/0x350 net/socket.c:3631
kernel_sendpage net/socket.c:3628 [inline]
sock_sendpage+0xe5/0x140 net/socket.c:947
pipe_to_sendpage+0x2ad/0x380 fs/splice.c:364
splice_from_pipe_feed fs/splice.c:418 [inline]
__splice_from_pipe+0x43e/0x8a0 fs/splice.c:562
splice_from_pipe fs/splice.c:597 [inline]
generic_splice_sendpage+0xd4/0x140 fs/splice.c:746
do_splice_from fs/splice.c:767 [inline]
do_splice+0xb7e/0x1940 fs/splice.c:1079
__do_splice+0x134/0x250 fs/splice.c:1144
__do_sys_splice fs/splice.c:1350 [inline]
__se_sys_splice fs/splice.c:1332 [inline]
__x64_sys_splice+0x198/0x250 fs/splice.c:1332
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
Fixes: 924a9bc362a5 ("net: check if protocol extracted by virtio_net_hdr_set_proto is correct")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Balazs Nemeth <bnemeth@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the mentioned helper can end-up freeing the socket wmem
without waking-up any processes waiting for more write memory.
If the partially orphaned skb is attached to an UDP (or raw) socket,
the lack of wake-up can hang the user-space.
Even for TCP sockets not calling the sk destructor could have bad
effects on TSQ.
Address the issue using skb_orphan to release the sk wmem before
setting the new sock_efree destructor. Additionally bundle the
whole ownership update in a new helper, so that later other
potential users could avoid duplicate code.
v1 -> v2:
- use skb_orphan() instead of sort of open coding it (Eric)
- provide an helper for the ownership change (Eric)
Fixes: f6ba8d33cfbb ("netem: fix skb_orphan_partial()")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit ea7800565a12 ("can: add optional DLC element to Classical
CAN frame structure") the struct can_frame::can_dlc was put into an
anonymous union with another u8 variable.
For various reasons some members in struct can_frame and canfd_frame
including the first 8 byes of data are expected to have the same
memory layout. This is enforced by a BUILD_BUG_ON check in af_can.c.
Since the above mentioned commit this check fails on ARM kernels
compiled with the ARM OABI (which means CONFIG_AEABI not set). In this
case -mabi=apcs-gnu is passed to the compiler, which leads to a
structure size boundary of 32, instead of 8 compared to CONFIG_AEABI
enabled. This means the the union in struct can_frame takes 4 bytes
instead of the expected 1.
Rong Chen illustrates the problem with pahole in the ARM OABI case:
| struct can_frame {
| canid_t can_id; /* 0 4 */
| union {
| __u8 len; /* 4 1 */
| __u8 can_dlc; /* 4 1 */
| }; /* 4 4 */
| __u8 __pad; /* 8 1 */
| __u8 __res0; /* 9 1 */
| __u8 len8_dlc; /* 10 1 */
|
| /* XXX 5 bytes hole, try to pack */
|
| __u8 data[8]
| __attribute__((__aligned__(8))); /* 16 8 */
|
| /* size: 24, cachelines: 1, members: 6 */
| /* sum members: 19, holes: 1, sum holes: 5 */
| /* forced alignments: 1, forced holes: 1, sum forced holes: 5 */
| /* last cacheline: 24 bytes */
| } __attribute__((__aligned__(8)));
Marking the anonymous union as __attribute__((packed)) fixes the
BUILD_BUG_ON problem on these compilers.
Fixes: ea7800565a12 ("can: add optional DLC element to Classical CAN frame structure")
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Rong Chen <rong.a.chen@intel.com>
Link: https://lore.kernel.org/linux-can/2c82ec23-3551-61b5-1bd8-178c3407ee83@hartkopp.net/
Link: https://lore.kernel.org/r/20210325125850.1620-3-socketcan@hartkopp.net
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Currently module can be unloaded even if there's a trampoline
register in it. It's easily reproduced by running in parallel:
# while :; do ./test_progs -t module_attach; done
# while :; do rmmod bpf_testmod; sleep 0.5; done
Taking the module reference in case the trampoline's ip is
within the module code. Releasing it when the trampoline's
ip is unregistered.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210326105900.151466-1-jolsa@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-03-25
This series contains updates to virtchnl header file and i40e driver.
Norbert removes added padding from virtchnl RSS structures as this
causes issues when iterating over the arrays.
Mateusz adds Asym_Pause as supported to allow these settings to be set
as the hardware supports it.
Eryk fixes an issue where encountering a VF reset alongside releasing
VFs could cause a call trace.
Arkadiusz moves TC setup before resource setup as previously it was
possible to enter with a null q_vector causing a kernel oops.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This fixes following syzbot report:
UBSAN: shift-out-of-bounds in ./include/net/red.h:237:23
shift exponent 32 is too large for 32-bit type 'unsigned int'
CPU: 1 PID: 8418 Comm: syz-executor170 Not tainted 5.12.0-rc4-next-20210324-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x141/0x1d7 lib/dump_stack.c:120
ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
__ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327
red_set_parms include/net/red.h:237 [inline]
choke_change.cold+0x3c/0xc8 net/sched/sch_choke.c:414
qdisc_create+0x475/0x12f0 net/sched/sch_api.c:1247
tc_modify_qdisc+0x4c8/0x1a50 net/sched/sch_api.c:1663
rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
___sys_sendmsg+0xf3/0x170 net/socket.c:2404
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x43f039
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffdfa725168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043f039
RDX: 0000000000000000 RSI: 0000000020000040 RDI: 0000000000000004
RBP: 0000000000403020 R08: 0000000000400488 R09: 0000000000400488
R10: 0000000000400488 R11: 0000000000000246 R12: 00000000004030b0
R13: 0000000000000000 R14: 00000000004ac018 R15: 0000000000400488
Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Merge misc fixes from Andrew Morton:
"14 patches.
Subsystems affected by this patch series: mm (hugetlb, kasan, gup,
selftests, z3fold, kfence, memblock, and highmem), squashfs, ia64,
gcov, and mailmap"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mailmap: update Andrey Konovalov's email address
mm/highmem: fix CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP
mm: memblock: fix section mismatch warning again
kfence: make compatible with kmemleak
gcov: fix clang-11+ support
ia64: fix format strings for err_inject
ia64: mca: allocate early mca with GFP_ATOMIC
squashfs: fix xattr id and id lookup sanity checks
squashfs: fix inode lookup sanity checks
z3fold: prevent reclaim/free race for headless pages
selftests/vm: fix out-of-tree build
mm/mmu_notifiers: ensure range_end() is paired with range_start()
kasan: fix per-page tags for non-page_alloc pages
hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings
|
|
Remove padding from RSS structures. Previous layout
could lead to unwanted compiler optimizations
in loops when iterating over key and lut arrays.
Fixes: 65ece6de0114 ("virtchnl: Add missing explicit padding to structures")
Signed-off-by: Norbert Ciosek <norbertx.ciosek@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Commit 34dc2efb39a2 ("memblock: fix section mismatch warning") marked
memblock_bottom_up() and memblock_set_bottom_up() as __init, but they
could be referenced from non-init functions like
memblock_find_in_range_node() on architectures that enable
CONFIG_ARCH_KEEP_MEMBLOCK.
For such builds kernel test robot reports:
WARNING: modpost: vmlinux.o(.text+0x74fea4): Section mismatch in reference from the function memblock_find_in_range_node() to the function .init.text:memblock_bottom_up()
The function memblock_find_in_range_node() references the function __init memblock_bottom_up().
This is often because memblock_find_in_range_node lacks a __init annotation or the annotation of memblock_bottom_up is wrong.
Replace __init annotations with __init_memblock annotations so that the
appropriate section will be selected depending on
CONFIG_ARCH_KEEP_MEMBLOCK.
Link: https://lore.kernel.org/lkml/202103160133.UzhgY0wt-lkp@intel.com
Link: https://lkml.kernel.org/r/20210316171347.14084-1-rppt@kernel.org
Fixes: 34dc2efb39a2 ("memblock: fix section mismatch warning")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If one or more notifiers fails .invalidate_range_start(), invoke
.invalidate_range_end() for "all" notifiers. If there are multiple
notifiers, those that did not fail are expecting _start() and _end() to
be paired, e.g. KVM's mmu_notifier_count would become imbalanced.
Disallow notifiers that can fail _start() from implementing _end() so
that it's unnecessary to either track which notifiers rejected _start(),
or had already succeeded prior to a failed _start().
Note, the existing behavior of calling _start() on all notifiers even
after a previous notifier failed _start() was an unintented "feature".
Make it canon now that the behavior is depended on for correctness.
As of today, the bug is likely benign:
1. The only caller of the non-blocking notifier is OOM kill.
2. The only notifiers that can fail _start() are the i915 and Nouveau
drivers.
3. The only notifiers that utilize _end() are the SGI UV GRU driver
and KVM.
4. The GRU driver will never coincide with the i195/Nouveau drivers.
5. An imbalanced kvm->mmu_notifier_count only causes soft lockup in the
_guest_, and the guest is already doomed due to being an OOM victim.
Fix the bug now to play nice with future usage, e.g. KVM has a
potential use case for blocking memslot updates in KVM while an
invalidation is in-progress, and failure to unblock would result in said
updates being blocked indefinitely and hanging.
Found by inspection. Verified by adding a second notifier in KVM that
periodically returns -EAGAIN on non-blockable ranges, triggering OOM,
and observing that KVM exits with an elevated notifier count.
Link: https://lkml.kernel.org/r/20210311180057.1582638-1-seanjc@google.com
Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Ben Gardon <bgardon@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
To allow performing tag checks on page_alloc addresses obtained via
page_address(), tag-based KASAN modes store tags for page_alloc
allocations in page->flags.
Currently, the default tag value stored in page->flags is 0x00.
Therefore, page_address() returns a 0x00ffff... address for pages that
were not allocated via page_alloc.
This might cause problems. A particular case we encountered is a
conflict with KFENCE. If a KFENCE-allocated slab object is being freed
via kfree(page_address(page) + offset), the address passed to kfree()
will get tagged with 0x00 (as slab pages keep the default per-page
tags). This leads to is_kfence_address() check failing, and a KFENCE
object ending up in normal slab freelist, which causes memory
corruptions.
This patch changes the way KASAN stores tag in page-flags: they are now
stored xor'ed with 0xff. This way, KASAN doesn't need to initialize
per-page flags for every created page, which might be slow.
With this change, page_address() returns natively-tagged (with 0xff)
pointers for pages that didn't have tags set explicitly.
This patch fixes the encountered conflict with KFENCE and prevents more
similar issues that can occur in the future.
Link: https://lkml.kernel.org/r/1a41abb11c51b264511d9e71c303bb16d5cb367b.1615475452.git.andreyknvl@google.com
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The current implementation of hugetlb_cgroup for shared mappings could
have different behavior. Consider the following two scenarios:
1.Assume initial css reference count of hugetlb_cgroup is 1:
1.1 Call hugetlb_reserve_pages with from = 1, to = 2. So css reference
count is 2 associated with 1 file_region.
1.2 Call hugetlb_reserve_pages with from = 2, to = 3. So css reference
count is 3 associated with 2 file_region.
1.3 coalesce_file_region will coalesce these two file_regions into
one. So css reference count is 3 associated with 1 file_region
now.
2.Assume initial css reference count of hugetlb_cgroup is 1 again:
2.1 Call hugetlb_reserve_pages with from = 1, to = 3. So css reference
count is 2 associated with 1 file_region.
Therefore, we might have one file_region while holding one or more css
reference counts. This inconsistency could lead to imbalanced css_get()
and css_put() pair. If we do css_put one by one (i.g. hole punch case),
scenario 2 would put one more css reference. If we do css_put all
together (i.g. truncate case), scenario 1 will leak one css reference.
The imbalanced css_get() and css_put() pair would result in a non-zero
reference when we try to destroy the hugetlb cgroup. The hugetlb cgroup
directory is removed __but__ associated resource is not freed. This
might result in OOM or can not create a new hugetlb cgroup in a busy
workload ultimately.
In order to fix this, we have to make sure that one file_region must
hold exactly one css reference. So in coalesce_file_region case, we
should release one css reference before coalescence. Also only put css
reference when the entire file_region is removed.
The last thing to note is that the caller of region_add() will only hold
one reference to h_cg->css for the whole contiguous reservation region.
But this area might be scattered when there are already some
file_regions reside in it. As a result, many file_regions may share only
one h_cg->css reference. In order to ensure that one file_region must
hold exactly one css reference, we should do css_get() for each
file_region and release the reference held by caller when they are done.
[linmiaohe@huawei.com: fix imbalanced css_get and css_put pair for shared mappings]
Link: https://lkml.kernel.org/r/20210316023002.53921-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20210301120540.37076-1-linmiaohe@huawei.com
Fixes: 075a61d07a8e ("hugetlb_cgroup: add accounting for shared mappings")
Reported-by: kernel test robot <lkp@intel.com> (auto build test ERROR)
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Wanpeng Li <liwp.linux@gmail.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull networking fixes from David Miller:
"Various fixes, all over:
1) Fix overflow in ptp_qoriq_adjfine(), from Yangbo Lu.
2) Always store the rx queue mapping in veth, from Maciej
Fijalkowski.
3) Don't allow vmlinux btf in map_create, from Alexei Starovoitov.
4) Fix memory leak in octeontx2-af from Colin Ian King.
5) Use kvalloc in bpf x86 JIT for storing jit'd addresses, from
Yonghong Song.
6) Fix tx ptp stats in mlx5, from Aya Levin.
7) Check correct ip version in tun decap, fropm Roi Dayan.
8) Fix rate calculation in mlx5 E-Switch code, from arav Pandit.
9) Work item memork leak in mlx5, from Shay Drory.
10) Fix ip6ip6 tunnel crash with bpf, from Daniel Borkmann.
11) Lack of preemptrion awareness in macvlan, from Eric Dumazet.
12) Fix data race in pxa168_eth, from Pavel Andrianov.
13) Range validate stab in red_check_params(), from Eric Dumazet.
14) Inherit vlan filtering setting properly in b53 driver, from
Florian Fainelli.
15) Fix rtnl locking in igc driver, from Sasha Neftin.
16) Pause handling fixes in igc driver, from Muhammad Husaini
Zulkifli.
17) Missing rtnl locking in e1000_reset_task, from Vitaly Lifshits.
18) Use after free in qlcnic, from Lv Yunlong.
19) fix crash in fritzpci mISDN, from Tong Zhang.
20) Premature rx buffer reuse in igb, from Li RongQing.
21) Missing termination of ip[a driver message handler arrays, from
Alex Elder.
22) Fix race between "x25_close" and "x25_xmit"/"x25_rx" in hdlc_x25
driver, from Xie He.
23) Use after free in c_can_pci_remove(), from Tong Zhang.
24) Uninitialized variable use in nl80211, from Jarod Wilson.
25) Off by one size calc in bpf verifier, from Piotr Krysiuk.
26) Use delayed work instead of deferrable for flowtable GC, from
Yinjun Zhang.
27) Fix infinite loop in NPC unmap of octeontx2 driver, from
Hariprasad Kelam.
28) Fix being unable to change MTU of dwmac-sun8i devices due to lack
of fifo sizes, from Corentin Labbe.
29) DMA use after free in r8169 with WoL, fom Heiner Kallweit.
30) Mismatched prototypes in isdn-capi, from Arnd Bergmann.
31) Fix psample UAPI breakage, from Ido Schimmel"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (171 commits)
psample: Fix user API breakage
math: Export mul_u64_u64_div_u64
ch_ktls: fix enum-conversion warning
octeontx2-af: Fix memory leak of object buf
ptp_qoriq: fix overflow in ptp_qoriq_adjfine() u64 calcalation
net: bridge: don't notify switchdev for local FDB addresses
net/sched: act_ct: clear post_ct if doing ct_clear
net: dsa: don't assign an error value to tag_ops
isdn: capi: fix mismatched prototypes
net/mlx5: SF, do not use ecpu bit for vhca state processing
net/mlx5e: Fix division by 0 in mlx5e_select_queue
net/mlx5e: Fix error path for ethtool set-priv-flag
net/mlx5e: Offload tuple rewrite for non-CT flows
net/mlx5e: Allow to match on MPLS parameters only for MPLS over UDP
net/mlx5: Add back multicast stats for uplink representor
net: ipconfig: ic_dev can be NULL in ic_close_devs
MAINTAINERS: Combine "QLOGIC QLGE 10Gb ETHERNET DRIVER" sections into one
docs: networking: Fix a typo
r8169: fix DMA being used after buffer free if WoL is enabled
net: ipa: fix init header command validation
...
|
|
Cited commit added a new attribute before the existing group reference
count attribute, thereby changing its value and breaking existing
applications on new kernels.
Before:
# psample -l
libpsample ERROR psample_group_foreach: failed to recv message: Operation not supported
After:
# psample -l
Group Num Refcount Group Seq
1 1 0
Fix by restoring the value of the old attribute and remove the
misleading comments from the enumerator to avoid future bugs.
Cc: stable@vger.kernel.org
Fixes: d8bed686ab96 ("net: psample: Add tunnel support")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reported-by: Adiel Bidani <adielb@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When xfrm interfaces are used in combination with namespaces
and ESP offload, we get a dst_entry NULL pointer dereference.
This is because we don't have a dst_entry attached in the ESP
offloading case and we need to do a policy lookup before the
namespace transition.
Fix this by expicit checking of skb_dst(skb) before accessing it.
Fixes: f203b76d78092 ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This is the killable version of wait_on_page_writeback.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: kafs-testing@auristor.com
cc: linux-afs@lists.infradead.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20210320054104.1300774-3-willy@infradead.org
|
|
Cachefiles was relying on wait_page_key and wait_bit_key being the
same layout, which is fragile. Now that wait_page_key is exposed in
the pagemap.h header, we can remove that fragility
A comment on the need to maintain structure layout equivalence was added by
Linus[1] and that is no longer applicable.
Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: kafs-testing@auristor.com
cc: linux-cachefs@redhat.com
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20210320054104.1300774-2-willy@infradead.org/
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3510ca20ece0150af6b10c77a74ff1b5c198e3e2 [1]
|
|
A sequence counter write section must be serialized or its internal
state can get corrupted. A plain seqcount_t does not contain the
information of which lock must be held to guaranteee write side
serialization.
For xfrm_state_hash_generation, use seqcount_spinlock_t instead of plain
seqcount_t. This allows to associate the spinlock used for write
serialization with the sequence counter. It thus enables lockdep to
verify that the write serialization lock is indeed held before entering
the sequence counter write section.
If lockdep is disabled, this lock association is compiled out and has
neither storage size nor runtime overhead.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
A sequence counter write section must be serialized or its internal
state can get corrupted. The "xfrm_state_hash_generation" seqcount is
global, but its write serialization lock (net->xfrm.xfrm_state_lock) is
instantiated per network namespace. The write protection is thus
insufficient.
To provide full protection, localize the sequence counter per network
namespace instead. This should be safe as both the seqcount read and
write sections access data exclusively within the network namespace. It
also lays the foundation for transforming "xfrm_state_hash_generation"
data type from seqcount_t to seqcount_LOCKNAME_t in further commits.
Fixes: b65e3d7be06f ("xfrm: state: add sequence count to detect hash resizes")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB and Thunderbolt driver fixes from Greg KH:
"Here are some small Thunderbolt and USB driver fixes for some reported
issues:
- thunderbolt fixes for minor problems
- typec fixes for power issues
- usb-storage quirk addition
- usbip bugfix
- dwc3 bugfix when stopping transfers
- cdnsp bugfix for isoc transfers
- gadget use-after-free fix
All have been in linux-next this week with no reported issues"
* tag 'usb-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: typec: tcpm: Skip sink_cap query only when VDM sm is busy
usb: dwc3: gadget: Prevent EP queuing while stopping transfers
usb: typec: tcpm: Invoke power_supply_changed for tcpm-source-psy-
usb: typec: Remove vdo[3] part of tps6598x_rx_identity_reg struct
usb-storage: Add quirk to defeat Kindle's automatic unload
usb: gadget: configfs: Fix KASAN use-after-free
usbip: Fix incorrect double assignment to udc->ud.tcp_rx
usb: cdnsp: Fixes incorrect value in ISOC TRB
thunderbolt: Increase runtime PM reference count on DP tunnel discovery
thunderbolt: Initialize HopID IDAs in tb_switch_alloc()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
- Get static calls & modules right. Hopefully.
- WW mutex fixes
* tag 'locking-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
static_call: Fix static_call_update() sanity check
static_call: Align static_call_is_init() patching condition
static_call: Fix static_call_set_init()
locking/ww_mutex: Fix acquire/release imbalance in ww_acquire_init()/ww_acquire_fini()
locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
- another missing RT_PROP table related fix, to ensure that the
efivarfs pseudo filesystem fails gracefully if variable services
are unsupported
- use the correct alignment for literal EFI GUIDs
- fix a use after unmap issue in the memreserve code
* tag 'efi-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: use 32-bit alignment for efi_guid_t literals
firmware/efi: Fix a use after bug in efi_mem_reserve_persistent
efivars: respect EFI_UNSUPPORTED return from firmware
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
"The freshest pile of shiny x86 fixes for 5.12:
- Add the arch-specific mapping between physical and logical CPUs to
fix devicetree-node lookups
- Restore the IRQ2 ignore logic
- Fix get_nr_restart_syscall() to return the correct restart syscall
number. Split in a 4-patches set to avoid kABI breakage when
backporting to dead kernels"
* tag 'x86_urgent_for_v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic/of: Fix CPU devicetree-node lookups
x86/ioapic: Ignore IRQ2 again
x86: Introduce restart_block->arch_data to remove TS_COMPAT_RESTART
x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall()
x86: Move TS_COMPAT back to asm/thread_info.h
kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2021-03-20
The following pull-request contains BPF updates for your *net* tree.
We've added 5 non-merge commits during the last 3 day(s) which contain
a total of 8 files changed, 155 insertions(+), 12 deletions(-).
The main changes are:
1) Use correct nops in fexit trampoline, from Stanislav.
2) Fix BTF dump, from Jean-Philippe.
3) Fix umd memory leak, from Zqiang.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull io_uring fixes from Jens Axboe:
"Quieter week this time, which was both expected and desired. About
half of the below is fixes for this release, the other half are just
fixes in general. In detail:
- Fix the freezing of IO threads, by making the freezer not send them
fake signals. Make them freezable by default.
- Like we did for personalities, move the buffer IDR to xarray. Kills
some code and avoids a use-after-free on teardown.
- SQPOLL cleanups and fixes (Pavel)
- Fix linked timeout race (Pavel)
- Fix potential completion post use-after-free (Pavel)
- Cleanup and move internal structures outside of general kernel view
(Stefan)
- Use MSG_SIGNAL for send/recv from io_uring (Stefan)"
* tag 'io_uring-5.12-2021-03-19' of git://git.kernel.dk/linux-block:
io_uring: don't leak creds on SQO attach error
io_uring: use typesafe pointers in io_uring_task
io_uring: remove structures from include/linux/io_uring.h
io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls
io_uring: fix sqpoll cancellation via task_work
io_uring: add generic callback_head helpers
io_uring: fix concurrent parking
io_uring: halt SQO submission on ctx exit
io_uring: replace sqd rw_semaphore with mutex
io_uring: fix complete_post use ctx after free
io_uring: fix ->flags races by linked timeouts
io_uring: convert io_buffer_idr to XArray
io_uring: allow IO worker threads to be frozen
kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing
|
|
The syzbot reported a memleak as follows:
BUG: memory leak
unreferenced object 0xffff888101b41d00 (size 120):
comm "kworker/u4:0", pid 8, jiffies 4294944270 (age 12.780s)
backtrace:
[<ffffffff8125dc56>] alloc_pid+0x66/0x560
[<ffffffff81226405>] copy_process+0x1465/0x25e0
[<ffffffff81227943>] kernel_clone+0xf3/0x670
[<ffffffff812281a1>] kernel_thread+0x61/0x80
[<ffffffff81253464>] call_usermodehelper_exec_work
[<ffffffff81253464>] call_usermodehelper_exec_work+0xc4/0x120
[<ffffffff812591c9>] process_one_work+0x2c9/0x600
[<ffffffff81259ab9>] worker_thread+0x59/0x5d0
[<ffffffff812611c8>] kthread+0x178/0x1b0
[<ffffffff8100227f>] ret_from_fork+0x1f/0x30
unreferenced object 0xffff888110ef5c00 (size 232):
comm "kworker/u4:0", pid 8414, jiffies 4294944270 (age 12.780s)
backtrace:
[<ffffffff8154a0cf>] kmem_cache_zalloc
[<ffffffff8154a0cf>] __alloc_file+0x1f/0xf0
[<ffffffff8154a809>] alloc_empty_file+0x69/0x120
[<ffffffff8154a8f3>] alloc_file+0x33/0x1b0
[<ffffffff8154ab22>] alloc_file_pseudo+0xb2/0x140
[<ffffffff81559218>] create_pipe_files+0x138/0x2e0
[<ffffffff8126c793>] umd_setup+0x33/0x220
[<ffffffff81253574>] call_usermodehelper_exec_async+0xb4/0x1b0
[<ffffffff8100227f>] ret_from_fork+0x1f/0x30
After the UMD process exits, the pipe_to_umh/pipe_from_umh and
tgid need to be released.
Fixes: d71fa5c9763c ("bpf: Add kernel module with user mode driver that populates bpffs.")
Reported-by: syzbot+44908bb56d2bfe56b28e@syzkaller.appspotmail.com
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210317030915.2865-1-qiang.zhang@windriver.com
|
|
s/recalcultion/recalculation/
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull workqueue tracing fix from Steven Rostedt:
"Fix workqueue trace event unsafe string reference
After adding a verifier to test all strings printed in trace events to
make sure they either point to a string on the ring buffer, or to read
only core kernel memory, it triggered on a workqueue trace event. The
trace event workqueue_queue_work references the allocated name of the
workqueue in the output. If the workqueue is freed before the trace is
read, then the trace will dereference freed memory.
Update the trace event to use the __string(), __assign_str(), and
__get_str() helpers to handle such cases"
* tag 'trace-v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
workqueue/tracing: Copy workqueue name to buffer in trace event
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/urgent
Pull EFI fixes from Ard Biesheuvel:
"- another missing RT_PROP table related fix, to ensure that the efivarfs
pseudo filesystem fails gracefully if variable services are unsupported
- use the correct alignment for literal EFI GUIDs
- fix a use after unmap issue in the memreserve code"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Commit 494c704f9af0 ("efi: Use 32-bit alignment for efi_guid_t") updated
the type definition of efi_guid_t to ensure that it always appears
sufficiently aligned (the UEFI spec is ambiguous about this, but given
the fact that its EFI_GUID type is defined in terms of a struct carrying
a uint32_t, the natural alignment is definitely >= 32 bits).
However, we missed the EFI_GUID() macro which is used to instantiate
efi_guid_t literals: that macro is still based on the guid_t type,
which does not have a minimum alignment at all. This results in warnings
such as
In file included from drivers/firmware/efi/mokvar-table.c:35:
include/linux/efi.h:1093:34: warning: passing 1-byte aligned argument to
4-byte aligned parameter 2 of 'get_var' may result in an unaligned pointer
access [-Walign-mismatch]
status = get_var(L"SecureBoot", &EFI_GLOBAL_VARIABLE_GUID, NULL, &size,
^
include/linux/efi.h:1101:24: warning: passing 1-byte aligned argument to
4-byte aligned parameter 2 of 'get_var' may result in an unaligned pointer
access [-Walign-mismatch]
get_var(L"SetupMode", &EFI_GLOBAL_VARIABLE_GUID, NULL, &size, &setupmode);
The distinction only matters on CPUs that do not support misaligned loads
fully, but 32-bit ARM's load-multiple instructions fall into that category,
and these are likely to be emitted by the compiler that built the firmware
for loading word-aligned 128-bit GUIDs from memory
So re-implement the initializer in terms of our own efi_guid_t type, so that
the alignment becomes a property of the literal's type.
Fixes: 494c704f9af0 ("efi: Use 32-bit alignment for efi_guid_t")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/1327
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Several patches to testore use of memory barriers instead of RCU to
ensure consistent access to ruleset, from Mark Tomlinson.
2) Fix dump of expectation via ctnetlink, from Florian Westphal.
3) GRE helper works for IPv6, from Ludovic Senecaux.
4) Set error on unsupported flowtable flags.
5) Use delayed instead of deferrable workqueue in the flowtable,
from Yinjun Zhang.
6) Fix spurious EEXIST in case of add-after-delete flowtable in
the same batch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull drm fixes from Dave Airlie:
"Regular fixes pull, pretty small set of fixes, a couple of i915 and
amdgpu, one ttm, one nouveau and one omap. Probably smaller than usual
for this time, so we'll see if something pops up next week or if this
will continue to stay small.
Summary:
ttm:
- Make ttm_bo_unpin() not wraparound on too many unpins
omap:
- Fix coccicheck warning in omap
amdgpu:
- DCN 3.0 gamma fixes
- DCN 2.1 corrupt screen fix
i915:
- Workaround async flip + VT-d frame corruption on HSW/BDW
- Fix NMI watchdog crash due to uninitialized OA buffer use on gen12+
nouveau:
- workaround oops with bo syncing"
* tag 'drm-fixes-2021-03-19' of git://anongit.freedesktop.org/drm/drm:
nouveau: Skip unvailable ttm page entries
drm/amd/display: Remove MPC gamut remap logic for DCN30
drm/amd/display: Correct algorithm for reversed gamma
drm/omap: dsi: fix unsigned expression compared with zero
i915/perf: Start hrtimer only if sampling the OA buffer
drm/i915: Workaround async flip + VT-d corruption on HSW/BDW
drm/amd/display: Copy over soc values before bounding box creation
drm/ttm: make ttm_bo_unpin more defensive
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.12-rc4:
- Make ttm_bo_unpin() not wraparound on too many unpins.
- Fix coccicheck warning in omap.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a0e13bbb-6ba6-ff24-4db8-0e02e605de18@linux.intel.com
|
|
Pull VFIO fixes from Alex Williamson:
- Fix 32-bit issue with new unmap-all flag (Steve Sistare)
- Various Kconfig changes for better coverage (Jason Gunthorpe)
- Fix to batch pinning support (Daniel Jordan)
* tag 'vfio-v5.12-rc4' of git://github.com/awilliam/linux-vfio:
vfio/type1: fix vaddr_get_pfns() return in vfio_pin_page_external()
vfio: Depend on MMU
ARM: amba: Allow some ARM_AMBA users to compile with COMPILE_TEST
vfio-platform: Add COMPILE_TEST to VFIO_PLATFORM
vfio: IOMMU_API should be selected
vfio/type1: fix unmap all on ILP32
|
|
Pull virtio fixes from Michael Tsirkin:
"Some fixes and cleanups all over the place"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost-vdpa: set v->config_ctx to NULL if eventfd_ctx_fdget() fails
vhost-vdpa: fix use-after-free of v->config_ctx
vhost: Fix vhost_vq_reset()
vhost_vdpa: fix the missing irq_bypass_unregister_producer() invocation
vdpa_sim: Skip typecasting from void*
virtio: remove export for virtio_config_{enable, disable}
virtio-mmio: Use to_virtio_mmio_device() to simply code
vdpa: set the virtqueue num during register
|
|
The trace event "workqueue_queue_work" references an unsafe string in
dereferencing the name of the workqueue. As the name is allocated, it
could later be freed, and the pointer to that string could stay on the
tracing buffer. If the trace buffer is read after the string is freed, it
will reference an unsafe pointer.
I added a new verifier to make sure that all strings referenced in the
output of the trace buffer is safe to read and this triggered on the
workqueue_queue_work trace event:
workqueue_queue_work: work struct=00000000b2b235c7 function=gc_worker workqueue=(0xffff888100051160:events_power_efficient)[UNSAFE-MEMORY] req_cpu=256 cpu=1
workqueue_queue_work: work struct=00000000c344caec function=flush_to_ldisc workqueue=(0xffff888100054d60:events_unbound)[UNSAFE-MEMORY] req_cpu=256 cpu=4294967295
workqueue_queue_work: work struct=00000000b2b235c7 function=gc_worker workqueue=(0xffff888100051160:events_power_efficient)[UNSAFE-MEMORY] req_cpu=256 cpu=1
workqueue_queue_work: work struct=000000000b238b3f function=vmstat_update workqueue=(0xffff8881000c3760:mm_percpu_wq)[UNSAFE-MEMORY] req_cpu=1 cpu=1
Also, if this event is read via a user space application like perf or
trace-cmd, the name would only be an address and useless information:
workqueue_queue_work: work struct=0xffff953f80b4b918 function=disk_events_workfn workqueue=ffff953f8005d378 req_cpu=8192 cpu=5
Cc: Zqiang <qiang.zhang@windriver.com>
Cc: Tejun Heo <tj@kernel.org>
Fixes: 7bf9c4a88e3e3 ("workqueue: tracing the name of the workqueue instead of it's address")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Link: https://lore.kernel.org/r/8c1d14f3748105f4caeda01716d47af2fa41d11c.1615809009.git.metze@samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2021-03-18
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 4 day(s) which contain
a total of 14 files changed, 336 insertions(+), 94 deletions(-).
The main changes are:
1) Fix fexit/fmod_ret trampoline for sleepable programs, and also fix a ftrace
splat in modify_ftrace_direct() on address change, from Alexei Starovoitov.
2) Fix two oob speculation possibilities that allows unprivileged to leak mem
via side-channel, from Piotr Krysiuk and Daniel Borkmann.
3) Fix libbpf's netlink handling wrt SOCK_CLOEXEC, from Kumar Kartikeya Dwivedi.
4) Fix libbpf's error handling on failure in getting section names, from Namhyung Kim.
5) Fix tunnel collect_md BPF selftest wrt Geneve option handling, from Hangbin Liu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|