summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2023-01-04pinctrl: nomadik: Add missing header(s)Andy Shevchenko8-31/+54
Do not imply that some of the generic headers may be always included. Instead, include explicitly what we are direct user of. While at it, sort headers alphabetically. Fixes: e5530adc17a7 ("pinctrl: Clean up headers") Cc: Randy Dunlap <rdunlap@infradead.org> Reported-by: Arnd Bergmann <arnd@arndb.de> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20221229100746.35047-1-andriy.shevchenko@linux.intel.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-01-03dt-bindings: soundwire: qcom,soundwire: correct sizes related to number of portsKrzysztof Kozlowski1-5/+5
There are several properties depending on number of ports. Some of them had maximum limit of 5 and some of 8. SM8450 AudioReach comes with 8 ports, so fix the limits: sm8450-sony-xperia-nagara-pdx224.dtb: soundwire-controller@3250000: qcom,ports-word-length: 'oneOf' conditional failed, one must be fixed: [[255, 255, 255, 255, 255, 255, 255, 255]] is too short [255, 255, 255, 255, 255, 255, 255, 255] is too long Fixes: febc50b82bc9 ("dt-bindings: soundwire: Convert text bindings to DT Schema") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20221223132159.81211-1-krzysztof.kozlowski@linaro.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-01-03of/fdt: run soc memory setup when early_init_dt_scan_memory failsAndreas Rammhold2-3/+5
If memory has been found early_init_dt_scan_memory now returns 1. If it hasn't found any memory it will return 0, allowing other memory setup mechanisms to carry on. Previously early_init_dt_scan_memory always returned 0 without distinguishing between any kind of memory setup being done or not. Any code path after the early_init_dt_scan memory call in the ramips plat_mem_setup code wouldn't be executed anymore. Making early_init_dt_scan_memory the only way to initialize the memory. Some boards, including my mt7621 based Cudy X6 board, depend on memory initialization being done via the soc_info.mem_detect function pointer. Those wouldn't be able to obtain memory and panic the kernel during early bootup with the message "early_init_dt_alloc_memory_arch: Failed to allocate 12416 bytes align=0x40". Fixes: 1f012283e936 ("of/fdt: Rework early_init_dt_scan_memory() to call directly") Cc: stable@vger.kernel.org Signed-off-by: Andreas Rammhold <andreas@rammhold.de> Link: https://lore.kernel.org/r/20221223112748.2935235-1-andreas@rammhold.de Signed-off-by: Rob Herring <robh@kernel.org>
2023-01-03drm/amd/display: Uninitialized variables causing 4k60 UCLK to stay at DPM1 ↵Samson Tam1-3/+3
and not DPM0 [Why] SwathSizePerSurfaceY[] and SwathSizePerSurfaceC[] values are uninitialized because we are using += instead of = operator. [How] Assign values in loop with = operator. Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Samson Tam <samson.tam@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x, 6.1.x
2023-01-03drm/amdkfd: Fix kernel warning during topology setupMukul Joshi1-1/+1
This patch fixes the following kernel warning seen during driver load by correctly initializing the p2plink attr before creating the sysfs file: [ +0.002865] ------------[ cut here ]------------ [ +0.002327] kobject: '(null)' (0000000056260cfb): is not initialized, yet kobject_put() is being called. [ +0.004780] WARNING: CPU: 32 PID: 1006 at lib/kobject.c:718 kobject_put+0xaa/0x1c0 [ +0.001361] Call Trace: [ +0.001234] <TASK> [ +0.001067] kfd_remove_sysfs_node_entry+0x24a/0x2d0 [amdgpu] [ +0.003147] kfd_topology_update_sysfs+0x3d/0x750 [amdgpu] [ +0.002890] kfd_topology_add_device+0xbd7/0xc70 [amdgpu] [ +0.002844] ? lock_release+0x13c/0x2e0 [ +0.001936] ? smu_cmn_send_smc_msg_with_param+0x1e8/0x2d0 [amdgpu] [ +0.003313] ? amdgpu_dpm_get_mclk+0x54/0x60 [amdgpu] [ +0.002703] kgd2kfd_device_init.cold+0x39f/0x4ed [amdgpu] [ +0.002930] amdgpu_amdkfd_device_init+0x13d/0x1f0 [amdgpu] [ +0.002944] amdgpu_device_init.cold+0x1464/0x17b4 [amdgpu] [ +0.002970] ? pci_bus_read_config_word+0x43/0x80 [ +0.002380] amdgpu_driver_load_kms+0x15/0x100 [amdgpu] [ +0.002744] amdgpu_pci_probe+0x147/0x370 [amdgpu] [ +0.002522] local_pci_probe+0x40/0x80 [ +0.001896] work_for_cpu_fn+0x10/0x20 [ +0.001892] process_one_work+0x26e/0x5a0 [ +0.002029] worker_thread+0x1fd/0x3e0 [ +0.001890] ? process_one_work+0x5a0/0x5a0 [ +0.002115] kthread+0xea/0x110 [ +0.001618] ? kthread_complete_and_exit+0x20/0x20 [ +0.002422] ret_from_fork+0x1f/0x30 [ +0.001808] </TASK> [ +0.001103] irq event stamp: 59837 [ +0.001718] hardirqs last enabled at (59849): [<ffffffffb30fab12>] __up_console_sem+0x52/0x60 [ +0.004414] hardirqs last disabled at (59860): [<ffffffffb30faaf7>] __up_console_sem+0x37/0x60 [ +0.004414] softirqs last enabled at (59654): [<ffffffffb307d9c7>] irq_exit_rcu+0xd7/0x130 [ +0.004205] softirqs last disabled at (59649): [<ffffffffb307d9c7>] irq_exit_rcu+0xd7/0x130 [ +0.004203] ---[ end trace 0000000000000000 ]--- Fixes: 0f28cca87e9a ("drm/amdkfd: Extend KFD device topology to surface peer-to-peer links") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-01-03Merge tag 'drm-misc-next-fixes-2023-01-03' of ↵Daniel Vetter4-6/+8
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Maxime writes: "The drm-misc-next-fixes leftovers. It addresses a bug in drm/scheduler ending up causing a lockup, and reduces the stack usage of some drm/mm kunit tests." Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20230103144926.bmjjni3xnuis2jmq@houat
2023-01-03perf lock contention: Fix core dump related to not finding the ↵Thomas Richter1-0/+2
"__sched_text_end" symbol on s/390 The test case perf lock contention dumps core on s390. Run the following commands: # ./perf lock record -- ./perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 2.799 [sec] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ] # # ./perf lock contention Segmentation fault (core dumped) # The function call stack is lengthy, here are the top 5 functions: # gdb ./perf core.24048 GNU gdb (GDB) Fedora Linux 12.1-6.fc37 Core was generated by `./perf lock contention'. Program terminated with signal SIGSEGV, Segmentation fault. #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356 3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start); (gdb) where #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356 #1 0x000000000109f244 in callchain_id (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:957 #2 0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290, addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:586 #3 0x000000000109f4d0 in report_lock_contention_begin_event (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1004 #4 0x00000000010a00ae in evsel__process_contention_begin (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1254 #5 0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, event=0x3ff85601ef8, sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464 ..... The issue is in function machine__is_lock_function() in file ./util/machine.c lines 3355: /* should not fail from here */ sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end", &kmap); machine->sched.text_end = kmap->unmap_ip(kmap, sym->start) On s390 the symbol __sched_text_end is *NOT* in the symbol list and the resulting pointer sym is set to NULL. The sym->start is then a NULL pointer access and generates the core dump. The reason why __sched_text_end is not in the symbol list on s390 is simple: When the symbol list is created at perf start up with function calls dso__load +--> dso__load_vmlinux_path +--> dso__load_vmlinux +--> dso__load_sym +--> dso__load_sym_internal (reads kernel symbols) +--> symbols__fixup_end +--> symbols__fixup_duplicate The issue is in function symbols__fixup_duplicate(). It deletes all symbols with have the same address. On s390: # nm -g ~/linux/vmlinux| fgrep c68390 0000000000c68390 T __cpuidle_text_start 0000000000c68390 T __sched_text_end # two symbols have identical addresses and __sched_text_end is considered duplicate (in ascending sort order) and removed from the symbol list. Therefore it is missing and an invalid pointer reference occurs. The code checks for symbol __sched_text_start and when it exists assumes symbol __sched_text_end is also in the symbol table. However this is not the case on s390. Same situation exists for symbol __lock_text_start: 0000000000c68770 T __cpuidle_text_end 0000000000c68770 T __lock_text_start This symbol is also removed from the symbol table but used in function machine__is_lock_function(). To fix this and keep duplicate symbols in the symbol table, set symbol_conf.allow_aliases to true. This prevents the removal of duplicate symbols in function symbols__fixup_duplicate(). Output After: # ./perf lock contention contended total wait max wait avg wait type caller 48 124.39 ms 123.99 ms 2.59 ms rwsem:W unlink_anon_vmas+0x24a 47 83.68 ms 83.26 ms 1.78 ms rwsem:W free_pgtables+0x132 5 41.22 us 10.55 us 8.24 us rwsem:W free_pgtables+0x140 4 40.12 us 20.55 us 10.03 us rwsem:W copy_process+0x1ac8 # Fixes: 0d2997f750d1de39 ("perf lock: Look up callchain for the contended locks") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20221230102627.2410847-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-03perf build: Don't propagate subdir to submakes for install_headersIan Rogers1-5/+5
subdir is added to the OUTPUT which fails as part of building install_headers when passed from "make -C tools perf_install". Committer testing: The original reporter (see the Link: below) had trouble with this: $ make -C tools perf_install That ended up with errors like this: /var/home/acme/git/perf-urgent/tools/scripts/Makefile.include:17: *** output directory "/var/home/acme/git/perf-urgent/tools/perf/libperf/perf/" does not exist. Stop. With this patch applied we now get it installed at: INSTALL /var/home/acme/git/perf-urgent/tools/perf/libperf/include/perf/bpf_perf.h As expected: $ ls -la /var/home/acme/git/perf-urgent/tools/perf/libperf/include/perf/bpf_perf.h -rw-r--r--. 1 acme acme 1146 Jan 3 15:42 /var/home/acme/git/perf-urgent/tools/perf/libperf/include/perf/bpf_perf.h And if we clean tools with: $ make -C tools clean it gets cleaned up: $ ls -la /var/home/acme/git/perf-urgent/tools/perf/libperf/include/perf/bpf_perf.h ls: cannot access '/var/home/acme/git/perf-urgent/tools/perf/libperf/include/perf/bpf_perf.h': No such file or directory $ Fixes: 746bd29e348f99b4 ("perf build: Use tools/lib headers from install path") Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/fa4b3115-d555-3d7f-54d1-018002e99350@secunet.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-03xfs: xfs_qm: remove unnecessary ‘0’ values from errorLi zeming1-1/+1
error is assigned first, so it does not need to initialize the assignment. Signed-off-by: Li zeming <zeming@nfschina.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2023-01-03xfs: Fix deadlock on xfs_inodegc_workerWu Guanghao1-0/+10
We are doing a test about deleting a large number of files when memory is low. A deadlock problem was found. [ 1240.279183] -> #1 (fs_reclaim){+.+.}-{0:0}: [ 1240.280450] lock_acquire+0x197/0x460 [ 1240.281548] fs_reclaim_acquire.part.0+0x20/0x30 [ 1240.282625] kmem_cache_alloc+0x2b/0x940 [ 1240.283816] xfs_trans_alloc+0x8a/0x8b0 [ 1240.284757] xfs_inactive_ifree+0xe4/0x4e0 [ 1240.285935] xfs_inactive+0x4e9/0x8a0 [ 1240.286836] xfs_inodegc_worker+0x160/0x5e0 [ 1240.287969] process_one_work+0xa19/0x16b0 [ 1240.289030] worker_thread+0x9e/0x1050 [ 1240.290131] kthread+0x34f/0x460 [ 1240.290999] ret_from_fork+0x22/0x30 [ 1240.291905] [ 1240.291905] -> #0 ((work_completion)(&gc->work)){+.+.}-{0:0}: [ 1240.293569] check_prev_add+0x160/0x2490 [ 1240.294473] __lock_acquire+0x2c4d/0x5160 [ 1240.295544] lock_acquire+0x197/0x460 [ 1240.296403] __flush_work+0x6bc/0xa20 [ 1240.297522] xfs_inode_mark_reclaimable+0x6f0/0xdc0 [ 1240.298649] destroy_inode+0xc6/0x1b0 [ 1240.299677] dispose_list+0xe1/0x1d0 [ 1240.300567] prune_icache_sb+0xec/0x150 [ 1240.301794] super_cache_scan+0x2c9/0x480 [ 1240.302776] do_shrink_slab+0x3f0/0xaa0 [ 1240.303671] shrink_slab+0x170/0x660 [ 1240.304601] shrink_node+0x7f7/0x1df0 [ 1240.305515] balance_pgdat+0x766/0xf50 [ 1240.306657] kswapd+0x5bd/0xd20 [ 1240.307551] kthread+0x34f/0x460 [ 1240.308346] ret_from_fork+0x22/0x30 [ 1240.309247] [ 1240.309247] other info that might help us debug this: [ 1240.309247] [ 1240.310944] Possible unsafe locking scenario: [ 1240.310944] [ 1240.312379] CPU0 CPU1 [ 1240.313363] ---- ---- [ 1240.314433] lock(fs_reclaim); [ 1240.315107] lock((work_completion)(&gc->work)); [ 1240.316828] lock(fs_reclaim); [ 1240.318088] lock((work_completion)(&gc->work)); [ 1240.319203] [ 1240.319203] *** DEADLOCK *** ... [ 2438.431081] Workqueue: xfs-inodegc/sda xfs_inodegc_worker [ 2438.432089] Call Trace: [ 2438.432562] __schedule+0xa94/0x1d20 [ 2438.435787] schedule+0xbf/0x270 [ 2438.436397] schedule_timeout+0x6f8/0x8b0 [ 2438.445126] wait_for_completion+0x163/0x260 [ 2438.448610] __flush_work+0x4c4/0xa40 [ 2438.455011] xfs_inode_mark_reclaimable+0x6ef/0xda0 [ 2438.456695] destroy_inode+0xc6/0x1b0 [ 2438.457375] dispose_list+0xe1/0x1d0 [ 2438.458834] prune_icache_sb+0xe8/0x150 [ 2438.461181] super_cache_scan+0x2b3/0x470 [ 2438.461950] do_shrink_slab+0x3cf/0xa50 [ 2438.462687] shrink_slab+0x17d/0x660 [ 2438.466392] shrink_node+0x87e/0x1d40 [ 2438.467894] do_try_to_free_pages+0x364/0x1300 [ 2438.471188] try_to_free_pages+0x26c/0x5b0 [ 2438.473567] __alloc_pages_slowpath.constprop.136+0x7aa/0x2100 [ 2438.482577] __alloc_pages+0x5db/0x710 [ 2438.485231] alloc_pages+0x100/0x200 [ 2438.485923] allocate_slab+0x2c0/0x380 [ 2438.486623] ___slab_alloc+0x41f/0x690 [ 2438.490254] __slab_alloc+0x54/0x70 [ 2438.491692] kmem_cache_alloc+0x23e/0x270 [ 2438.492437] xfs_trans_alloc+0x88/0x880 [ 2438.493168] xfs_inactive_ifree+0xe2/0x4e0 [ 2438.496419] xfs_inactive+0x4eb/0x8b0 [ 2438.497123] xfs_inodegc_worker+0x16b/0x5e0 [ 2438.497918] process_one_work+0xbf7/0x1a20 [ 2438.500316] worker_thread+0x8c/0x1060 [ 2438.504938] ret_from_fork+0x22/0x30 When the memory is insufficient, xfs_inonodegc_worker will trigger memory reclamation when memory is allocated, then flush_work() may be called to wait for the work to complete. This causes a deadlock. So use memalloc_nofs_save() to avoid triggering memory reclamation in xfs_inodegc_worker. Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2023-01-03xfs: get root inode correctly at bulkstatHironori Shiina1-2/+2
The root inode number should be set to `breq->startino` for getting stat information of the root when XFS_BULK_IREQ_SPECIAL_ROOT is used. Otherwise, the inode search is started from 1 (XFS_BULK_IREQ_SPECIAL_ROOT) and the inode with the lowest number in a filesystem is returned. Fixes: bf3cb3944792 ("xfs: allow single bulkstat of special inodes") Signed-off-by: Hironori Shiina <shiina.hironori@fujitsu.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2023-01-03xfs: fix off-by-one error in xfs_btree_space_to_heightDarrick J. Wong1-1/+6
Lately I've been stress-testing extreme-sized rmap btrees by using the (new) xfs_db bmap_inflate command to clone bmbt mappings billions of times and then using xfs_repair to build new rmap and refcount btrees. This of course is /much/ faster than actually FICLONEing a file billions of times. Unfortunately, xfs_repair fails in xfs_btree_bload_compute_geometry with EOVERFLOW, which indicates that xfs_mount.m_rmap_maxlevels is not sufficiently large for the test scenario. For a 1TB filesystem (~67 million AG blocks, 4 AGs) the btheight command reports: $ xfs_db -c 'btheight -n 4400801200 -w min rmapbt' /dev/sda rmapbt: worst case per 4096-byte block: 84 records (leaf) / 45 keyptrs (node) level 0: 4400801200 records, 52390491 blocks level 1: 52390491 records, 1164234 blocks level 2: 1164234 records, 25872 blocks level 3: 25872 records, 575 blocks level 4: 575 records, 13 blocks level 5: 13 records, 1 block 6 levels, 53581186 blocks total The AG is sufficiently large to build this rmap btree. Unfortunately, m_rmap_maxlevels is 5. Augmenting the loop in the space->height function to report height, node blocks, and blocks remaining produces this: ht 1 node_blocks 45 blockleft 67108863 ht 2 node_blocks 2025 blockleft 67108818 ht 3 node_blocks 91125 blockleft 67106793 ht 4 node_blocks 4100625 blockleft 67015668 final height: 5 The goal of this function is to compute the maximum height btree that can be stored in the given number of ondisk fsblocks. Starting with the top level of the tree, each iteration through the loop adds the fanout factor of the next level down until we run out of blocks. IOWs, maximum height is achieved by using the smallest fanout factor that can apply to that level. However, the loop setup is not correct. Top level btree blocks are allowed to contain fewer than minrecs items, so the computation is incorrect because the first time through the loop it should be using a fanout factor of 2. With this corrected, the above becomes: ht 1 node_blocks 2 blockleft 67108863 ht 2 node_blocks 90 blockleft 67108861 ht 3 node_blocks 4050 blockleft 67108771 ht 4 node_blocks 182250 blockleft 67104721 ht 5 node_blocks 8201250 blockleft 66922471 final height: 6 Fixes: 9ec691205e7d ("xfs: compute the maximum height of the rmap btree when reflink enabled") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-01-03perf/x86/rapl: Treat Tigerlake like IcelakeChris Wilson1-0/+2
Since Tigerlake seems to have inherited its cstates and other RAPL power caps from Icelake, assume it also follows Icelake for its RAPL events. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20221228113454.1199118-1-rodrigo.vivi@intel.com
2023-01-03x86/insn: Avoid namespace clash by separating instruction decoder MMIO type ↵Jason A. Donenfeld4-41/+41
from MMIO trace type Both <linux/mmiotrace.h> and <asm/insn-eval.h> define various MMIO_ enum constants, whose namespace overlaps. Rename the <asm/insn-eval.h> ones to have a INSN_ prefix, so that the headers can be used from the same source file. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230101162910.710293-2-Jason@zx2c4.com
2023-01-03f2fs: let's avoid panic if extent_tree is not createdJaegeuk Kim1-1/+2
This patch avoids the below panic. pc : __lookup_extent_tree+0xd8/0x760 lr : f2fs_do_write_data_page+0x104/0x87c sp : ffffffc010cbb3c0 x29: ffffffc010cbb3e0 x28: 0000000000000000 x27: ffffff8803e7f020 x26: ffffff8803e7ed40 x25: ffffff8803e7f020 x24: ffffffc010cbb460 x23: ffffffc010cbb480 x22: 0000000000000000 x21: 0000000000000000 x20: ffffffff22e90900 x19: 0000000000000000 x18: ffffffc010c5d080 x17: 0000000000000000 x16: 0000000000000020 x15: ffffffdb1acdbb88 x14: ffffff888759e2b0 x13: 0000000000000000 x12: ffffff802da49000 x11: 000000000a001200 x10: ffffff8803e7ed40 x9 : ffffff8023195800 x8 : ffffff802da49078 x7 : 0000000000000001 x6 : 0000000000000000 x5 : 0000000000000006 x4 : ffffffc010cbba28 x3 : 0000000000000000 x2 : ffffffc010cbb480 x1 : 0000000000000000 x0 : ffffff8803e7ed40 Call trace: __lookup_extent_tree+0xd8/0x760 f2fs_do_write_data_page+0x104/0x87c f2fs_write_single_data_page+0x420/0xb60 f2fs_write_cache_pages+0x418/0xb1c __f2fs_write_data_pages+0x428/0x58c f2fs_write_data_pages+0x30/0x40 do_writepages+0x88/0x190 __writeback_single_inode+0x48/0x448 writeback_sb_inodes+0x468/0x9e8 __writeback_inodes_wb+0xb8/0x2a4 wb_writeback+0x33c/0x740 wb_do_writeback+0x2b4/0x400 wb_workfn+0xe4/0x34c process_one_work+0x24c/0x5bc worker_thread+0x3e8/0xa50 kthread+0x150/0x1b4 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-03f2fs: should use a temp extent_info for lookupJaegeuk Kim1-6/+7
Otherwise, __lookup_extent_tree() will override the given extent_info which will be used by caller. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-03f2fs: don't mix to use union values in extent_infoJaegeuk Kim1-8/+8
Let's explicitly use the defined values in block_age case only. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-03f2fs: initialize extent_cache parameterJaegeuk Kim4-4/+4
This can avoid confusing tracepoint values. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-03f2fs: fix to avoid NULL pointer dereference in f2fs_issue_flush()Chao Yu1-7/+4
With below two cases, it will cause NULL pointer dereference when accessing SM_I(sbi)->fcc_info in f2fs_issue_flush(). a) If kthread_run() fails in f2fs_create_flush_cmd_control(), it will release SM_I(sbi)->fcc_info, - mount -o noflush_merge /dev/vda /mnt/f2fs - mount -o remount,flush_merge /dev/vda /mnt/f2fs -- kthread_run() fails - dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=1 conv=fsync b) we will never allocate memory for SM_I(sbi)->fcc_info w/ below testcase, - mount -o ro /dev/vda /mnt/f2fs - mount -o rw,remount /dev/vda /mnt/f2fs - dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=1 conv=fsync In order to fix this issue, let change as below: - fix error path handling in f2fs_create_flush_cmd_control(). - allocate SM_I(sbi)->fcc_info even if readonly is on. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-03x86/asm: Fix an assembler warning with current binutilsMikulas Patocka1-1/+1
Fix a warning: "found `movsd'; assuming `movsl' was meant" Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org
2023-01-03firmware: arm_scmi: Fix virtio channels cleanup on shutdownCristian Marussi1-1/+6
When unloading the SCMI core stack module, configured to use the virtio SCMI transport, LOCKDEP reports the splat down below about unsafe locks dependencies. In order to avoid this possible unsafe locking scenario call upfront virtio_break_device() before getting hold of vioch->lock. ===================================================== WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected 6.1.0-00067-g6b934395ba07-dirty #4 Not tainted ----------------------------------------------------- rmmod/307 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: ffff000080c510e0 (&dev->vqs_list_lock){+.+.}-{3:3}, at: virtio_break_device+0x28/0x68 and this task is already holding: ffff00008288ada0 (&channels[i].lock){-.-.}-{3:3}, at: virtio_chan_free+0x60/0x168 [scmi_module] which would create a new lock dependency: (&channels[i].lock){-.-.}-{3:3} -> (&dev->vqs_list_lock){+.+.}-{3:3} but this new dependency connects a HARDIRQ-irq-safe lock: (&channels[i].lock){-.-.}-{3:3} ... which became HARDIRQ-irq-safe at: lock_acquire+0x128/0x398 _raw_spin_lock_irqsave+0x78/0x140 scmi_vio_complete_cb+0xb4/0x3b8 [scmi_module] vring_interrupt+0x84/0x120 vm_interrupt+0x94/0xe8 __handle_irq_event_percpu+0xb4/0x3d8 handle_irq_event_percpu+0x20/0x68 handle_irq_event+0x50/0xb0 handle_fasteoi_irq+0xac/0x138 generic_handle_domain_irq+0x34/0x50 gic_handle_irq+0xa0/0xd8 call_on_irq_stack+0x2c/0x54 do_interrupt_handler+0x8c/0x90 el1_interrupt+0x40/0x78 el1h_64_irq_handler+0x18/0x28 el1h_64_irq+0x64/0x68 _raw_write_unlock_irq+0x48/0x80 ep_start_scan+0xf0/0x128 do_epoll_wait+0x390/0x858 do_compat_epoll_pwait.part.34+0x1c/0xb8 __arm64_sys_epoll_pwait+0x80/0xd0 invoke_syscall+0x4c/0x110 el0_svc_common.constprop.3+0x98/0x120 do_el0_svc+0x34/0xd0 el0_svc+0x40/0x98 el0t_64_sync_handler+0x98/0xc0 el0t_64_sync+0x170/0x174 to a HARDIRQ-irq-unsafe lock: (&dev->vqs_list_lock){+.+.}-{3:3} ... which became HARDIRQ-irq-unsafe at: ... lock_acquire+0x128/0x398 _raw_spin_lock+0x58/0x70 __vring_new_virtqueue+0x130/0x1c0 vring_create_virtqueue+0xc4/0x2b8 vm_find_vqs+0x20c/0x430 init_vq+0x308/0x390 virtblk_probe+0x114/0x9b0 virtio_dev_probe+0x1a4/0x248 really_probe+0xc8/0x3a8 __driver_probe_device+0x84/0x190 driver_probe_device+0x44/0x110 __driver_attach+0x104/0x1e8 bus_for_each_dev+0x7c/0xd0 driver_attach+0x2c/0x38 bus_add_driver+0x1e4/0x258 driver_register+0x6c/0x128 register_virtio_driver+0x2c/0x48 virtio_blk_init+0x70/0xac do_one_initcall+0x84/0x420 kernel_init_freeable+0x2d0/0x340 kernel_init+0x2c/0x138 ret_from_fork+0x10/0x20 other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&dev->vqs_list_lock); local_irq_disable(); lock(&channels[i].lock); lock(&dev->vqs_list_lock); <Interrupt> lock(&channels[i].lock); *** DEADLOCK *** ================ Fixes: 42e90eb53bf3f ("firmware: arm_scmi: Add a virtio channel refcount") Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20221222183823.518856-6-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-01-03firmware: arm_scmi: Harden shared memory access in fetch_notificationCristian Marussi1-1/+3
A misbheaving SCMI platform firmware could reply with out-of-spec notifications, shorter than the mimimum size comprising a header. Fixes: d5141f37c42e ("firmware: arm_scmi: Add notifications support in transport layer") Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20221222183823.518856-4-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-01-03firmware: arm_scmi: Harden shared memory access in fetch_responseCristian Marussi1-2/+3
A misbheaving SCMI platform firmware could reply with out-of-spec messages, shorter than the mimimum size comprising a header and a status field. Harden shmem_fetch_response to properly truncate such a bad messages. Fixes: 5c8a47a5a91d ("firmware: arm_scmi: Make scmi core independent of the transport type") Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20221222183823.518856-3-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-01-03firmware: arm_scmi: Clear stale xfer->hdr.statusCristian Marussi1-0/+2
Stale error status reported from a previous message transaction must be cleared before starting a new transaction to avoid being confusingly reported in the following SCMI message dump traces. Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20221222183823.518856-2-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-01-03EDAC/highbank: Fix memory leak in highbank_mc_probe()Miaoqian Lin1-2/+5
When devres_open_group() fails, it returns -ENOMEM without freeing memory allocated by edac_mc_alloc(). Call edac_mc_free() on the error handling path to avoid a memory leak. [ bp: Massage commit message. ] Fixes: a1b01edb2745 ("edac: add support for Calxeda highbank memory controller") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Link: https://lore.kernel.org/r/20221229054825.1361993-1-linmq006@gmail.com
2023-01-03regulator: qcom-rpmh: PM8550 ldo11 regulator is an nldoNeil Armstrong1-1/+1
This fixes the definition of the PM8550 ldo11 regulator matching it's capabilities since this LDO is designed to work between 1,2V and 1,5V. Fixes: e6e3776d682d ("regulator: qcom-rpmh: Add support for PM8550 regulators") Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20230102-topic-sm8550-upstream-fixes-reg-l11b-nldo-v1-1-d97def246338@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2023-01-03btrfs: fix compat_ro checks against remountQu Wenruo3-5/+7
[BUG] Even with commit 81d5d61454c3 ("btrfs: enhance unsupported compat RO flags handling"), btrfs can still mount a fs with unsupported compat_ro flags read-only, then remount it RW: # btrfs ins dump-super /dev/loop0 | grep compat_ro_flags -A 3 compat_ro_flags 0x403 ( FREE_SPACE_TREE | FREE_SPACE_TREE_VALID | unknown flag: 0x400 ) # mount /dev/loop0 /mnt/btrfs mount: /mnt/btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. dmesg(1) may have more information after failed mount system call. ^^^ RW mount failed as expected ^^^ # dmesg -t | tail -n5 loop0: detected capacity change from 0 to 1048576 BTRFS: device fsid cb5b82f5-0fdd-4d81-9b4b-78533c324afa devid 1 transid 7 /dev/loop0 scanned by mount (1146) BTRFS info (device loop0): using crc32c (crc32c-intel) checksum algorithm BTRFS info (device loop0): using free space tree BTRFS error (device loop0): cannot mount read-write because of unknown compat_ro features (0x403) BTRFS error (device loop0): open_ctree failed # mount /dev/loop0 -o ro /mnt/btrfs # mount -o remount,rw /mnt/btrfs ^^^ RW remount succeeded unexpectedly ^^^ [CAUSE] Currently we use btrfs_check_features() to check compat_ro flags against our current mount flags. That function get reused between open_ctree() and btrfs_remount(). But for btrfs_remount(), the super block we passed in still has the old mount flags, thus btrfs_check_features() still believes we're mounting read-only. [FIX] Replace the existing @sb argument with @is_rw_mount. As originally we only use @sb to determine if the mount is RW. Now it's callers' responsibility to determine if the mount is RW, and since there are only two callers, the check is pretty simple: - caller in open_ctree() Just pass !sb_rdonly(). - caller in btrfs_remount() Pass !(*flags & SB_RDONLY), as our check should be against the new flags. Now we can correctly reject the RW remount: # mount /dev/loop0 -o ro /mnt/btrfs # mount -o remount,rw /mnt/btrfs mount: /mnt/btrfs: mount point not mounted or bad option. dmesg(1) may have more information after failed mount system call. # dmesg -t | tail -n 1 BTRFS error (device loop0: state M): cannot mount read-write because of unknown compat_ro features (0x403) Reported-by: Chung-Chiang Cheng <shepjeng@gmail.com> Fixes: 81d5d61454c3 ("btrfs: enhance unsupported compat RO flags handling") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-03btrfs: always report error in run_one_delayed_ref()Qu Wenruo1-2/+5
Currently we have a btrfs_debug() for run_one_delayed_ref() failure, but if end users hit such problem, there will be no chance that btrfs_debug() is enabled. This can lead to very little useful info for debugging. This patch will: - Add extra info for error reporting Including: * logical bytenr * num_bytes * type * action * ref_mod - Replace the btrfs_debug() with btrfs_err() - Move the error reporting into run_one_delayed_ref() This is to avoid use-after-free, the @node can be freed in the caller. This error should only be triggered at most once. As if run_one_delayed_ref() failed, we trigger the error message, then causing the call chain to error out: btrfs_run_delayed_refs() `- btrfs_run_delayed_refs() `- btrfs_run_delayed_refs_for_head() `- run_one_delayed_ref() And we will abort the current transaction in btrfs_run_delayed_refs(). If we have to run delayed refs for the abort transaction, run_one_delayed_ref() will just cleanup the refs and do nothing, thus no new error messages would be output. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-03ALSA: hda - Enable headset mic on another Dell laptop with ALC3254Chris Chiu1-0/+1
There is another Dell Latitude laptop (1028:0c03) with Realtek codec ALC3254 which needs the ALC269_FIXUP_DELL4_MIC_NO_PRESENCE instead of the default matched ALC269_FIXUP_DELL1_MIC_NO_PRESENCE. Apply correct fixup for this particular model to enable headset mic. Signed-off-by: Chris Chiu <chris.chiu@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230103095332.730677-1-chris.chiu@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-01-03ALSA: hda/realtek - Turn on power earlyYuchi Yang1-14/+16
Turn on power early to avoid wrong state for power relation register. This can earlier update JD state when resume back. Signed-off-by: Yuchi Yang <yangyuchi66@gmail.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/e35d8f4fa18f4448a2315cc7d4a3715f@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-01-03btrfs: handle case when repair happens with dev-replaceQu Wenruo1-1/+10
[BUG] There is a bug report that a BUG_ON() in btrfs_repair_io_failure() (originally repair_io_failure() in v6.0 kernel) got triggered when replacing a unreliable disk: BTRFS warning (device sda1): csum failed root 257 ino 2397453 off 39624704 csum 0xb0d18c75 expected csum 0x4dae9c5e mirror 3 kernel BUG at fs/btrfs/extent_io.c:2380! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 9 PID: 3614331 Comm: kworker/u257:2 Tainted: G OE 6.0.0-5-amd64 #1 Debian 6.0.10-2 Hardware name: Micro-Star International Co., Ltd. MS-7C60/TRX40 PRO WIFI (MS-7C60), BIOS 2.70 07/01/2021 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs] RIP: 0010:repair_io_failure+0x24a/0x260 [btrfs] Call Trace: <TASK> clean_io_failure+0x14d/0x180 [btrfs] end_bio_extent_readpage+0x412/0x6e0 [btrfs] ? __switch_to+0x106/0x420 process_one_work+0x1c7/0x380 worker_thread+0x4d/0x380 ? rescuer_thread+0x3a0/0x3a0 kthread+0xe9/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 [CAUSE] Before the BUG_ON(), we got some read errors from the replace target first, note the mirror number (3, which is beyond RAID1 duplication, thus it's read from the replace target device). Then at the BUG_ON() location, we are trying to writeback the repaired sectors back the failed device. The check looks like this: ret = btrfs_map_block(fs_info, BTRFS_MAP_WRITE, logical, &map_length, &bioc, mirror_num); if (ret) goto out_counter_dec; BUG_ON(mirror_num != bioc->mirror_num); But inside btrfs_map_block(), we can modify bioc->mirror_num especially for dev-replace: if (dev_replace_is_ongoing && mirror_num == map->num_stripes + 1 && !need_full_stripe(op) && dev_replace->tgtdev != NULL) { ret = get_extra_mirror_from_replace(fs_info, logical, *length, dev_replace->srcdev->devid, &mirror_num, &physical_to_patch_in_first_stripe); patch_the_first_stripe_for_dev_replace = 1; } Thus if we're repairing the replace target device, we're going to trigger that BUG_ON(). But in reality, the read failure from the replace target device may be that, our replace hasn't reached the range we're reading, thus we're reading garbage, but with replace running, the range would be properly filled later. Thus in that case, we don't need to do anything but let the replace routine to handle it. [FIX] Instead of a BUG_ON(), just skip the repair if we're repairing the device replace target device. Reported-by: 小太 <nospam@kota.moe> Link: https://lore.kernel.org/linux-btrfs/CACsxjPYyJGQZ+yvjzxA1Nn2LuqkYqTCcUH43S=+wXhyf8S00Ag@mail.gmail.com/ CC: stable@vger.kernel.org # 6.0+ Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-03btrfs: fix off-by-one in delalloc search during lseekFilipe Manana2-2/+2
During lseek, when searching for delalloc in a range that represents a hole and that range has a length of 1 byte, we end up not doing the actual delalloc search in the inode's io tree, resulting in not correctly reporting the offset with data or a hole. This actually only happens when the start offset is 0 because with any other start offset we round it down by sector size. Reproducer: $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt/sdc $ xfs_io -f -c "pwrite -q 0 1" /mnt/sdc/foo $ xfs_io -c "seek -d 0" /mnt/sdc/foo Whence Result DATA EOF It should have reported an offset of 0 instead of EOF. Fix this by updating btrfs_find_delalloc_in_range() and count_range_bits() to deal with inclusive ranges properly. These functions are already supposed to work with inclusive end offsets, they just got it wrong in a couple places due to off-by-one mistakes. A test case for fstests will be added later. Reported-by: Joan Bruguera Micó <joanbrugueram@gmail.com> Link: https://lore.kernel.org/linux-btrfs/20221223020509.457113-1-joanbrugueram@gmail.com/ Fixes: b6e833567ea1 ("btrfs: make hole and data seeking a lot more efficient") CC: stable@vger.kernel.org # 6.1 Tested-by: Joan Bruguera Micó <joanbrugueram@gmail.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-03btrfs: fix false alert on bad tree level checkQu Wenruo1-5/+25
[BUG] There is a bug report that on a RAID0 NVMe btrfs system, under heavy write load the filesystem can flip RO randomly. With extra debugging, it shows some tree blocks failed to pass their level checks, and if that happens at critical path of a transaction, we abort the transaction: BTRFS error (device nvme0n1p3): level verify failed on logical 5446121209856 mirror 1 wanted 0 found 1 BTRFS error (device nvme0n1p3: state A): Transaction aborted (error -5) BTRFS: error (device nvme0n1p3: state A) in btrfs_finish_ordered_io:3343: errno=-5 IO failure BTRFS info (device nvme0n1p3: state EA): forced readonly [CAUSE] The reporter has already bisected to commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()"). And with extra debugging, it shows we can have btrfs_tree_parent_check filled with all zeros in the following call trace: submit_one_bio+0xd4/0xe0 submit_extent_page+0x142/0x550 read_extent_buffer_pages+0x584/0x9c0 ? __pfx_end_bio_extent_readpage+0x10/0x10 ? folio_unlock+0x1d/0x50 btrfs_read_extent_buffer+0x98/0x150 read_tree_block+0x43/0xa0 read_block_for_search+0x266/0x370 btrfs_search_slot+0x351/0xd30 ? lock_is_held_type+0xe8/0x140 btrfs_lookup_csum+0x63/0x150 btrfs_csum_file_blocks+0x197/0x6c0 ? sched_clock_cpu+0x9f/0xc0 ? lock_release+0x14b/0x440 ? _raw_read_unlock+0x29/0x50 btrfs_finish_ordered_io+0x441/0x860 btrfs_work_helper+0xfe/0x400 ? lock_is_held_type+0xe8/0x140 process_one_work+0x294/0x5b0 worker_thread+0x4f/0x3a0 ? __pfx_worker_thread+0x10/0x10 kthread+0xf5/0x120 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 Currently we only copy the btrfs_tree_parent_check structure into bbio at read_extent_buffer_pages() after we have assembled the bbio. But as shown above, submit_extent_page() itself can already submit the bbio, leaving the bbio->parent_check uninitialized, and cause the false alert. [FIX] Instead of copying @check into bbio after bbio is assembled, we pass @check in btrfs_bio_ctrl::parent_check, and copy the content of parent_check in submit_one_bio() for metadata read. By this we should be able to pass the needed info for metadata endio verification, and fix the false alert. Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CABXGCsNzVxo4iq-tJSGm_kO1UggHXgq6CdcHDL=z5FL4njYXSQ@mail.gmail.com/ Fixes: 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()") Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-03btrfs: add error message for metadata level mismatchQu Wenruo1-0/+3
From a recent regression report, we found that after commit 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()") if we have a level mismatch (false alert though), there is no error message at all. This makes later debugging harder. This patch will add the proper error message for such case. Link: https://lore.kernel.org/linux-btrfs/CABXGCsNzVxo4iq-tJSGm_kO1UggHXgq6CdcHDL=z5FL4njYXSQ@mail.gmail.com/ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-03btrfs: fix ASSERT em->len condition in btrfs_get_extentTanmay Bhushan1-1/+1
The em->len value is supposed to be verified in the assertion condition though we expect it to be same as the sectorsize. Fixes: a196a8944f77 ("btrfs: do not reset extent map members for inline extents read") Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Tanmay Bhushan <007047221b@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-03perf test record_probe_libc_inet_pton: Fix failure due to extra inet_pton() ↵Arnaldo Carvalho de Melo1-1/+1
backtrace in glibc >= 2.35 Starting with glibc 2.35 there are extra inet_pton() calls when doing a IPv6 ping as in one of the 'perf test' entry, which makes it fail: # perf test inet_pton 89: probe libc's inet_pton & backtrace it with ping : FAILED! # If we look at what this script is expecting (commenting out the removal of the temporary files in it): # cat /tmp/expected.aT6 ping[][0-9 \.:]+probe_libc:inet_pton: \([[:xdigit:]]+\) .*inet_pton\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/libc.so.6|inlined\)$ getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/libc.so.6\)$ .*(\+0x[[:xdigit:]]+|\[unknown\])[[:space:]]\(.*/bin/ping.*\)$ # And looking at what we are getting out of 'perf script', to match with the above: # cat /tmp/perf.script.IUC ping 623883 [006] 265438.471610: probe_libc:inet_pton: (7f32bcf314c0) 1314c0 __GI___inet_pton+0x0 (/usr/lib64/libc.so.6) 29510 __libc_start_call_main+0x80 (/usr/lib64/libc.so.6) ping 623883 [006] 265438.471664: probe_libc:inet_pton: (7f32bcf314c0) 1314c0 __GI___inet_pton+0x0 (/usr/lib64/libc.so.6) fa6c6 getaddrinfo+0x126 (/usr/lib64/libc.so.6) 491e [unknown] (/usr/bin/ping) # We see that its just the first call to inet_pton() that didn't came thru getaddrinfo(), so if we ignore the first the script matches what it expects, testing that using 'perf probe' + 'perf record' + 'perf script' with callchains on userspace targets is producing the expected results. Since we don't have a 'perf script --skip' to help us here, use tac + grep to do that, resulting in a one liner that makes this script work on both older glibc versions as well as with 2.35. With it, on fedora 36, x86, glibc 2.35: # perf test inet_pton 90: probe libc's inet_pton & backtrace it with ping : Ok # perf test -v inet_pton 90: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 627197 ping 627220 1 267956.962402: probe_libc:inet_pton_1: (7f488bf314c0) 1314c0 __GI___inet_pton+0x0 (/usr/lib64/libc.so.6) fa6c6 getaddrinfo+0x126 (/usr/lib64/libc.so.6) 491e n (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok # And on Ubuntu 22.04.1 LTS on a Libre Computer ROC-RK3399-PC arm64 system: Before this patch it works (see that the script used has no 'tac' to remove the first event): root@roc-rk3399-pc:~# dpkg -l | grep libc-bin ii libc-bin 2.35-0ubuntu3.1 arm64 GNU C Library: Binaries root@roc-rk3399-pc:~# grep -w tac ~acme/libexec/perf-core/tests/shell/record+probe_libc_inet_pton.sh root@roc-rk3399-pc:~# perf test inet_pton 86: probe libc's inet_pton & backtrace it with ping : Ok root@roc-rk3399-pc:~# perf test -v inet_pton 86: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 1375 ping 1399 [000] 4114.417450: probe_libc:inet_pton: (ffffb3e26120) 106120 inet_pton+0x0 (/usr/lib/aarch64-linux-gnu/libc.so.6) d18bc getaddrinfo+0xec (/usr/lib/aarch64-linux-gnu/libc.so.6) 2b68 [unknown] (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok root@roc-rk3399-pc:~# And after it continues to work: root@roc-rk3399-pc:~# grep -w tac ~acme/libexec/perf-core/tests/shell/record+probe_libc_inet_pton.sh perf script -i $perf_data | tac | grep -m1 ^ping -B9 | tac > $perf_script root@roc-rk3399-pc:~# perf test inet_pton 86: probe libc's inet_pton & backtrace it with ping : Ok root@roc-rk3399-pc:~# perf test -v inet_pton 86: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 6995 ping 7019 [005] 4832.160741: probe_libc:inet_pton: (ffffa62e6120) 106120 inet_pton+0x0 (/usr/lib/aarch64-linux-gnu/libc.so.6) d18bc getaddrinfo+0xec (/usr/lib/aarch64-linux-gnu/libc.so.6) 2b68 [unknown] (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok root@roc-rk3399-pc:~# Reported-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: http://lore.kernel.org/lkml/Y7QyPkPlDYip3cZH@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-03drm/scheduler: Fix lockup in drm_sched_entity_kill()Dmitry Osipenko2-3/+3
The drm_sched_entity_kill() is invoked twice by drm_sched_entity_destroy() while userspace process is exiting or being killed. First time it's invoked when sched entity is flushed and second time when entity is released. This causes a lockup within wait_for_completion(entity_idle) due to how completion API works. Calling wait_for_completion() more times than complete() was invoked is a error condition that causes lockup because completion internally uses counter for complete/wait calls. The complete_all() must be used instead in such cases. This patch fixes lockup of Panfrost driver that is reproducible by killing any application in a middle of 3d drawing operation. Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini") Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck Link: https://patchwork.freedesktop.org/patch/msgid/20221123001303.533968-1-dmitry.osipenko@collabora.com
2023-01-03reset: uniphier-glue: Fix possible null-ptr-derefHui Tang1-3/+1
It will cause null-ptr-deref when resource_size(res) invoked, if platform_get_resource() returns NULL. Fixes: 499fef09a323 ("reset: uniphier: add USB3 core reset control") Signed-off-by: Hui Tang <tanghui20@huawei.com> Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Link: https://lore.kernel.org/r/20221114004958.258513-1-tanghui20@huawei.com Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2023-01-03reset: ti-sci: honor TI_SCI_PROTOCOL setting when not COMPILE_TESTRandy Dunlap1-1/+1
There is a build error when COMPILE_TEST=y, TI_SCI_PROTOCOL=m, and RESET_TI_SCI=y: drivers/reset/reset-ti-sci.o: in function `ti_sci_reset_probe': reset-ti-sci.c:(.text+0x22c): undefined reference to `devm_ti_sci_get_handle' Fix this by making RESET_TI_SCI honor the Kconfig setting of TI_SCI_PROTOCOL when COMPILE_TEST is not set. When COMPILE_TEST is set, TI_SCI_PROTOCOL must be disabled (=n). Fixes: a6af504184c9 ("reset: ti-sci: Allow building under COMPILE_TEST") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Nishanth Menon <nm@ti.com> Cc: Tero Kristo <kristo@kernel.org> Cc: Santosh Shilimkar <ssantosh@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Nishanth Menon <nm@ti.com> Link: https://lore.kernel.org/r/20221030055636.3139-1-rdunlap@infradead.org Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2023-01-03time: Fix various kernel-doc problemsRandy Dunlap3-10/+10
Clean up kernel-doc complaints about function names and non-kernel-doc comments in kernel/time/. Fixes these warnings: kernel/time/time.c:479: warning: expecting prototype for set_normalized_timespec(). Prototype was for set_normalized_timespec64() instead kernel/time/time.c:553: warning: expecting prototype for msecs_to_jiffies(). Prototype was for __msecs_to_jiffies() instead kernel/time/timekeeping.c:1595: warning: contents before sections kernel/time/timekeeping.c:1705: warning: This comment starts with '/**', but isn't a kernel-doc comment. * We have three kinds of time sources to use for sleep time kernel/time/timekeeping.c:1726: warning: This comment starts with '/**', but isn't a kernel-doc comment. * 1) can be determined whether to use or not only when doing kernel/time/tick-oneshot.c:21: warning: missing initial short description on line: * tick_program_event kernel/time/tick-oneshot.c:107: warning: expecting prototype for tick_check_oneshot_mode(). Prototype was for tick_oneshot_mode_active() instead Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230103032849.12723-1-rdunlap@infradead.org
2023-01-03KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*Marc Zyngier6-36/+33
The former is an AArch32 legacy, so let's move over to the verbose (and strictly identical) version. This involves moving some of the #defines that were private to KVM into the more generic esr.h. Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-01-03KVM: arm64: Document the behaviour of S1PTW faults on RO memslotsMarc Zyngier1-0/+8
Although the KVM API says that a write to a RO memslot must result in a KVM_EXIT_MMIO describing the write, the arm64 architecture doesn't provide the *data* written by a Stage-1 page table walk (we only get the address). Since there isn't much userspace can do with so little information anyway, document the fact that such an access results in a guest exception, not an exit. This is consistent with the guest being terminally broken anyway. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-01-03KVM: arm64: Fix S1PTW handling on RO memslotsMarc Zyngier1-2/+20
A recent development on the EFI front has resulted in guests having their page tables baked in the firmware binary, and mapped into the IPA space as part of a read-only memslot. Not only is this legitimate, but it also results in added security, so thumbs up. It is possible to take an S1PTW translation fault if the S1 PTs are unmapped at stage-2. However, KVM unconditionally treats S1PTW as a write to correctly handle hardware AF/DB updates to the S1 PTs. Furthermore, KVM injects an exception into the guest for S1PTW writes. In the aforementioned case this results in the guest taking an abort it won't recover from, as the S1 PTs mapping the vectors suffer from the same problem. So clearly our handling is... wrong. Instead, switch to a two-pronged approach: - On S1PTW translation fault, handle the fault as a read - On S1PTW permission fault, handle the fault as a write This is of no consequence to SW that *writes* to its PTs (the write will trigger a non-S1PTW fault), and SW that uses RO PTs will not use HW-assisted AF/DB anyway, as that'd be wrong. Only in the case described in c4ad98e4b72c ("KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch") do we end-up with two back-to-back faults (page being evicted and faulted back). I don't think this is a case worth optimising for. Fixes: c4ad98e4b72c ("KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch") Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Regression-tested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org
2023-01-03efi: fix userspace infinite retry read efivars after EFI runtime services ↵Ding Hui1-0/+1
page fault After [1][2], if we catch exceptions due to EFI runtime service, we will clear EFI_RUNTIME_SERVICES bit to disable EFI runtime service, then the subsequent routine which invoke the EFI runtime service should fail. But the userspace cat efivars through /sys/firmware/efi/efivars/ will stuck and infinite loop calling read() due to efivarfs_file_read() return -EINTR. The -EINTR is converted from EFI_ABORTED by efi_status_to_err(), and is an improper return value in this situation, so let virt_efi_xxx() return EFI_DEVICE_ERROR and converted to -EIO to invoker. Cc: <stable@vger.kernel.org> Fixes: 3425d934fc03 ("efi/x86: Handle page faults occurring while running EFI runtime services") Fixes: 23715a26c8d8 ("arm64: efi: Recover from synchronous exceptions occurring in firmware") Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-03efi: fix NULL-deref in init error pathJohan Hovold1-3/+6
In cases where runtime services are not supported or have been disabled, the runtime services workqueue will never have been allocated. Do not try to destroy the workqueue unconditionally in the unlikely event that EFI initialisation fails to avoid dereferencing a NULL pointer. Fixes: 98086df8b70c ("efi: add missed destroy_workqueue when efisubsys_init fails") Cc: stable@vger.kernel.org Cc: Li Heng <liheng40@huawei.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-03usb: rndis_host: Secure rndis_query check against int overflowSzymon Heidrich1-1/+2
Variables off and len typed as uint32 in rndis_query function are controlled by incoming RNDIS response message thus their value may be manipulated. Setting off to a unexpectetly large value will cause the sum with len and 8 to overflow and pass the implemented validation step. Consequently the response pointer will be referring to a location past the expected buffer boundaries allowing information leakage e.g. via RNDIS_OID_802_3_PERMANENT_ADDRESS OID. Fixes: ddda08624013 ("USB: rndis_host, various cleanups") Signed-off-by: Szymon Heidrich <szymon.heidrich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-03net: dpaa: Fix dtsec check for PCS availabilitySean Anderson1-1/+1
We want to fail if the PCS is not available, not if it is available. Fix this condition. Fixes: 5d93cfcf7360 ("net: dpaa: Convert to phylink") Reported-by: Christian Zigotzky <info@xenosoft.de> Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-03octeontx2-pf: Fix lmtst ID used in aura freeGeetha sowjanya1-9/+21
Current code uses per_cpu pointer to get the lmtst_id mapped to the core on which aura_free() is executed. Using per_cpu pointer without preemption disable causing mismatch between lmtst_id and core on which pointer gets freed. This patch fixes the issue by disabling preemption around aura_free. Fixes: ef6c8da71eaf ("octeontx2-pf: cn10K: Reserve LMTST lines per core") Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-03drivers/net/bonding/bond_3ad: return when there's no aggregatorDaniil Tatianin1-0/+1
Otherwise we would dereference a NULL aggregator pointer when calling __set_agg_ports_ready on the line below. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfDavid S. Miller15-189/+293
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Use signed integer in ipv6_skip_exthdr() called from nf_confirm(). Reported by static analysis tooling, patch from Florian Westphal. 2) Missing set type checks in nf_tables: Validate that set declaration matches the an existing set type, otherwise bail out with EEXIST. Currently, nf_tables silently accepts the re-declaration with a different type but it bails out later with EINVAL when the user adds entries to the set. This fix is relatively large because it requires two preparation patches that are included in this batch. 3) Do not ignore updates of timeout and gc_interval parameters in existing sets. 4) Fix a hang when 0/0 subnets is added to a hash:net,port,net type of ipset. Except hash:net,port,net and hash:net,iface, the set types don't support 0/0 and the auxiliary functions rely on this fact. So 0/0 needs a special handling in hash:net,port,net which was missing (hash:net,iface was not affected by this bug), from Jozsef Kadlecsik. 5) When adding/deleting large number of elements in one step in ipset, it can take a reasonable amount of time and can result in soft lockup errors. This patch is a complete rework of the previous version in order to use a smaller internal batch limit and at the same time removing the external hard limit to add arbitrary number of elements in one step. Also from Jozsef Kadlecsik. Except for patch #1, which fixes a bug introduced in the previous net-next development cycle, anything else has been broken for several releases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>