summaryrefslogtreecommitdiffstats
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2023-01-25drm/amdgpu/display/mst: Fix mst_state->pbn_div and slot count assignmentsLyude Paul2-5/+24
Looks like I made a pretty big mistake here without noticing: it seems when I moved the assignments of mst_state->pbn_div I completely missed the fact that the reason for us calling drm_dp_mst_update_slots() earlier was to account for the fact that we need to call this function using info from the root MST connector, instead of just trying to do this from each MST encoder's atomic check function. Otherwise, we end up filling out all of DC's link information with zeroes. So, let's restore that and hopefully fix this DSC regression. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2171 Signed-off-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Fixes: 4d07b0bc4034 ("drm/display/dp_mst: Move all payload info into the atomic state") Cc: stable@vger.kernel.org # 6.1 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Tested-by: Didier Raboud <odyx@debian.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-25drm/amdgpu: declare firmware for new MES 11.0.4Li Ma1-0/+2
To support new mes ip block Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-25drm/amdgpu: enable imu firmware for GC 11.0.4Li Ma1-0/+1
The GC 11.0.4 needs load IMU to power up GFX before loads GFX firmware. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-25drm/amd/pm: add missing AllowIHInterrupt message mapping for SMU13.0.0Evan Quan1-0/+1
Add SMU13.0.0 AllowIHInterrupt message mapping. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-01-25drm/amdgpu: remove unconditional trap enable on add gfx11 queuesJonathan Kim1-1/+0
Rebase of driver has incorrect unconditional trap enablement for GFX11 when adding mes queues. Reported-by: Graham Sider <graham.sider@amd.com> Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Graham Sider <graham.sider@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-01-20Merge tag 'amd-drm-fixes-6.2-2023-01-19' of ↵Dave Airlie8-18/+33
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.2-2023-01-19: amdgpu: - Fix display scaling - Fix RN/CZN power reporting on some firmware versions - Colorspace fixes - Fix resource freeing in error case in CS IOCTL - Fix warning on driver unload - GC11 fixes - DCN 3.1.4/5 S/G display workarounds Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230119195908.7670-1-alexander.deucher@amd.com
2023-01-20Merge tag 'drm-misc-fixes-2023-01-19' of ↵Dave Airlie5-10/+17
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A fix for vc4 to address a memory leak when allocating a buffer, a Kconfig fix for panfrost and two fixes for i915 and fb-helper to address some bugs with vga-switcheroo. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20230119082059.h32bs7zqoxmjbcvn@houat
2023-01-20Merge tag 'drm-intel-fixes-2023-01-19' of ↵Dave Airlie5-10/+23
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Reject display plane with height == 0 (Drew) - re-disable RC6p on Sandy Bridge (Sasa) - Fix hugepages' selftest (Chris) - DG2 hw workarounds (Matt Atwood) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y8mf3/ANNWctpc7R@intel.com
2023-01-20Merge tag 'drm-msm-fixes-2023-01-16' of ↵Dave Airlie4-4/+19
https://gitlab.freedesktop.org/drm/msm into drm-fixes msm-fixes for v6.3-rc5 Two GPU fixes which were meant to be part of the previous pull request, but I'd forgotten to fetch from gitlab after the MR was merged so that git tag was applied to the wrong commit. - kexec shutdown fix - fix potential double free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGskguoVsz2wqAK2k+f32LwcVY5JC6+e2RwLqZswz3RY2Q@mail.gmail.com
2023-01-18drm/amd/display: disable S/G display on DCN 3.1.4Alex Deucher1-1/+0
Causes flickering or white screens in some configurations. Disable it for now until we can fix the issue. Cc: roman.li@amd.com Cc: yifan1.zhang@amd.com Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-01-18drm/amd/display: disable S/G display on DCN 3.1.5Alex Deucher1-1/+0
Causes flickering or white screens in some configurations. Disable it for now until we can fix the issue. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2354 Cc: roman.li@amd.com Cc: yifan1.zhang@amd.com Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-01-18drm/amdgpu: allow multipipe policy on ASICs with one MECLang Yu1-0/+3
Always enable multipipe policy on ASICs with GC VERSION > 9.0.0 instead of MEC number > 1. This will allow multipipe policy on ASICs with one MEC, e.g., gfx11 APUs. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-01-18drm/amdgpu: correct MEC number for gfx11 APUsLang Yu1-2/+9
There is only one MEC on these APUs. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-01-18drm/amd/display: fix issues with driver unloadHamza Mahfooz2-5/+0
Currently, we run into a number of WARN()s when attempting to unload the amdgpu driver (e.g. using "modprobe -r amdgpu"). These all stem from calling drm_encoder_cleanup() too early. So, to fix this we can stop calling drm_encoder_cleanup() from amdgpu_dm_fini() and instead have it be called from amdgpu_dm_encoder_destroy(). Also, we don't need to free in amdgpu_dm_encoder_destroy() since mst_encoders[] isn't explicitly allocated by the slab allocator. Fixes: f74367e492ba ("drm/amdgpu/display: create fake mst encoders ahead of time (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-18drm/amdgpu: fix amdgpu_job_free_resources v2Christian König1-2/+8
It can be that neither fence were initialized when we run out of UVD streams for example. v2: fix typo breaking compile Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2324 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-01-18drm/amd/display: Fix COLOR_SPACE_YCBCR2020_TYPE matrixJoshua Ashton1-2/+2
The YCC conversion matrix for RGB -> COLOR_SPACE_YCBCR2020_TYPE is missing the values for the fourth column of the matrix. The fourth column of the matrix is essentially just a value that is added given that the color is 3 components in size. These values are needed to bias the chroma from the [-1, 1] -> [0, 1] range. This fixes color being very green when using Gamescope HDR on HDMI output which prefers YCC 4:4:4. Fixes: 40df2f809e8f ("drm/amd/display: color space ycbcr709 support") Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Joshua Ashton <joshua@froggi.es> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-01-18drm/amd/display: Calculate output_color_space after pixel encoding adjustmentJoshua Ashton1-2/+2
Code in get_output_color_space depends on knowing the pixel encoding to determine whether to pick between eg. COLOR_SPACE_SRGB or COLOR_SPACE_YCBCR709 for transparent RGB -> YCbCr 4:4:4 in the driver. v2: Fixed patch being accidentally based on a personal feature branch, oops! Fixes: ea117312ea9f ("drm/amd/display: Reduce HDMI pixel encoding if max clock is exceeded") Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Joshua Ashton <joshua@froggi.es> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-01-18drm/amdgpu: fix cleaning up reserved VMID on releaseChristian König1-0/+1
We need to reset this or otherwise run into list corruption later on. Fixes: e44a0fe630c5 ("drm/amdgpu: rework reserved VMID handling") Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-18drm/amdgpu: Correct the power calcultion for Renior/Cezanne.jie1zhan1-1/+6
From smu firmware,the value of power is transferred in units of watts. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2321 Fixes: 137aac26a2ed ("drm/amdgpu/smu12: fix power reporting on renoir") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-01-18drm/amd/display: Fix set scaling doesn's workhongao1-2/+2
[Why] Setting scaling does not correctly update CRTC state. As a result dc stream state's src (composition area) && dest (addressable area) was not calculated as expected. This causes set scaling doesn's work. [How] Correctly update CRTC state when setting scaling property. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Tested-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: hongao <hongao@uniontech.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-01-18drm/i915: Remove unused variableNirmoy Das1-2/+0
Removed unused i915 var. Fixes: a273e95721e9 ("drm/i915: Allow switching away via vga-switcheroo if uninitialized") Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230118170624.9326-1-nirmoy.das@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-01-18drm/i915/dg2: Introduce Wa_18019271663Matt Atwood2-3/+7
Wa_18019271663 applies to all DG2 steppings and skus. Bspec: 66622 Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221123183648.407058-2-matthew.s.atwood@intel.com (cherry picked from commit 900a80c5836587d95db32742f66e1f34f7b40fcb) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-01-18drm/i915/dg2: Introduce Wa_18018764978Matt Atwood2-1/+9
Wa_18018764978 applies to specific steppings of DG2 (G10 C0+, G11 and G12 A0+). Clean up style in function at the same time. Bspec: 66622 Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221123183648.407058-1-matthew.s.atwood@intel.com (cherry picked from commit 468a4e630c7da8cf586f85cc498d6097aed1ab4b) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-01-18drm/fb-helper: Set framebuffer for vga-switcheroo clientsThomas Zimmermann1-0/+7
Set the framebuffer info for drivers that support VGA switcheroo. Only affects the amdgpu and nouveau drivers, which use VGA switcheroo and generic fbdev emulation. For other drivers, this does nothing. This fixes a potential regression in the console code. Both, amdgpu and nouveau, invoked vga_switcheroo_client_fb_set() from their internal fbdev code. But the call got lost when the drivers switched to the generic emulation. Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Fixes: 4a16dd9d18a0 ("drm/nouveau/kms: switch to drm fbdev helpers") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Evan Quan <evan.quan@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Likun Gao <Likun.Gao@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: "Tianci.Yin" <tianci.yin@amd.com> Cc: Xiaojian Du <Xiaojian.Du@amd.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: YiPeng Chai <YiPeng.Chai@amd.com> Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Cc: Bokun Zhang <Bokun.Zhang@amd.com> Cc: Guchun Chen <guchun.chen@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Solomon Chiu <solomon.chiu@amd.com> Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "Marek Olšák" <marek.olsak@amd.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com> Cc: dri-devel@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.17+ Link: https://patchwork.freedesktop.org/patch/msgid/20230116115425.13484-3-tzimmermann@suse.de
2023-01-18drm/i915: Allow switching away via vga-switcheroo if uninitializedThomas Zimmermann2-3/+6
Always allow switching away via vga-switcheroo if the display is uninitalized. Instead prevent switching to i915 if the device has not been initialized. This issue was introduced by commit 5df7bd130818 ("drm/i915: skip display initialization when there is no display") protected, which protects code paths from being executed on uninitialized devices. In the case of vga-switcheroo, we want to allow a switch away from i915's device. So run vga_switcheroo_process_delayed_switch() and test in the switcheroo callbacks if the i915 device is available. Fixes: 5df7bd130818 ("drm/i915: skip display initialization when there is no display") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: "Jouni Högander" <jouni.hogander@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: "José Roberto de Souza" <jose.souza@intel.com> Cc: Julia Lawall <Julia.Lawall@inria.fr> Cc: intel-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.14+ Link: https://patchwork.freedesktop.org/patch/msgid/20230116115425.13484-2-tzimmermann@suse.de
2023-01-18drm/i915/selftests: Unwind hugepages to drop wakeref on errorChris Wilson1-4/+4
Make sure that upon error after we have acquired the wakeref we do release it again. v2: add another missing "goto out_wf"(Andi). Fixes: 027c38b4121e ("drm/i915/selftests: Grab the runtime pm in shrink_thp") Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117123234.26487-1-nirmoy.das@intel.com (cherry picked from commit 14ec40a88210151296fff3e981c1a7196ad9bf55) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-01-18drm/i915: re-disable RC6p on Sandy BridgeSasa Dragic1-1/+2
RC6p on Sandy Bridge got re-enabled over time, causing visual glitches and GPU hangs. Disabled originally in commit 1c8ecf80fdee ("drm/i915: do not enable RC6p on Sandy Bridge"). Signed-off-by: Sasa Dragic <sasa.dragic@gmail.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221219172927.9603-2-sasa.dragic@gmail.com Fixes: fb6db0f5bf1d ("drm/i915: Remove unsafe i915.enable_rc6") Fixes: 13c5a577b342 ("drm/i915/gt: Select the deepest available parking mode for rc6") Cc: stable@vger.kernel.org Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 0c8a6e9ea232c221976a0670256bd861408d9917) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-01-18drm/panfrost: fix GENERIC_ATOMIC64 dependencyArnd Bergmann1-1/+2
On ARMv5 and earlier, a randconfig build can still run into WARNING: unmet direct dependencies detected for IOMMU_IO_PGTABLE_LPAE Depends on [n]: IOMMU_SUPPORT [=y] && (ARM [=y] || ARM64 || COMPILE_TEST [=y]) && !GENERIC_ATOMIC64 [=y] Selected by [y]: - DRM_PANFROST [=y] && HAS_IOMEM [=y] && DRM [=y] && (ARM [=y] || ARM64 || COMPILE_TEST [=y] && !GENERIC_ATOMIC64 [=y]) && MMU [=y] Rework the dependencies to always require a working cmpxchg64. Fixes: db594ba3fcf9 ("drm/panfrost: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117164456.1591901-1-arnd@kernel.org
2023-01-17drm/i915/display: Check source height is > 0Drew Davenport1-1/+1
The error message suggests that the height of the src rect must be at least 1. Reject source with height of 0. Cc: stable@vger.kernel.org Signed-off-by: Drew Davenport <ddavenport@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221226225246.1.I15dff7bb5a0e485c862eae61a69096caf12ef29f@changeid (cherry picked from commit 0fe76b198d482b41771a8d17b45fb726d13083cf) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-01-15Merge tag 'edac_urgent_for_v6.2_rc4' of ↵Linus Torvalds3-12/+14
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fixes from Borislav Petkov: - Fix the EDAC device's confusion in the polling setting units - Fix a memory leak in highbank's probing function * tag 'edac_urgent_for_v6.2_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/highbank: Fix memory leak in highbank_mc_probe() EDAC/device: Fix period calculation in edac_device_reset_delay_period()
2023-01-14Merge tag 'iommu-fixes-v6.2-rc3' of ↵Linus Torvalds5-17/+35
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Core: Fix an iommu-group refcount leak - Fix overflow issue in IOVA alloc path - ARM-SMMU fixes from Will: - Fix VFIO regression on NXP SoCs by reporting IOMMU_CAP_CACHE_COHERENCY - Fix SMMU shutdown paths to avoid device unregistration race - Error handling fix for Mediatek IOMMU driver * tag 'iommu-fixes-v6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe() iommu/iova: Fix alloc iova overflows issue iommu: Fix refcount leak in iommu_device_claim_dma_owner iommu/arm-smmu-v3: Don't unregister on shutdown iommu/arm-smmu: Don't unregister on shutdown iommu/arm-smmu: Report IOMMU_CAP_CACHE_COHERENCY even betterer
2023-01-14Merge tag 'hardening-v6.2-rc4' of ↵Linus Torvalds2-2/+8
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kernel hardening fixes from Kees Cook: - Fix CFI hash randomization with KASAN (Sami Tolvanen) - Check size of coreboot table entry and use flex-array * tag 'hardening-v6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: kbuild: Fix CFI hash randomization with KASAN firmware: coreboot: Check size of table entry and use flex-array
2023-01-14Merge tag 'scsi-fixes' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two minor fixes in the hisi_sas driver which only impact enterprise style multi-expander and shared disk situations and no core changes" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id scsi: hisi_sas: Use abort task set to reset SAS disks when discovered
2023-01-14Merge tag 'ata-6.2-rc4' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ATA fix from Damien Le Moal: "A single fix to prevent building the pata_cs5535 driver with user mode linux as it uses msr operations that are not defined with UML" * tag 'ata-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: pata_cs5535: Don't build on UML
2023-01-13Merge tag 'block-6.2-2023-01-13' of git://git.kernel.dk/linuxLinus Torvalds3-51/+73
Pull block fixes from Jens Axboe: "Nothing major in here, just a collection of NVMe fixes and dropping a wrong might_sleep() that static checkers tripped over but which isn't valid" * tag 'block-6.2-2023-01-13' of git://git.kernel.dk/linux: MAINTAINERS: stop nvme matching for nvmem files nvme: don't allow unprivileged passthrough on partitions nvme: replace the "bool vec" arguments with flags in the ioctl path nvme: remove __nvme_ioctl nvme-pci: fix error handling in nvme_pci_enable() nvme-pci: add NVME_QUIRK_IDENTIFY_CNS quirk to Apple T2 controllers nvme-apple: add NVME_QUIRK_IDENTIFY_CNS quirk to fix regression block: Drop spurious might_sleep() from blk_put_queue()
2023-01-13Merge tag 'pci-v6.2-fixes-1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fixes from Bjorn Helgaas: - Work around apparent firmware issue that made Linux reject MMCONFIG space, which broke PCI extended config space (Bjorn Helgaas) - Fix CONFIG_PCIE_BT1 dependency due to mid-air collision between a PCI_MSI_IRQ_DOMAIN -> PCI_MSI change and addition of PCIE_BT1 (Lukas Bulwahn) * tag 'pci-v6.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: x86/pci: Treat EfiMemoryMappedIO as reservation of ECAM space x86/pci: Simplify is_mmconf_reserved() messages PCI: dwc: Adjust to recent removal of PCI_MSI_IRQ_DOMAIN
2023-01-13firmware: coreboot: Check size of table entry and use flex-arrayKees Cook2-2/+8
The memcpy() of the data following a coreboot_table_entry couldn't be evaluated by the compiler under CONFIG_FORTIFY_SOURCE. To make it easier to reason about, add an explicit flexible array member to struct coreboot_device so the entire entry can be copied at once. Additionally, validate the sizes before copying. Avoids this run-time false positive warning: memcpy: detected field-spanning write (size 168) of single field "&device->entry" at drivers/firmware/google/coreboot_table.c:103 (size 8) Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/all/03ae2704-8c30-f9f0-215b-7cdf4ad35a9a@molgen.mpg.de/ Cc: Jack Rosenthal <jrosenth@chromium.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Julius Werner <jwerner@chromium.org> Cc: Brian Norris <briannorris@chromium.org> Cc: Stephen Boyd <swboyd@chromium.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Julius Werner <jwerner@chromium.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Link: https://lore.kernel.org/r/20230107031406.gonna.761-kees@kernel.org Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Jack Rosenthal <jrosenth@chromium.org> Link: https://lore.kernel.org/r/20230112230312.give.446-kees@kernel.org
2023-01-14ata: pata_cs5535: Don't build on UMLPeter Foley1-0/+1
This driver uses MSR functions that aren't implemented under UML. Avoid building it to prevent tripping up allyesconfig. e.g. /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x3a3): undefined reference to `__tracepoint_read_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x3d2): undefined reference to `__tracepoint_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x457): undefined reference to `__tracepoint_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x481): undefined reference to `do_trace_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x4d5): undefined reference to `do_trace_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x4f5): undefined reference to `do_trace_read_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x51c): undefined reference to `do_trace_write_msr' Signed-off-by: Peter Foley <pefoley2@pefoley.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2023-01-13drm/vc4: bo: Fix unused variable warningMaxime Ripard1-1/+0
Commit 07a2975c65f2 ("drm/vc4: bo: Fix drmm_mutex_init memory hog") removed the only use of the ret variable, but didn't remove the variable itself leading to a unused variable warning. Remove that variable. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 07a2975c65f2 ("drm/vc4: bo: Fix drmm_mutex_init memory hog") Reviewed-by: Maíra Canal <mcanal@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20230113154637.1704116-1-maxime@cerno.tech
2023-01-13Merge tag 'efi-fixes-for-v6.2-1' of ↵Linus Torvalds2-3/+7
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - avoid a potential crash on the efi_subsys_init() error path - use more appropriate error code for runtime services calls issued after a crash in the firmware occurred - avoid READ_ONCE() for accessing firmware tables that may appear misaligned in memory * tag 'efi-fixes-for-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: tpm: Avoid READ_ONCE() for accessing the event log efi: rt-wrapper: Add missing include efi: fix userspace infinite retry read efivars after EFI runtime services page fault efi: fix NULL-deref in init error path
2023-01-13Merge tag 'pm-6.2-rc4' of ↵Linus Torvalds6-9/+32
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix assorted issues in the ARM cpufreq drivers and in the AMD P-state driver. Specifics: - Fix cpufreq policy reference counting in amd-pstate to prevent it from crashing on removal (Perry Yuan) - Fix double initialization and set suspend-freq for Apple's cpufreq driver (Arnd Bergmann, Hector Martin) - Fix reading of "reg" property, update cpufreq-dt's blocklist and update DT documentation for Qualcomm's cpufreq driver (Konrad Dybcio, Krzysztof Kozlowski) - Replace 0 with NULL in the Armada cpufreq driver (Miles Chen) - Fix potential overflows in the CPPC cpufreq driver (Pierre Gondois) - Update blocklist for the Tegra234 Soc cpufreq driver (Sumit Gupta)" * tag 'pm-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: amd-pstate: fix kernel hang issue while amd-pstate unregistering cpufreq: armada-37xx: stop using 0 as NULL pointer cpufreq: apple-soc: Switch to the lowest frequency on suspend dt-bindings: cpufreq: cpufreq-qcom-hw: document interrupts cpufreq: Add SM6375 to cpufreq-dt-platdev blocklist cpufreq: Add Tegra234 to cpufreq-dt-platdev blocklist cpufreq: qcom-hw: Fix reading "reg" with address/size-cells != 2 cpufreq: CPPC: Add u64 casts to avoid overflowing cpufreq: apple: remove duplicate intializer
2023-01-13Merge tag 'acpi-6.2-rc4' of ↵Linus Torvalds4-4/+28
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These add one more ACPI IRQ override quirk, improve ACPI companion lookup for backlight devices and add missing kernel command line option values for backlight detection. Specifics: - Improve ACPI companion lookup for backlight devices in the cases when there is more than one candidate ACPI device object (Hans de Goede) - Add missing support for manual selection of NVidia-WMI-EC or Apple GMUX backlight in the kernel command line to the ACPI backlight driver (Hans de Goede) - Skip ACPI IRQ override on Asus Expertbook B2402CBA (Tamim Khan)" * tag 'acpi-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: Fix selecting wrong ACPI fwnode for the iGPU on some Dell laptops ACPI: video: Allow selecting NVidia-WMI-EC or Apple GMUX backlight from the cmdline ACPI: resource: Skip IRQ override on Asus Expertbook B2402CBA
2023-01-13Merge tag 'platform-drivers-x86-v6.2-2' of ↵Linus Torvalds15-27/+143
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "A set of assorted fixes and hardware-id additions" * tag 'platform-drivers-x86-v6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: thinkpad_acpi: Fix profile mode display in AMT mode platform/x86: int3472/discrete: Ensure the clk/power enable pins are in output mode platform/x86/amd: Fix refcount leak in amd_pmc_probe platform/x86: intel/pmc/core: Add Meteor Lake mobile support platform/x86: simatic-ipc: add another model platform/x86: simatic-ipc: correct name of a model platform/x86: dell-privacy: Only register SW_CAMERA_LENS_COVER if present platform/x86: dell-privacy: Fix SW_CAMERA_LENS_COVER reporting platform/x86: asus-wmi: Don't load fan curves without fan platform/x86: asus-wmi: Ignore fan on E410MA platform/x86: asus-wmi: Add quirk wmi_ignore_fan platform/x86: asus-nb-wmi: Add alternate mapping for KEY_SCREENLOCK platform/x86: asus-nb-wmi: Add alternate mapping for KEY_CAMERA platform/surface: aggregator: Add missing call to ssam_request_sync_free() platform/surface: aggregator: Ignore command messages not intended for us platform/x86: touchscreen_dmi: Add info for the CSL Panther Tab HD platform/x86: ideapad-laptop: Add Legion 5 15ARH05 DMI id to set_fn_lock_led_list[] platform/x86: sony-laptop: Don't turn off 0x153 keyboard backlight during probe
2023-01-13Merge tag 'drm-fixes-2023-01-13' of git://anongit.freedesktop.org/drm/drmLinus Torvalds37-943/+325
Pull drm fixes from Dave Airlie: "There is a bit of a post-holiday build up here I expect, small fixes across the board, amdgpu and msm being the main leaders, with others having a few. One code removal patch for nouveau: buddy: - benchmark regression fix for top-down buddy allocation panel: - add Lenovo panel orientation quirk ttm: - fix kernel oops regression amdgpu: - fix missing fence references - fix missing pipeline sync fencing - SMU13 fan speed fix - SMU13 fix power cap handling - SMU13 BACO fix - Fix a possible segfault in bo validation error case - Delay removal of firmware framebuffer - Fix error when unloading amdkfd: - SVM fix when clearing vram - GC11 fix for multi-GPU i915: - Reserve enough fence slot for i915_vma_unbind_vsync - Fix potential use after free - Reset engines twice in case of reset failure - Use multi-cast registers for SVG Unit registers msm: - display: - doc warning fixes - dt attribs cleanups - memory leak fix - error handing in hdmi probe fix - dp_aux_isr incorrect signalling fix - shutdown path fix - accel: - a5xx: fix quirks to be a bitmask - a6xx: fix gx halt to avoid 1s hang - kexec shutdown fix - fix potential double free vmwgfx: - drop rcu usage to make code more robust virtio: - fix use-after-free in gem handle code nouveau: - drop unused nouveau_fbcon.c" * tag 'drm-fixes-2023-01-13' of git://anongit.freedesktop.org/drm/drm: (35 commits) drm: Optimize drm buddy top-down allocation method drm/ttm: Fix a regression causing kernel oops'es drm/i915/gt: Cover rest of SVG unit MCR registers drm/nouveau: Remove file nouveau_fbcon.c drm/amdkfd: Fix NULL pointer error for GC 11.0.1 on mGPU drm/amd/pm/smu13: BACO is supported when it's in BACO state drm/amdkfd: Add sync after creating vram bo drm/i915/gt: Reset twice drm/amdgpu: fix pipeline sync v2 drm/vmwgfx: Remove rcu locks from user resources drm/virtio: Fix GEM handle creation UAF drm/amdgpu: Fixed bug on error when unloading amdgpu drm/amd: Delay removal of the firmware framebuffer drm/amdgpu: Fix potential NULL dereference drm/i915: Fix potential context UAFs drm/i915: Reserve enough fence slot for i915_vma_unbind_async drm: Add orientation quirk for Lenovo ideapad D330-10IGL drm/msm/a6xx: Avoid gx gbit halt during rpm suspend drm/msm/adreno: Make adreno quirks not overwrite each other drm/msm: another fix for the headless Adreno GPU ...
2023-01-13Merge tag 'arm64-fixes' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Here's a sizeable batch of Friday the 13th arm64 fixes for -rc4. What could possibly go wrong? The obvious reason we have so much here is because of the holiday season right after the merge window, but we've also brought back an erratum workaround that was previously dropped at the last minute and there's an MTE coredumping fix that strays outside of the arch/arm64 directory. Summary: - Fix PAGE_TABLE_CHECK failures on hugepage splitting path - Fix PSCI encoding of MEM_PROTECT_RANGE function in UAPI header - Fix NULL deref when accessing debugfs node if PSCI is not present - Fix MTE core dumping when VMA list is being updated concurrently - Fix SME signal frame handling when SVE is not implemented by the CPU - Fix asm constraints for cmpxchg_double() to hazard both words - Fix build failure with stack tracer and older versions of Clang - Bring back workaround for Cortex-A715 erratum 2645198" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Fix build with CC=clang, CONFIG_FTRACE=y and CONFIG_STACK_TRACER=y arm64/mm: Define dummy pud_user_exec() when using 2-level page-table arm64: errata: Workaround possible Cortex-A715 [ESR|FAR]_ELx corruption firmware/psci: Don't register with debugfs if PSCI isn't available firmware/psci: Fix MEM_PROTECT_RANGE function numbers arm64/signal: Always allocate SVE signal frames on SME only systems arm64/signal: Always accept SVE signal frames on SME only systems arm64/sme: Fix context switch for SME only systems arm64: cmpxchg_double*: hazard against entire exchange variable arm64/uprobes: change the uprobe_opcode_t typedef to fix the sparse warning arm64: mte: Avoid the racy walk of the vma list during core dump elfcore: Add a cprm parameter to elf_core_extra_{phdrs,data_size} arm64: mte: Fix double-freeing of the temporary tag storage during coredump arm64: ptrace: Use ARM64_SME to guard the SME register enumerations arm64/mm: add pud_user_exec() check in pud_user_accessible_page() arm64/mm: fix incorrect file_map_count for invalid pmd
2023-01-13iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe()Christophe JAILLET1-1/+3
A clk, prepared and enabled in mtk_iommu_v1_hw_init(), is not released in the error handling path of mtk_iommu_v1_probe(). Add the corresponding clk_disable_unprepare(), as already done in the remove function. Fixes: b17336c55d89 ("iommu/mediatek: add support for mtk iommu generation one HW") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/593e7b7d97c6e064b29716b091a9d4fd122241fb.1671473163.git.christophe.jaillet@wanadoo.fr Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-13iommu/iova: Fix alloc iova overflows issueYunfei Wang1-2/+2
In __alloc_and_insert_iova_range, there is an issue that retry_pfn overflows. The value of iovad->anchor.pfn_hi is ~0UL, then when iovad->cached_node is iovad->anchor, curr_iova->pfn_hi + 1 will overflow. As a result, if the retry logic is executed, low_pfn is updated to 0, and then new_pfn < low_pfn returns false to make the allocation successful. This issue occurs in the following two situations: 1. The first iova size exceeds the domain size. When initializing iova domain, iovad->cached_node is assigned as iovad->anchor. For example, the iova domain size is 10M, start_pfn is 0x1_F000_0000, and the iova size allocated for the first time is 11M. The following is the log information, new->pfn_lo is smaller than iovad->cached_node. Example log as follows: [ 223.798112][T1705487] sh: [name:iova&]__alloc_and_insert_iova_range start_pfn:0x1f0000,retry_pfn:0x0,size:0xb00,limit_pfn:0x1f0a00 [ 223.799590][T1705487] sh: [name:iova&]__alloc_and_insert_iova_range success start_pfn:0x1f0000,new->pfn_lo:0x1efe00,new->pfn_hi:0x1f08ff 2. The node with the largest iova->pfn_lo value in the iova domain is deleted, iovad->cached_node will be updated to iovad->anchor, and then the alloc iova size exceeds the maximum iova size that can be allocated in the domain. After judging that retry_pfn is less than limit_pfn, call retry_pfn+1 to fix the overflow issue. Signed-off-by: jianjiao zeng <jianjiao.zeng@mediatek.com> Signed-off-by: Yunfei Wang <yf.wang@mediatek.com> Cc: <stable@vger.kernel.org> # 5.15.* Fixes: 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search fails") Acked-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20230111063801.25107-1-yf.wang@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-13iommu: Fix refcount leak in iommu_device_claim_dma_ownerMiaoqian Lin1-3/+5
iommu_group_get() returns the group with the reference incremented. Move iommu_group_get() after owner check to fix the refcount leak. Fixes: 89395ccedbc1 ("iommu: Add device-centric DMA ownership interfaces") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20221230083100.1489569-1-linmq006@gmail.com [ joro: Remove *group = NULL initialization ] Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-13iommu/arm-smmu-v3: Don't unregister on shutdownVladimir Oltean1-1/+3
Similar to SMMUv2, this driver calls iommu_device_unregister() from the shutdown path, which removes the IOMMU groups with no coordination whatsoever with their users - shutdown methods are optional in device drivers. This can lead to NULL pointer dereferences in those drivers' DMA API calls, or worse. Instead of calling the full arm_smmu_device_remove() from arm_smmu_device_shutdown(), let's pick only the relevant function call - arm_smmu_device_disable() - more or less the reverse of arm_smmu_device_reset() - and call just that from the shutdown path. Fixes: 57365a04c921 ("iommu: Move bus setup to IOMMU device registration") Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20221215141251.3688780-2-vladimir.oltean@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2023-01-13iommu/arm-smmu: Don't unregister on shutdownVladimir Oltean1-8/+14
Michael Walle says he noticed the following stack trace while performing a shutdown with "reboot -f". He suggests he got "lucky" and just hit the correct spot for the reboot while there was a packet transmission in flight. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098 CPU: 0 PID: 23 Comm: kworker/0:1 Not tainted 6.1.0-rc5-00088-gf3600ff8e322 #1930 Hardware name: Kontron KBox A-230-LS (DT) pc : iommu_get_dma_domain+0x14/0x20 lr : iommu_dma_map_page+0x9c/0x254 Call trace: iommu_get_dma_domain+0x14/0x20 dma_map_page_attrs+0x1ec/0x250 enetc_start_xmit+0x14c/0x10b0 enetc_xmit+0x60/0xdc dev_hard_start_xmit+0xb8/0x210 sch_direct_xmit+0x11c/0x420 __dev_queue_xmit+0x354/0xb20 ip6_finish_output2+0x280/0x5b0 __ip6_finish_output+0x15c/0x270 ip6_output+0x78/0x15c NF_HOOK.constprop.0+0x50/0xd0 mld_sendpack+0x1bc/0x320 mld_ifc_work+0x1d8/0x4dc process_one_work+0x1e8/0x460 worker_thread+0x178/0x534 kthread+0xe0/0xe4 ret_from_fork+0x10/0x20 Code: d503201f f9416800 d503233f d50323bf (f9404c00) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt This appears to be reproducible when the board has a fixed IP address, is ping flooded from another host, and "reboot -f" is used. The following is one more manifestation of the issue: $ reboot -f kvm: exiting hardware virtualization cfg80211: failed to load regulatory.db arm-smmu 5000000.iommu: disabling translation sdhci-esdhc 2140000.mmc: Removing from iommu group 11 sdhci-esdhc 2150000.mmc: Removing from iommu group 12 fsl-edma 22c0000.dma-controller: Removing from iommu group 17 dwc3 3100000.usb: Removing from iommu group 9 dwc3 3110000.usb: Removing from iommu group 10 ahci-qoriq 3200000.sata: Removing from iommu group 2 fsl-qdma 8380000.dma-controller: Removing from iommu group 20 platform f080000.display: Removing from iommu group 0 etnaviv-gpu f0c0000.gpu: Removing from iommu group 1 etnaviv etnaviv: Removing from iommu group 1 caam_jr 8010000.jr: Removing from iommu group 13 caam_jr 8020000.jr: Removing from iommu group 14 caam_jr 8030000.jr: Removing from iommu group 15 caam_jr 8040000.jr: Removing from iommu group 16 fsl_enetc 0000:00:00.0: Removing from iommu group 4 arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00000429, GFSYNR2 0x00000000 fsl_enetc 0000:00:00.1: Removing from iommu group 5 arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00000429, GFSYNR2 0x00000000 arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000000, GFSYNR1 0x00000429, GFSYNR2 0x00000000 fsl_enetc 0000:00:00.2: Removing from iommu group 6 fsl_enetc_mdio 0000:00:00.3: Removing from iommu group 8 mscc_felix 0000:00:00.5: Removing from iommu group 3 fsl_enetc 0000:00:00.6: Removing from iommu group 7 pcieport 0001:00:00.0: Removing from iommu group 18 arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications arm-smmu 5000000.iommu: GFSR 0x00000002, GFSYNR0 0x00000000, GFSYNR1 0x00000429, GFSYNR2 0x00000000 pcieport 0002:00:00.0: Removing from iommu group 19 Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a8 pc : iommu_get_dma_domain+0x14/0x20 lr : iommu_dma_unmap_page+0x38/0xe0 Call trace: iommu_get_dma_domain+0x14/0x20 dma_unmap_page_attrs+0x38/0x1d0 enetc_unmap_tx_buff.isra.0+0x6c/0x80 enetc_poll+0x170/0x910 __napi_poll+0x40/0x1e0 net_rx_action+0x164/0x37c __do_softirq+0x128/0x368 run_ksoftirqd+0x68/0x90 smpboot_thread_fn+0x14c/0x190 Code: d503201f f9416800 d503233f d50323bf (f9405400) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- The problem seems to be that iommu_group_remove_device() is allowed to run with no coordination whatsoever with the shutdown procedure of the enetc PCI device. In fact, it almost seems as if it implies that the pci_driver :: shutdown() method is mandatory if DMA is used with an IOMMU, otherwise this is inevitable. That was never the case; shutdown methods are optional in device drivers. This is the call stack that leads to iommu_group_remove_device() during reboot: kernel_restart -> device_shutdown -> platform_shutdown -> arm_smmu_device_shutdown -> arm_smmu_device_remove -> iommu_device_unregister -> bus_for_each_dev -> remove_iommu_group -> iommu_release_device -> iommu_group_remove_device I don't know much about the arm_smmu driver, but arm_smmu_device_shutdown() invoking arm_smmu_device_remove() looks suspicious, since it causes the IOMMU device to unregister and that's where everything starts to unravel. It forces all other devices which depend on IOMMU groups to also point their ->shutdown() to ->remove(), which will make reboot slower overall. There are 2 moments relevant to this behavior. First was commit b06c076ea962 ("Revert "iommu/arm-smmu: Make arm-smmu explicitly non-modular"") when arm_smmu_device_shutdown() was made to run the exact same thing as arm_smmu_device_remove(). Prior to that, there was no iommu_device_unregister() call in arm_smmu_device_shutdown(). However, that was benign until commit 57365a04c921 ("iommu: Move bus setup to IOMMU device registration"), which made iommu_device_unregister() call remove_iommu_group(). Restore the old shutdown behavior by making remove() call shutdown(), but shutdown() does not call the remove() specific bits. Fixes: 57365a04c921 ("iommu: Move bus setup to IOMMU device registration") Reported-by: Michael Walle <michael@walle.cc> Tested-by: Michael Walle <michael@walle.cc> # on kontron-sl28 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20221215141251.3688780-1-vladimir.oltean@nxp.com Signed-off-by: Will Deacon <will@kernel.org>