summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2022-07-29Merge tag 'powerpc-5.19-6' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Re-enable the new amdgpu display engine for powerpc, as long as the compiler is correctly configured. - Disable stack variable initialisation in prom_init to fix GCC 12 allmodconfig. Thanks to Dan Horák and Sudip Mukherjee. * tag 'powerpc-5.19-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: drm/amdgpu: Re-enable DCN for 64-bit powerpc powerpc/64s: Disable stack variable initialisation for prom_init
2022-07-26drm/amdgpu: Re-enable DCN for 64-bit powerpcMichael Ellerman1-1/+1
Commit d11219ad53dc ("amdgpu: disable powerpc support for the newer display engine") disabled the DCN driver for all of powerpc due to unresolved build failures with some compilers. Further digging shows that the build failures only occur with compilers that default to 64-bit long double. Both the ppc64 and ppc64le ABIs define long double to be 128-bits, but there are compilers in the wild that default to 64-bits. The compilers provided by the major distros (Fedora, Ubuntu) default to 128-bits and are not affected by the build failure. There is a compiler flag to force 128-bit long double, which may be the correct long term fix, but as an interim fix only allow building the DCN driver if long double is 128-bits by default. The bisection in commit d11219ad53dc must have gone off the rails at some point, the build failure occurs all the way back to the original commit that enabled DCN support on powerpc, at least with some toolchains. Depends-on: d11219ad53dc ("amdgpu: disable powerpc support for the newer display engine") Fixes: 16a9dea110a6 ("amdgpu: Enable initial DCN support on POWER") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Dan Horák <dan@danny.cz> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2100 Link: https://lore.kernel.org/r/20220725123918.1903255-1-mpe@ellerman.id.au
2022-07-21Merge tag 'amd-drm-fixes-5.19-2022-07-20' of ↵Dave Airlie5-18/+38
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.19-2022-07-20: amdgpu: - Drop redundant buffer cleanup that can lead to a segfault - Add a bo_list mutex to avoid possible list corruption in CS Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220720210917.6202-1-alexander.deucher@amd.com
2022-07-20drm/amdgpu: Protect the amdgpu_bo_list list with a mutex v2Luben Tuikov3-4/+19
Protect the struct amdgpu_bo_list with a mutex. This is used during command submission in order to avoid buffer object corruption as recorded in the link below. v2 (chk): Keep the mutex looked for the whole CS to avoid using the list from multiple CS threads at the same time. Suggested-by: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Vitaly Prosyak <Vitaly.Prosyak@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2048 Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-07-18drm/amdgpu: Remove one duplicated ef removalxinhui pan1-6/+0
That has been done in BO release notify. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2074 Signed-off-by: xinhui pan <xinhui.pan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-15Merge tag 'drm-fixes-2022-07-15' of git://anongit.freedesktop.org/drm/drmLinus Torvalds10-388/+291
Pull drm fixes from Dave Airlie: "This is the regular fixes pull for this week. This has a bunch of amdgpu fixes, major one reverts the buddy allocator until it can be tested more, otherwise just small ones, then i915 has a bunch of fixes. The outstanding firmware regressions reported by phoronix will hopefully be dealt with ASAP. amdgpu: - revert buddy allocator support for now - DP MST blank screen fix for specific platforms - MEC firmware check fix for GC 10.3.7 - Deep color fix for DCE - Fix possible divide by 0 - Coverage blend mode fix - Fix cursor only commit timestamps i915: - Selftest fix - TTM fix sg_table construction - Error return fixes - Fix a performance regression related to waitboost - Fix GT resets" * tag 'drm-fixes-2022-07-15' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Ensure valid event timestamp for cursor-only commits drm/amd/display: correct check of coverage blend mode drm/amd/pm: Prevent divide by zero drm/amd/display: Only use depth 36 bpp linebuffers on DCN display engines. drm/amdkfd: correct the MEC atomic support firmware checking for GC 10.3.7 drm/amd/display: Ignore First MST Sideband Message Return Error drm/i915/selftests: fix subtraction overflow bug drm/i915/gem: Look for waitboosting across the whole object prior to individual waits drm/i915/gt: Serialize TLB invalidates with GT resets drm/i915/gt: Serialize GRDOM access between multiple engine resets drm/i915/ttm: fix sg_table construction drm/i915/selftests: fix a couple IS_ERR() vs NULL tests drm/i915: Fix vm use-after-free in vma destruction drm/i915/guc: ADL-N should use the same GuC FW as ADL-S drm/i915: fix a possible refcount leak in intel_dp_add_mst_connector() drm/i915/gvt: IS_ERR() vs NULL bug in intel_gvt_update_reg_whitelist() Revert "drm/amdgpu: add drm buddy support to amdgpu"
2022-07-15drm/amd/display: Fix new dmub notification enabling in DMStylon Wang1-8/+19
[Why] Changes from "Fix for dmub outbox notification enable" need to land in DM or DMUB outbox notification would be disabled. [How] Enable outbox notification only after interrupt are enabled and IRQ handlers registered. Any pending notification will be sent by DMUB once outbox notification is enabled. Fixes: ed7208706448 ("drm/amd/display: Fix for dmub outbox notification enable") Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Solomon Chiu <solomon.chiu@amd.com> Signed-off-by: Stylon Wang <stylon.wang@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-07-15Merge tag 'amd-drm-fixes-5.19-2022-07-13' of ↵Dave Airlie6-9/+115
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.19-2022-07-13: amdgpu: - DP MST blank screen fix for specific platforms - MEC firmware check fix for GC 10.3.7 - Deep color fix for DCE - Fix possible divide by 0 - Coverage blend mode fix - Fix cursor only commit timestamps Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220713172920.6037-1-alexander.deucher@amd.com
2022-07-15Merge tag 'drm-misc-fixes-2022-07-14' of ↵Dave Airlie4-379/+176
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Only a revert for amdgpu reverting the switch to the drm buddy allocator. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220714071821.hsejxpsgkbbzlec2@houat
2022-07-14amdgpu: disable powerpc support for the newer display engineLinus Torvalds1-1/+1
The DRM_AMD_DC_DCN display engine support (Raven, Navi, and newer) has not been building cleanly on powerpc and causes link errors due to mixing hard- and soft-float object files: powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o uses soft float powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o [..] and while patches are floating around, it's not exactly obvious what is going on. The problem bisects to commit 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN support") but that is probably more about changing config variables than the fundamental cause. Despite the bisection result, a more directly related commit seems to be 26f4712aedbd ("drm/amd/display: move FPU related code from dcn31 to dml/dcn31 folder"). It's probably a combination of the two. This has been going on since the merge window, without any final word. So instead of blindly applying patches that may or may not be the right thing, let's disable this for now. As Michael Ellerman says: "IIUIC this code was never enabled on ppc before, so disabling it seems like a reasonable fix to get the build clean" and once we have more actual feedback (and find any potential users) we can always re-enable it with the patch that fixes the issues and back-port as necessary. Fixes: 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN support") Fixes: 26f4712aedbd ("drm/amd/display: move FPU related code from dcn31 to dml/dcn31 folder") Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/all/20220606153910.GA1773067@roeck-us.net/ Link: https://lore.kernel.org/all/20220618232737.2036722-1-linux@roeck-us.net/ Link: https://lore.kernel.org/all/20220713050724.GA2471738@roeck-us.net/ Acked-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-13drm/amd/display: Ensure valid event timestamp for cursor-only commitsMichel Dänzer1-3/+40
Requires enabling the vblank machinery for them. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2030 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-07-13drm/amd/display: correct check of coverage blend modeMelissa Wen1-1/+1
Check the value of per_pixel_alpha to decide whether the Coverage pixel blend mode is applicable or not. Fixes: 76818cdd11a2 ("drm/amd/display: add Coverage blend mode for overlay plane") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13drm/amd/pm: Prevent divide by zeroYefim Barashkin1-0/+2
divide error: 0000 [#1] SMP PTI CPU: 3 PID: 78925 Comm: tee Not tainted 5.15.50-1-lts #1 Hardware name: MSI MS-7A59/Z270 SLI PLUS (MS-7A59), BIOS 1.90 01/30/2018 RIP: 0010:smu_v11_0_set_fan_speed_rpm+0x11/0x110 [amdgpu] Speed is user-configurable through a file. I accidentally set it to zero, and the driver crashed. Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Yefim Barashkin <mr.b34r@kolabnow.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-07-13drm/amd/display: Only use depth 36 bpp linebuffers on DCN display engines.Mario Kleiner1-5/+6
Various DCE versions had trouble with 36 bpp lb depth, requiring fixes, last time in commit 353ca0fa5630 ("drm/amd/display: Fix 10bit 4K display on CIK GPUs") for DCE-8. So far >= DCE-11.2 was considered ok, but now I found out that on DCE-11.2 it causes dithering when there shouldn't be any, so identity pixel passthrough with identity gamma LUTs doesn't work when it should. This breaks various important neuroscience applications, as reported to me by scientific users of Polaris cards under Ubuntu 22.04 with Linux 5.15, and confirmed by testing it myself on DCE-11.2. Lets only use depth 36 for DCN engines, where my testing showed that it is both necessary for high color precision output, e.g., RGBA16 fb's, and not harmful, as far as more than one year in real-world use showed. DCE engines seem to work fine for high precision output at 30 bpp, so this ("famous last words") depth 30 should hopefully fix all known problems without introducing new ones. Successfully retested on DCE-11.2 Polaris and DCN-1.0 Raven Ridge on top of Linux 5.19.0-rc2 + drm-next. Fixes: 353ca0fa5630 ("drm/amd/display: Fix 10bit 4K display on CIK GPUs") Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: stable@vger.kernel.org # 5.14.0 Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13drm/amdkfd: correct the MEC atomic support firmware checking for GC 10.3.7Prike Liang1-0/+2
On the GC 10.3.7 platform the initial MEC release version #3 can support atomic operation,so need correct and set its MEC atomic support version to #3. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.18.x
2022-07-13drm/amd/display: Ignore First MST Sideband Message Return ErrorFangzhi Zuo3-0/+64
[why] First MST sideband message returns AUX_RET_ERROR_HPD_DISCON on certain intel platform. Aux transaction considered failure if HPD unexpected pulled low. The actual aux transaction success in such case, hence do not return error. [how] Not returning error when AUX_RET_ERROR_HPD_DISCON detected on the first sideband message. v2: squash in additional DMI entries v3: squash in static fix Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Acked-by: Solomon Chiu <solomon.chiu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-07-08Revert "drm/amdgpu: add drm buddy support to amdgpu"Arunpravin Paneer Selvam4-379/+176
This reverts commit c9cad937c0c58618fe5b0310fd539a854dc1ae95. This is part of a revert of the following commits: commit 708d19d9f362 ("drm/amdgpu: move internal vram_mgr function into the C file") commit 5e3f1e7729ec ("drm/amdgpu: fix start calculation in amdgpu_vram_mgr_new") commit c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu") [WHY] Few users reported garbaged graphics as soon as x starts, reverting until this can be resolved. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220708093047.492662-3-Arunpravin.PaneerSelvam@amd.com Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2022-07-06drm/amdgpu/display: disable prefer_shadow for generic fb helpersAlex Deucher6-6/+12
Seems to break hibernation. Disable for now until we can root cause it. Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Bug: https://bugzilla.kernel.org/show_bug.cgi?id=216119 Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-06drm/amdgpu: keep fbdev buffers pinned during suspendAlex Deucher1-4/+21
Was dropped when we converted to the generic helpers. Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-29Revert "drm/amdgpu/display: set vblank_disable_immediate for DC"Alex Deucher2-3/+1
This reverts commit 92020e81ddbeac351ea4a19bcf01743f32b9c800. This causes stuttering and timeouts with DMCUB for some users so revert it until we understand why and safely enable it to save power. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1887 Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: stable@vger.kernel.org
2022-06-29drm/amdgpu: To flush tlb for MMHUB of RAVEN seriesRuili Ji1-1/+2
amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:8 pasid:32769, for process test_basic pid 3305 thread test_basic pid 3305) amdgpu: in page starting at address 0x00007ff990003000 from IH client 0x12 (VMC) amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00840051 amdgpu: Faulty UTCL2 client ID: MP1 (0x0) amdgpu: MORE_FAULTS: 0x1 amdgpu: WALKER_ERROR: 0x0 amdgpu: PERMISSION_FAULTS: 0x5 amdgpu: MAPPING_ERROR: 0x0 amdgpu: RW: 0x1 When memory is allocated by kfd, no one triggers the tlb flush for MMHUB0. There is page fault from MMHUB0. v2:fix indentation v3:change subject and fix indentation Signed-off-by: Ruili Ji <ruiliji2@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-06-29drm/amdgpu: fix adev variable used in amdgpu_device_gpu_recover()Alex Deucher1-1/+1
Use the correct adev variable for the drm_fb_helper in amdgpu_device_gpu_recover(). Noticed by inspection. Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-06-22amd/display/dc: Fix COLOR_ENCODING and COLOR_RANGE doing nothing for DCN20+Joshua Ashton3-0/+9
For DCN20 and above, the code that actually hooks up the provided input_color_space got lost at some point. Fixes COLOR_ENCODING and COLOR_RANGE doing nothing on DCN20+. Tested using Steam Remote Play Together + gamescope. Update other DCNs the same wasy DCN1.x was updates in commit a1e07ba89d49 ("drm/amd/display: Use plane->color_space for dpp if specified") Fixes: a1e07ba89d49 ("drm/amd/display: Use plane->color_space for dpp if specified") Signed-off-by: Joshua Ashton <joshua@froggi.es> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-06-22drm/amd/display: Fix typo in override_lane_settingsGeorge Shen1-1/+1
[Why] The function currently skips overriding the drive settings of the first lane. [How] Change for loop to start at 0 instead of 1. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-06-22drm/amd/display: Fix DC warning at driver loadQingqing Zhuo1-1/+1
[Why] Wrong index was checked for dcfclk_mhz, causing false warning. [How] Fix the assertion index. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.18.x
2022-06-22drm/amd: Revert "drm/amd/display: keep eDP Vdd on when eDP stream is already ↵Mario Limonciello1-22/+2
enabled" A variety of Lenovo machines with Rembrandt APUs and OLED panels have stopped showing the display at login. This behavior clears up after leaving it idle and moving the mouse or touching keyboard. It was bisected to be caused by commit 559e2655220d ("drm/amd/display: keep eDP Vdd on when eDP stream is already enabled"). Revert this commit to fix the issue. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2047 Reported-by: Aaron Ma <aaron.ma@canonical.com> Fixes: 559e2655220d ("drm/amd/display: keep eDP Vdd on when eDP stream is already enabled") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mark Pearson <markpearson@lenovo.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-22drm/amdgpu: Adjust logic around GTT size (v3)Alex Deucher1-6/+14
Certain GL unit tests for large textures can cause problems with the OOM killer since there is no way to link this memory to a process. This was originally mitigated (but not necessarily eliminated) by limiting the GTT size. The problem is this limit is often too low for many modern games so just make the limit 1/2 of system memory. The OOM accounting needs to be addressed, but we shouldn't prevent common 3D applications from being usable just to potentially mitigate that corner case. Set default GTT size to max(3G, 1/2 of system ram) by default. v2: drop previous logic and default to 3/4 of ram v3: default to half of ram to align with ttm v4: fix spelling in comment (Kent) Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1942 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-14drm/amd/display: Cap OLED brightness per max frame-average luminanceRoman Li1-4/+4
[Why] For OLED eDP the Display Manager uses max_cll value as a limit for brightness control. max_cll defines the content light luminance for individual pixel. Whereas max_fall defines frame-average level luminance. The user may not observe the difference in brightness in between max_fall and max_cll. That negatively impacts the user experience. [How] Use max_fall value instead of max_cll as a limit for brightness control. Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-06-14drm/amdgpu: Fix GTT size reporting in amdgpu_ioctlMichel Dänzer1-2/+0
The commit below changed the TTM manager size unit from pages to bytes, but failed to adjust the corresponding calculations in amdgpu_ioctl. Fixes: dfa714b88eb0 ("drm/amdgpu: remove GTT accounting v2") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1930 Bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6642 Tested-by: Martin Roukala <martin.roukala@mupuf.org> Tested-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.18.x
2022-06-09Merge tag 'amd-drm-fixes-5.19-2022-06-08' of ↵Dave Airlie37-251/+330
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.19-2022-06-08: amdgpu: - DCN 3.1 golden settings fix - eDP fixes - DMCUB fixes - GFX11 fixes and cleanups - VCN fix for yellow carp - GMC11 fixes - RAS fixes - GPUVM TLB flush fixes - SMU13 fixes - VCN3 AV1 regression fix - VCN2 JPEG fix - Other misc fixes amdkfd: - MMU notifier fix - Support for more GC 10.3.x families - Pinned BO handling fix - Partial migration bug fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220608203008.6187-1-alexander.deucher@amd.com
2022-06-08drm/amdgpu/mes: only invalid/prime icache when finish loading both pipe MES FWs.Yifan Zhang1-16/+20
invalid/prime icahce operation takes effect both pipes cuconrrently, therefore CP_MES_IC_BASE_LO/HI and CP_MES_MDBASE_LO/HI both have to be set before prime icache. Otherwise MES hardware gets garbage data in above regsters and causes page fault [ 470.873200] amdgpu 0000:33:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:217 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 470.873222] amdgpu 0000:33:00.0: amdgpu: in page starting at address 0x000092cb89b00000 from client 10 [ 470.873234] amdgpu 0000:33:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000BB3 [ 470.873242] amdgpu 0000:33:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 470.873247] amdgpu 0000:33:00.0: amdgpu: MORE_FAULTS: 0x1 [ 470.873251] amdgpu 0000:33:00.0: amdgpu: WALKER_ERROR: 0x1 [ 470.873256] amdgpu 0000:33:00.0: amdgpu: PERMISSION_FAULTS: 0xb [ 470.873260] amdgpu 0000:33:00.0: amdgpu: MAPPING_ERROR: 0x1 [ 470.873264] amdgpu 0000:33:00.0: amdgpu: RW: 0x0 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-08drm/amdgpu/jpeg2: Add jpeg vmid update under IB submitMohammad Zafar Ziya2-1/+6
Add jpeg vmid update under IB submit Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-06-08drm/amdgpu: always flush the TLB on gfx8Christian König1-0/+5
The TLB on GFX8 stores each block of 8 PTEs where any of the valid bits are set. Fixes: 5255e146c99a ("drm/amdgpu: rework TLB flushing") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-08drm/amdgpu: fix limiting AV1 to the first instance on VCN3Christian König1-10/+7
The job is not yet initialized here. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2037 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: cdc7893fc93f ("drm/amdgpu: use job and ib structures directly in CS parsers") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-08drm/amdkfd:Fix fw version for 10.3.6Jesse Zhang1-1/+3
fix fw error when loading fw for 10.3.6 Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.18.x
2022-06-07drm/amdgpu: Add MODE register to wave debug info in gfx11Joseph Greathouse1-0/+1
All other chips, from gfx6-gfx10, now include the MODE register at the end of the wave debug state. This appears to have been missed in gfx11, so this patch adds in MODE to the debug state for gfx11. Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-07Revert "drm/amd/display: Pass the new context into disable OTG WA"Nicholas Kazlauskas3-12/+12
This reverts commit 8440f57532496d398a461887e56ca6f45089fbcf. Causes a hang when hotplugging DP, shutting down system, or enabling dual eDP. Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-07Revert "drm/amdgpu: Ensure the DMA engine is deactivated during set ups"Guchun Chen1-64/+45
This reverts commit b992a19085885c096b19625a85c674cb89829ca1. This causes regression in GPU reset related test. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: ricetons@gmail.com Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-04Merge tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linuxLinus Torvalds2-2/+2
Pull bitmap updates from Yury Norov: - bitmap: optimize bitmap_weight() usage, from me - lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro Carvalho Chehab - include/linux/find: Fix documentation, from Anna-Maria Behnsen - bitmap: fix conversion from/to fix-sized arrays, from me - bitmap: Fix return values to be unsigned, from Kees Cook It has been in linux-next for at least a week with no problems. * tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux: (31 commits) nodemask: Fix return values to be unsigned bitmap: Fix return values to be unsigned KVM: x86: hyper-v: replace bitmap_weight() with hweight64() KVM: x86: hyper-v: fix type of valid_bank_mask ia64: cleanup remove_siblinginfo() drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriate lib/bitmap: add test for bitmap_{from,to}_arr64 lib: add bitmap_{from,to}_arr64 lib/bitmap: extend comment for bitmap_(from,to)_arr32() include/linux/find: Fix documentation lib/bitmap.c make bitmap_print_bitmask_to_buf parseable MAINTAINERS: add cpumask and nodemask files to BITMAP_API arch/x86: replace nodes_weight with nodes_empty where appropriate mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate clocksource: replace cpumask_weight with cpumask_empty in clocksource.c genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate irq: mips: replace cpumask_weight with cpumask_empty where appropriate drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate arch/x86: replace cpumask_weight with cpumask_empty where appropriate ...
2022-06-03drm/amdgpu: suppress the compile warning about 64 bit typeEvan Quan1-1/+1
Suppress the compile warning below: drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1292 gfx_v11_0_rlc_backdoor_autoload_copy_ucode() warn: should '1 << id' be a 64 bit type? Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03drm/amd/pm: suppress compile warnings about possible unaligned accessesEvan Quan4-10/+23
Suppress the following compile warnings: >> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v11_0_pptable.h:163:17: warning: field smc_pptable within 'struct smu_11_0_powerplay_table' is less aligned than 'PPTable_t' and is usually due to 'struct smu_11_0_powerplay_table' being packed, which can lead to unaligned accesses [-Wunaligned-access] PPTable_t smc_pptable; //PPTable_t in smu11_driver_if.h ^ 1 warning generated. -- >> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v11_0_7_pptable.h:193:17: warning: field smc_pptable within 'struct smu_11_0_7_powerplay_table' is less aligned than 'PPTable_t' and is usually due to 'struct smu_11_0_7_powerplay_table' being packed, which can lead to unaligned accesses [-Wunaligned-access] PPTable_t smc_pptable; //PPTable_t in smu11_driver_if.h ^ 1 warning generated. -- >> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v13_0_pptable.h:161:12: warning: field smc_pptable within 'struct smu_13_0_powerplay_table' is less aligned than 'PPTable_t' and is usually due to 'struct smu_13_0_powerplay_table' being packed, which can lead to unaligned accesses [-Wunaligned-access] Signed-off-by: Evan Quan <evan.quan@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03drm/amdkfd: Fix partial migration bugsPhilip Yang2-4/+4
Migration range from system memory to VRAM, if system page can not be locked or unmapped, we do partial migration and leave some pages in system memory. Several bugs found to copy pages and update GPU mapping for this situation: 1. copy to vram should use migrate->npage which is total pages of range as npages, not migrate->cpages which is number of pages can be migrated. 2. After partial copy, set VRAM res cursor as j + 1, j is number of system pages copied plus 1 page to skip copy. 3. copy to ram, should collect all continuous VRAM pages and copy together. 4. Call amdgpu_vm_update_range, should pass in offset as bytes, not as number of pages. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-06-03drm/amdkfd: add pinned BOs to kfd_bo_listLang Yu1-7/+6
The kfd_bo_list is used to restore process BOs after evictions. As page tables could be destroyed during evictions, we should also update pinned BOs' page tables during restoring to make sure they are valid. So for pinned BOs, 1, Validate them and update their page tables. 2, Don't add eviction fence for them. v2: - Don't handle pinned ones specially in BO validation.(Felix) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03drm/amdgpu: Update PDEs flush TLB if PTB/PDB movedPhilip Yang1-2/+6
Flush TLBs when existing PDEs are updated because a PTB or PDB moved, but avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03drm/amdgpu: enable tmz by default for GC 10.3.7Sunil Khatri1-2/+2
Add IP GC 10.3.7 in the list of target to have tmz enabled by default. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.18.x
2022-06-03drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD definitionsMario Limonciello2-0/+16
Loading amdgpu on GC 10.3.7 shows an ERR level message: `kfd kfd: amdgpu: GC IP 0a0307 not supported in kfd` Add these targets to match yellow carp structures. Reported-by: David Chang <david.chang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Jesse(Jie) Zhang <Jesse.Zhang@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.18.x
2022-06-03Merge tag 'drm-next-2022-06-03-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds80-2531/+3531
Pull more drm updates from Dave Airlie: "This is mostly regular fixes, msm and amdgpu. There is a tegra patch that is bit of prep work for a 5.20 feature to avoid some inter-tree syncs, and a couple of late addition amdgpu uAPI changes but best to get those in early, and the userspace pieces are ready. msm: - Limiting WB modes to max sspp linewidth - Fixing the supported rotations to add 180 back for IGT - Fix to handle pm_runtime_get_sync() errors to avoid unclocked access in the bind() path for dpu driver - Fix the irq_free() without request issue which was a big-time hitter in the CI-runs. amdgpu: - Update fdinfo to the common drm format - uapi: - Add VM_NOALLOC GPUVM attribute to prevent buffers for going into the MALL - Add AMDGPU_GEM_CREATE_DISCARDABLE flag to create buffers that can be discarded on eviction - Mesa code which uses these: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16466 - Link training fixes - DPIA fixes - Misc code cleanups - Aux fixes - Hotplug fixes - More FP clean up - Misc GFX9/10 fixes - Fix a possible memory leak in SMU shutdown - SMU 13 updates - RAS fixes - TMZ fixes - GC 11 updates - SMU 11 metrics fixes - Fix coverage blend mode for overlay plane - Note DDR vs LPDDR memory - Fuzz fix for CS IOCTL - Add new PCI DID amdkfd: - Clean up hive setup - Misc fixes tegra: - add some prelim 5.20 work to avoid inter-tree mess" * tag 'drm-next-2022-06-03-1' of git://anongit.freedesktop.org/drm/drm: (57 commits) drm/msm/dpu: Move min BW request and full BW disable back to mdss drm/msm/dpu: Fix pointer dereferenced before checking drm/msm/dpu: Remove unused code drm/msm/disp/dpu1: remove superfluous init drm/msm/dp: Always clear mask bits to disable interrupts at dp_ctrl_reset_irq_ctrl() gpu: host1x: Add context bus drm/amdgpu: add drm-client-id to fdinfo v2 drm/amdgpu: Convert to common fdinfo format v5 drm/amdgpu: bump minor version number drm/amdgpu: add AMDGPU_VM_NOALLOC v2 drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE drm/amdgpu: add beige goby PCI ID drm/amd/pm: Return auto perf level, if unsupported drm/amdkfd: fix typo in comment drm/amdgpu/gfx: fix typos in comments drm/amdgpu/cs: make commands with 0 chunks illegal behaviour. drm/amdgpu: differentiate between LP and non-LP DDR memory drm/amdgpu: Resolve pcie_bif RAS recovery bug drm/amdgpu: clean up asd on the ta_firmware_header_v2_0 drm/amdgpu/discovery: validate VCN and SDMA instances ...
2022-06-03drm/amd/pm: use bitmap_{from,to}_arr32 where appropriateYury Norov2-2/+2
The smu_v1X_0_set_allowed_mask() uses bitmap_copy() to convert bitmap to 32-bit array. This may be wrong due to endiannes issues. Fix it by switching to bitmap_{from,to}_arr32. CC: Alexander Gordeev <agordeev@linux.ibm.com> CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com> CC: Christian Borntraeger <borntraeger@linux.ibm.com> CC: Claudio Imbrenda <imbrenda@linux.ibm.com> CC: David Hildenbrand <david@redhat.com> CC: Heiko Carstens <hca@linux.ibm.com> CC: Janosch Frank <frankja@linux.ibm.com> CC: Rasmus Villemoes <linux@rasmusvillemoes.dk> CC: Sven Schnelle <svens@linux.ibm.com> CC: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-01drm/amdkfd: Use mmget_not_zero in MMU notifierPhilip Yang1-0/+3
MMU notifier callback may pass in mm with mm->mm_users==0 when process is exiting, use mmget_no_zero to avoid accessing invalid mm in deferred list work after mm is gone. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01drm/amdgpu: Resolve RAS GFX error count issue after cold boot on ArcturusCandice Li2-8/+28
Adjust the sequence for ras late init and separate ras reset error status from query status. v2: squash in fix from Candice Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>