summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
AgeCommit message (Collapse)AuthorFilesLines
2022-04-14drm/amd/amdgpu: Remove static from variable in RLCG Reg RWGavin Wan1-5/+5
[why] These static variables save the RLC Scratch registers address. When we install multiple GPUs (for example: XGMI setting) and multiple GPUs call the function at same time. The RLC Scratch registers address are changed each other. Then it caused reading/writing from/to wrong GPU. [how] Removed the static from the variables. The variables are on the stack. Fixes: 5d447e29670148 ("drm/amdgpu: add helper for rlcg indirect reg access") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-13drm/amd/amdgpu: Not request init data for MS_HYPERV with vega10Yongqiang Sun1-2/+10
MS_HYPERV with vega10 doesn't have the interface to process request init data msg. Check hypervisor type to not send the request for MS_HYPERV. Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Reviewed-by: Alice Wong <shiwei.wong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-25drm/amdgpu: Fix spelling mistake "regiser" -> "register"Colin Ian King1-1/+1
There is a spelling mistake in a dev_error error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-25Merge tag 'drm-misc-next-2022-02-23' of ↵Dave Airlie1-2/+4
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.18: UAPI Changes: Cross-subsystem Changes: - Split out panel-lvds and lvds dt bindings . - Put yes/no on/off disabled/enabled strings in linux/string_helpers.h and use it in drivers and tomoyo. - Clarify dma_fence_chain and dma_fence_array should never include eachother. - Flatten chains in syncobj's. - Don't double add in fbdev/defio when page is already enlisted. - Don't sort deferred-I/O pages by default in fbdev. Core Changes: - Fix missing pm_runtime_put_sync in bridge. - Set modifier support to only linear fb modifier if drivers don't advertise support. - As a result, we remove allow_fb_modifiers. - Add missing clear for EDID Deep Color Modes in drm_reset_display_info. - Assorted documentation updates. - Warn once in drm_clflush if there is no arch support. - Add missing select for dp helper in drm_panel_edp. - Assorted small fixes. - Improve fb-helper's clipping handling. - Don't dump shmem mmaps in a core dump. - Add accounting to ttm resource manager, and use it in amdgpu. - Allow querying the detected eDP panel through debugfs. - Add helpers for xrgb8888 to 8 and 1 bits gray. - Improve drm's buddy allocator. - Add selftests for the buddy allocator. Driver Changes: - Add support for nomodeset to a lot of drm drivers. - Use drm_module_*_driver in a lot of drm drivers. - Assorted small fixes to bridge/lt9611, v3d, vc4, vmwgfx, mxsfb, nouveau, bridge/dw-hdmi, panfrost, lima, ingenic, sprd, bridge/anx7625, ti-sn65dsi86. - Add bridge/it6505. - Create DP and DVI-I connectors in ast. - Assorted nouveau backlight fixes. - Rework amdgpu reset handling. - Add dt bindings for ingenic,jz4780-dw-hdmi. - Support reading edid through aux channel in ingenic. - Add a drm driver for Solomon SSD130x OLED displays. - Add simple support for sharp LQ140M1JW46. - Add more panels to nt35560. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/686ec871-e77f-c230-22e5-9e3bb80f064a@linux.intel.com
2022-02-16drm/amdgpu: Fix wait for RLCG command completionVictor Skvortsov1-1/+1
if (!(tmp & flag)) condition will always evaluate to true when the flag is 0x0 (AMDGPU_RLCG_GC_WRITE). Instead check that address bits are cleared to determine whether the command is complete. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Tested-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-14drm/amdgpu: no rlcg legacy read in SRIOV caseGuchun Chen1-3/+3
rlcg legacy read is not available in SRIOV configration. Otherwise, gmc_v9_0_flush_gpu_tlb will always complain timeout and finally breaks driver load. v2: bypass read in amdgpu_virt_get_rlcg_reg_access_flag (from Victor) Fixes: 97d1a3b967a3cb ("drm/amdgpu: switch to get_rlcg_reg_access_flag for gfx9") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Victor Skvortsov <Victor.Skvortsov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-14drm/amdgpu: remove VRAM accounting v2Christian König1-2/+4
This is provided by TTM now. Also switch man->size to bytes instead of pages and fix the double printing of size and usage in debugfs. v2: fix size checking as well Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220214093439.2989-8-christian.koenig@amd.com
2022-02-02drm/amdgpu: drop flood print in rlcg reg access functionGuchun Chen1-3/+0
A lot of below message are outputed in SRIOV case. amdgpu: indirect registers access through rlcg is not supported Also drop redundant ret set, as it's initialized to be false already. Fixes: 29dbcac82f96d0 ("drm/amdgpu: add helper to query rlcg reg access flag") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-02drm/amdgpu: Fix uninitialized variable use warningLijo Lazar1-0/+1
Fix uninitialized variable use warning: variable 'reg_access_ctrl' is uninitialized when used here [-Wuninitialized] scratch_reg0 = (void __iomem *)adev->rmmio + 4 * reg_access_ctrl->scratch_reg0; Fixes: 5d447e29670148 ("drm/amdgpu: add helper for rlcg indirect reg access") Reported-by: kernel test robot <yujie.liu@intel.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25drm/amdgpu: retire rlc callbacks sriov_rreg/wregHawking Zhang1-2/+3
Not needed anymore. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25drm/amdgpu: add helper for rlcg indirect reg accessHawking Zhang1-0/+111
The helper will be used to access registers from sriov guest in full access time Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25drm/amdgpu: add helper to query rlcg reg access flagHawking Zhang1-0/+35
Query rlc indirect register access approach specified by sriov host driver per ip blocks Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-18drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOVJingwen Chen1-13/+7
[Why] This fixes 892deb48269c ("drm/amdgpu: Separate vf2pf work item init from virt data exchange"). we should read pf2vf data based at mman.fw_vram_usage_va after gmc sw_init. commit 892deb48269c breaks this logic. [How] calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to set the right base in the right sequence. v2: call amdgpu_virt_init_data_exchange after gmc sw_init to make data exchange workqueue run v3: clean up the code logic v4: add some comment and make the code more readable Fixes: 892deb48269c ("drm/amdgpu: Separate vf2pf work item init from virt data exchange") Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-18drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOVJingwen Chen1-13/+7
[Why] This fixes 892deb48269c ("drm/amdgpu: Separate vf2pf work item init from virt data exchange"). we should read pf2vf data based at mman.fw_vram_usage_va after gmc sw_init. commit 892deb48269c breaks this logic. [How] calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to set the right base in the right sequence. v2: call amdgpu_virt_init_data_exchange after gmc sw_init to make data exchange workqueue run v3: clean up the code logic v4: add some comment and make the code more readable Fixes: 892deb48269c ("drm/amdgpu: Separate vf2pf work item init from virt data exchange") Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11drm/amdgpu: do not pass ttm_resource_manager to vram_mgrNirmoy Das1-3/+2
Do not allow exported amdgpu_vram_mgr_*() to accept any ttm_resource_manager pointer. Also there is no need to force other module to call a ttm function just to eventually call vram_mgr functions. v2: pass adev's vram_mgr instead of adev Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07drm/amdgpu: add dummy event6 for vega10James Yao1-0/+4
[why] Malicious mailbox event1 fails driver loading on vega10. A dummy event6 prevent driver from taking response from malicious event1 as its own. [how] On vega10, send a mailbox event6 before sending event1. Signed-off-by: James Yao <yiqing.yao@amd.com> Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-16drm/amdgpu: Separate vf2pf work item init from virt data exchangeVictor Skvortsov1-12/+24
We want to be able to call virt data exchange conditionally after gmc sw init to reserve bad pages as early as possible. Since this is a conditional call, we will need to call it again unconditionally later in the init sequence. Refactor the data exchange function so it can be called multiple times without re-initializing the work item. v2: Cleaned up the code. Kept the original call to init_exchange_data() inside early init to initialize the work item, afterwards call exchange_data() when needed. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed By: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22drm/amd/amdgpu: cleanup the code style a bitBernard Zhao1-8/+13
This change is to cleanup the code style a bit. Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28drm/amdgpu: Update TA version output in driverCandice Li1-2/+2
TA version should only be displayed in firmware version column. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21drm/amd/amdgpu: add dummy_page_addr to sriov msgJingwen Chen1-0/+1
Add dummy_page_addr to sriov msg for host driver to set GCVM_L2_PROTECTION_DEFAULT_ADDR* registers correctly. v2: should update vf2pf msg instead Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14drm/amdgpu: Unify PSP TA contextCandice Li1-3/+6
Remove all TA binary structures and add the specific binary structure in struct ta_context. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16drm/amd/amdgpu: consolidate PSP TA contextCandice Li1-3/+3
Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23drm/amd/amdgpu: add consistent PSP FW loading size checkingCandice Li1-1/+1
Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-19drm/amdgpu: Fill adev->unique_id with data from PF2VF msgJiawei Gu1-0/+2
Initialize unique_id from PF2VF under virtualization. V2: skip smu_get_unique_id() under virtualization Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-19drm/amdgpu: Complete multimedia bandwidth interfaceBokun Zhang1-0/+56
- Update SRIOV PF2VF header with latest revision - Extend existing function in amdgpu_virt.c to read MM bandwidth config from PF2VF message - Add SRIOV Sienna Cichlid codec array and update the bandwidth with PF2VF message v2: squash in removal of unused variable (Alex) Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Monk liu <monk.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-10drm/amdgpu: Add Aldebaran virtualization supportZhigang Luo1-3/+7
1. add Aldebaran in virtualization detection list. 2. disable Aldebaran virtual display support as there is no GFX engine in Aldebaran. 3. skip TMR loading if Aldebaran is in virtualizatin mode as it shares the one host loaded. Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09drm/amdgpu: indirect register access for nv12 sriovPeng Ju Zhou1-0/+2
using the control bits got from host to control registers access. Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09drm/amdgpu: indirect register access for nv12 sriovPeng Ju Zhou1-0/+8
get pf2vf msg info at it's earliest time so that guest driver can use these info to decide whether register indirect access enabled. Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23drm/amdgpu: detect sriov capability for aldebaranHawking Zhang1-0/+1
SRIOV pf/vf function identifier regsiter in aldebaran is the same as the one in arcturus Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-25drm/amd/amdgpu: add error handling to amdgpu_virt_read_pf2vf_dataJingwen Chen1-1/+5
[Why] when vram lost happened in guest, try to write vram can lead to kernel stuck. [How] When the readback data is invalid, don't do write work, directly reschedule a new work. Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Monk Liu<monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-20drm/amd/amdgpu: remove redundant flush_delayed_workJingwen Chen1-1/+0
When using cancel_delayed_work_sync, there's no need to flush_delayed_work first. This sequence can lead to a redundant loop of work executing. Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-13drm/amdgpu/sriov Stop data exchange for wholegpu resetJack Zhang1-0/+1
[Why] When host trigger a whole gpu reset, guest will keep waiting till host finish reset. But there's a work queue in guest exchanging data between vf&pf which need to access frame buffer. During whole gpu reset, frame buffer is not accessable, and this causes the call trace. [How] After vf get reset notification from pf, stop data exchange. Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-12-10Merge tag 'amd-drm-next-5.11-2020-12-09' of ↵Dave Airlie1-9/+9
git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.11-2020-12-09: amdgpu: - SR-IOV fixes - Navy Flounder updates - Sienna Cichlid updates - Dimgrey Cavefish updates - Vangogh updates - Misc SMU fixes - Misc display fixes - Last big hunk of W=1 warning fixes - Cursor validation fixes - CI BACO updates From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201210045344.21566-1-alexander.deucher@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-11-24drm/amd/amdgpu/amdgpu_virt: Correct possible copy/paste or doc-rot misnaming ↵Lee Jones1-6/+6
issue Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:115: warning: Function parameter or member 'adev' not described in 'amdgpu_virt_request_full_gpu' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:115: warning: Excess function parameter 'amdgpu' description in 'amdgpu_virt_request_full_gpu' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:138: warning: Function parameter or member 'adev' not described in 'amdgpu_virt_release_full_gpu' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:138: warning: Excess function parameter 'amdgpu' description in 'amdgpu_virt_release_full_gpu' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:159: warning: Function parameter or member 'adev' not described in 'amdgpu_virt_reset_gpu' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:159: warning: Excess function parameter 'amdgpu' description in 'amdgpu_virt_reset_gpu' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:194: warning: Function parameter or member 'adev' not described in 'amdgpu_virt_wait_reset' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:194: warning: Excess function parameter 'amdgpu' description in 'amdgpu_virt_wait_reset' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:210: warning: Function parameter or member 'adev' not described in 'amdgpu_virt_alloc_mm_table' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:210: warning: Excess function parameter 'amdgpu' description in 'amdgpu_virt_alloc_mm_table' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:239: warning: Function parameter or member 'adev' not described in 'amdgpu_virt_free_mm_table' drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:239: warning: Excess function parameter 'amdgpu' description in 'amdgpu_virt_free_mm_table' Acked-by: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-24amd/amdgpu: use kmalloc_array to replace kmalloc with multiplyBernard Zhao1-2/+2
Fix check_patch.pl warning: WARNING: Prefer kmalloc_array over kmalloc with multiply +bps = kmalloc(align_space * sizeof((*data)->bps), GFP_KERNEL); WARNING: Prefer kmalloc_array over kmalloc with multiply +bps_bo = kmalloc(align_space * sizeof((*data)->bps_bo), GFP_KERNEL); kmalloc_array has multiply overflow check, which will be safer. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-13drm/amd/amdgpu/amdgpu_virt: Make local function ↵Lee Jones1-1/+1
'amdgpu_virt_update_vf2pf_work_item()' static Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:560:6: warning: no previous prototype for ‘amdgpu_virt_update_vf2pf_work_item’ [-Wmissing-prototypes] Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-06drm/amdgpu/virt: fix handling of the atomic flagAlex Deucher1-1/+3
Use the per device drm driver feature flags rather than the global one. This way we can make the drm driver struct const. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20201104100425.1922351-3-daniel.vetter@ffwll.ch
2020-09-25drm/amdgpu: Implement new guest side VF2PF message transaction (v2)Bokun Zhang1-54/+189
- Refactor the driver code to use amdgpu_virt_read_pf2vf_data and amdgpu_virt_write_vf2pf_data instead of writing all code in one function (which is the old amdgpu_virt_init_data_exchange) - Adding a new transaction method for VF2PF message between host and guest driver. Guest side will periodically update VF2PF message in the framebuffer. In the new header, we include guest ucode information, guest framebuffer usage, and engine usage - Clean up the old macros since they will cause compile error if the new transaction method is used v2: squash in build fix Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24drm/amdgpu: Get DRM dev from adev by inline-fLuben Tuikov1-1/+1
Add a static inline adev_to_drm() to obtain the DRM device pointer from an amdgpu_device pointer. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24drm/amdgpu: refine message print for devices of hiveDennis Li1-1/+1
Using dev_xxx instead of DRM_xxx/pr_xxx to indicate which device of a hive is the message for. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14drm/amdgpu: revert "fix system hang issue during GPU reset"Christian König1-1/+1
The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems like this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04drm/amdgpu: move vram usage by vbios to mman (v2)Alex Deucher1-3/+3
It's related to the memory manager so move it there. v2: inline the structure Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-27drm/amdgpu: fix system hang issue during GPU resetDennis Li1-1/+1
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Suggested-by: Luben Tukov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-02drm/amdgpu: request init data in virt detectionWenhui Sheng1-0/+28
Move request init data to virt detection func, so we can insert request full access between request init data and set ip blocks. Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amdgpu: label internally used symbols as staticNirmoy Das1-2/+2
Used sparse(make C=1) to find these loose ends. v2: removed unwanted extra line Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amdgpu: support reserve bad page for virt (v3)Stanley.Yang1-0/+173
v1: rename some functions name, only init ras error handler data for supported asic. v2: fix potential memory leak. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amdgpu/sriov : Add sriov detection for sienna_cichlidshaoyunl1-0/+1
This is a regression due to the rebase , add sienna_cichlid sriov detection back Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-18drm/amdgpu: add amdgpu_virt_get_vf_mode helper functionKevin Wang1-0/+16
the swsmu or powerplay(hwmgr) need to handle task according to different VF mode, this function to help query vf mode. vf mode: 1. SRIOV_VF_MODE_BARE_METAL: the driver work on host OS (PF) 2. SRIOV_VF_MODE_ONE_VF : the driver work on guest OS with one VF 3. SRIOV_VF_MODE_MULTI_VF : the driver work on guest OS with multi VF Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: protect ring overrunYintian Tao1-1/+7
Wait for the oldest sequence on the ring to be signaled in order to make sure there will be no command overrun. v2: fix coding stype and remove abs operation v3: remove the initialization of variable r Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-13drm/amdgpu: resume kiq access debugfsYintian Tao1-3/+9
If there is no GPU hang, user still can access debugfs through kiq. Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Monk Liu <Monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>