summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
AgeCommit message (Collapse)AuthorFilesLines
2020-07-14drm/amdgpu: fix preemption unit testJack Xiao1-5/+15
Remove signaled jobs from job list and ensure the job was indeed preempted. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-18drm/amdgpu: Add autodump debugfs node for gpu reset v8Jiange Zhao1-1/+77
When GPU got timeout, it would notify an interested part of an opportunity to dump info before actual GPU reset. A usermode app would open 'autodump' node under debugfs system and poll() for readable/writable. When a GPU reset is due, amdgpu would notify usermode app through wait_queue_head and give it 10 minutes to dump info. After usermode app has done its work, this 'autodump' node is closed. On node closure, amdgpu gets to know the dump is done through the completion that is triggered in release(). There is no write or read callback because necessary info can be obtained through dmesg and umr. Messages back and forth between usermode app and amdgpu are unnecessary. v2: (1) changed 'registered' to 'app_listening' (2) add a mutex in open() to prevent race condition v3 (chk): grab the reset lock to avoid race in autodump_open, rename debugfs file to amdgpu_autodump, provide autodump_read as well, style and code cleanups v4: add 'bool app_listening' to differentiate situations, so that the node can be reopened; also, there is no need to wait for completion when no app is waiting for a dump. v5: change 'bool app_listening' to 'enum amdgpu_autodump_state' add 'app_state_mutex' for race conditions: (1)Only 1 user can open this file node (2)wait_dump() can only take effect after poll() executed. (3)eliminated the race condition between release() and wait_dump() v6: removed 'enum amdgpu_autodump_state' and 'app_state_mutex' removed state checking in amdgpu_debugfs_wait_dump Improve on top of version 3 so that the node can be reopened. v7: move reinit_completion into open() so that only one user can open it. v8: remove complete_all() from amdgpu_debugfs_wait_dump(). Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-13drm/amdgpu: restrict debugfs register access under SR-IOVYintian Tao1-4/+69
Under bare metal, there is no more else to take care of the GPU register access through MMIO. Under Virtualization, to access GPU register is implemented through KIQ during run-time due to world-switch. Therefore, under SR-IOV user can only access debugfs to r/w GPU registers when meets all three conditions below. - amdgpu_gpu_recovery=0 - TDR happened - in_gpu_reset=0 v2: merge amdgpu_virt_can_access_debugfs() into amdgpu_virt_enable_access_debugfs() v3: drop ret variable in amdgpu_virt_enable_access_debugfs() and directly return result Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-16drm/amdgpu: revise RLCG access pathMonk Liu1-1/+1
what changed: 1)provide new implementation interface for the rlcg access path 2)put SQ_CMD/SQ_IND_INDEX to GFX9 RLCG path to let debugfs's reg_op function can access reg that need RLCG path help now even debugfs's reg_op can used to dump wave. tested-by: Monk Liu <monk.liu@amd.com> tested-by: Zhou pengju <pengju.zhou@amd.com> Signed-off-by: Zhou pengju <pengju.zhou@amd.com> Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-13drm/amd/amdgpu: Fix GPR read from debugfs (v2)Tom St Denis1-3/+3
The offset into the array was specified in bytes but should be in terms of 32-bit words. Also prevent large reads that would also cause a buffer overread. v2: Read from correct offset from internal storage buffer. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-13drm/amdgpu: use amdgpu_ras.h in amdgpu_debugfs.cStanley.Yang1-1/+1
include amdgpu_ras.h head file instead of use extern ras_debugfs_create_all function Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-10drm/amdgpu: call ras_debugfs_create_all in debugfs_initTao Zhou1-0/+3
and remove each ras IP's own debugfs creation this is required to fix ras when the driver does not use the drm load and unload callbacks due to ordering issues with the drm device node. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-05drm/amdgpu: Add debugfs interface to set arbitrary sclk for navi14 (v2)Chengming Gui1-0/+43
add debugfs interface amdgpu_force_sclk to set arbitrary sclk for navi14 v2: Add lock Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-28drm/amdgpu: no need to clean debugfs at amdgpuYintian Tao1-30/+0
drm_minor_unregister will invoke drm_debugfs_cleanup to clean all the child node under primary minor node. We don't need to invoke amdgpu_debugfs_fini and amdgpu_debugfs_regs_cleanup to clean agian. Otherwise, it will raise the NULL pointer like below. [ 45.046029] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 45.047256] PGD 0 P4D 0 [ 45.047713] Oops: 0002 [#1] SMP PTI [ 45.048198] CPU: 0 PID: 2796 Comm: modprobe Tainted: G W OE 4.18.0-15-generic #16~18.04.1-Ubuntu [ 45.049538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 45.050651] RIP: 0010:down_write+0x1f/0x40 [ 45.051194] Code: 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 ce d9 ff ff 48 ba 01 00 00 00 ff ff ff ff 48 89 d8 <f0> 48 0f c1 10 85 d2 74 05 e8 53 1c ff ff 65 48 8b 04 25 00 5c 01 [ 45.053702] RSP: 0018:ffffad8f4133fd40 EFLAGS: 00010246 [ 45.054384] RAX: 00000000000000a8 RBX: 00000000000000a8 RCX: ffffa011327dd814 [ 45.055349] RDX: ffffffff00000001 RSI: 0000000000000001 RDI: 00000000000000a8 [ 45.056346] RBP: ffffad8f4133fd48 R08: 0000000000000000 R09: ffffffffc0690a00 [ 45.057326] R10: ffffad8f4133fd58 R11: 0000000000000001 R12: ffffa0113cff0300 [ 45.058266] R13: ffffa0113c0a0000 R14: ffffffffc0c02a10 R15: ffffa0113e5c7860 [ 45.059221] FS: 00007f60d46f9540(0000) GS:ffffa0113fc00000(0000) knlGS:0000000000000000 [ 45.060809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.061826] CR2: 00000000000000a8 CR3: 0000000136250004 CR4: 00000000003606f0 [ 45.062913] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 45.064404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 45.065897] Call Trace: [ 45.066426] debugfs_remove+0x36/0xa0 [ 45.067131] amdgpu_debugfs_ring_fini+0x15/0x20 [amdgpu] [ 45.068019] amdgpu_debugfs_fini+0x2c/0x50 [amdgpu] [ 45.068756] amdgpu_pci_remove+0x49/0x70 [amdgpu] [ 45.069439] pci_device_remove+0x3e/0xc0 [ 45.070037] device_release_driver_internal+0x18a/0x260 [ 45.070842] driver_detach+0x3f/0x80 [ 45.071325] bus_remove_driver+0x59/0xd0 [ 45.071850] driver_unregister+0x2c/0x40 [ 45.072377] pci_unregister_driver+0x22/0xa0 [ 45.073043] amdgpu_exit+0x15/0x57c [amdgpu] [ 45.073683] __x64_sys_delete_module+0x146/0x280 [ 45.074369] do_syscall_64+0x5a/0x120 [ 45.074916] entry_SYSCALL_64_after_hwframe+0x44/0xa9 v2: remove all debugfs cleanup/fini code at amdgpu v3: squash in unused variable removal Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/display: move debugfs init into core amdgpu debugfs (v2)Alex Deucher1-0/+8
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for display. v2: add config guard for DC Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Harry Wentland <harry.wentland@amd.com> (v1) Acked-by: Christian König <christian.koenig@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/ring: move debugfs init into core amdgpu debugfsAlex Deucher1-1/+22
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for rings. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/firmware: move debugfs init into core amdgpu debugfsAlex Deucher1-0/+4
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for firmware. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/regs: move debugfs init into core amdgpu debugfsAlex Deucher1-0/+4
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for register access files. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/gem: move debugfs init into core amdgpu debugfsAlex Deucher1-0/+4
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for gem. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/fence: move debugfs init into core amdgpu debugfsAlex Deucher1-0/+3
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for fence handling. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/sa: move debugfs init into core amdgpu debugfsAlex Deucher1-0/+4
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for SA (sub allocator). Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/pm: move debugfs init into core amdgpu debugfsAlex Deucher1-0/+7
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for pm. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/ttm: move debugfs init into core amdgpu debugfsAlex Deucher1-0/+10
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for ttm. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu: rename amdgpu_debugfs_preempt_cleanupAlex Deucher1-2/+2
to amdgpu_debugfs_fini. It will be used for other things in the future. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amd/amdgpu: Add gfxoff debugfs entryTom St Denis1-0/+56
Write a 32-bit value of zero to disable GFXOFF and write a 32-bit value of non-zero to enable GFXOFF. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14drm/amdgpu/debugfs: properly handle runtime pmAlex Deucher1-7/+126
If driver debugfs files are accessed, power up the GPU when necessary. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23drm/amdgpu: use true, false for bool variable in amdgpu_debugfs.czhengbin1-3/+3
Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:132:2-10: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:140:2-10: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:142:13-21: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07drm/amdgpu: Avoid accidental thread reactivation.Andrey Grodzovsky1-0/+10
Problem: During GPU reset we call the GPU scheduler to suspend it's thread, those two functions in amdgpu also suspend and resume the sceduler for their needs but this can collide with GPU reset in progress and accidently restart a suspended thread before time. Fix: Serialize with GPU reset. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30drm/amdgpu: Remove superfluous void * cast in debugfs_create_file() callGeert Uytterhoeven1-2/+2
There is no need to cast a typed pointer to a void pointer when calling a function that accepts the latter. Remove it, as the cast prevents further compiler checks. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16drm/amdgpu: remove the redundant null checkszhong jiang1-4/+2
debugfs_remove and kfree has taken the null check in account. hence it is unnecessary to check it. Just remove the condition. No functional change. This issue was detected by using the Coccinelle software. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31drm/amdgpu: fix a potential information leaking bugWang Xiayang1-1/+1
Coccinelle reports a path that the array "data" is never initialized. The path skips the checks in the conditional branches when either of callback functions, read_wave_vgprs and read_wave_sgprs, is not registered. Later, the uninitialized "data" array is read in the while-loop below and passed to put_user(). Fix the path by allocating the array with kcalloc(). The patch is simplier than adding a fall-back branch that explicitly calls memset(data, 0, ...). Also it does not need the multiplication 1024*sizeof(*data) as the size parameter for memset() though there is no risk of integer overflow. Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-17drm/amd/amdgpu: Fix offset for vmid selection in debugfs interfaceTom St Denis1-1/+1
The register debugfs interface was using the wrong bitmask for vmid selection for GFX_CNTL. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-16drm/amd/amdgpu: Add VMID to SRBM debugfs bank selectionTom St Denis1-4/+5
Add 5 bits to the offset for SRBM selection to handle VMIDs. Also update the select_me_pipe_q() callback to also select VMID. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-25Merge branch 'drm-next' into drm-next-5.3Alex Deucher1-2/+5
Backmerge drm-next and fix up conflicts due to drmP.h removal. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-21drm/amdgpu: mark the partial job as preempted in mcbp unit testJack Xiao1-12/+32
In mcbp unit test, the test should detect the preempted job which may be a partial execution ib and mark it as preempted; so that the gfx block can correctly generate PM4 frame. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-21drm/amdgpu: add mcbp unit test in debugfs (v3)Jack Xiao1-0/+158
The MCBP unit test is used to test the functionality of MCBP. It emualtes to send preemption request and resubmit the unfinished jobs. v2: squash in fixes (Alex) v3: squash in memory leak fix (Jack) Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-10drm/amd: drop use of drmP.h in amdgpu/amdgpu*Sam Ravnborg1-2/+5
Drop use of drmP.h in all files named amdgpu* in drm/amd/amdgpu/ Fix fallout. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20190609220757.10862-10-sam@ravnborg.org
2019-03-19drm/amd/powerplay: implement sysfs of amdgpu_get_busy_percent for smu11Kevin Wang1-4/+3
add interface amdgpu_get_busy_percent for smu11 v2: convert data pointer type to uint32_t *. Signed-off-by: Kevin Wang <Kevin1.Wang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-13drm/amdgpu: don't clamp debugfs register access to the BAR sizeAlex Deucher1-3/+0
This prevents us from accessing extended registers in tools like umr. The register access functions already check if the offset is beyond the BAR size and use the indirect accessors with locking so this is safe. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-16drm/amd/amdgpu: Fix debugfs error handlingDan Carpenter1-10/+2
The error handling is wrong and "ent" could be NULL we when dereference it to get "ent->d_inode". The thing is that normally debugfs_create_file() is not supposed to require (or have) any error handling. That function does return error pointers if debugfs is turned off but we know it's enable here. When it's enabled, then it returns NULL on error. So what I did was I stripped out all the error handling except around the i_size_write(). I could have just used a NULL check instead of an IS_ERR_OR_NULL() but I figured this was more clear because that way you don't have to look at the surrounding code to see whether debugfs is enabled or not. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15drm/amd/amdgpu: Add some documentation to the debugfs entriesTom St Denis1-4/+189
Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-11drm/amdgpu: Use dpm_enabled as dpm state flagRex Zhu1-1/+1
driver will set dpm_enabled to true only when module parameter amdgpu_dpm not equal to 0 and smu hw initialize successfully. Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-03drm/amdgpu: Add support for SRBM selection v3Andrey Grodzovsky1-77/+40
Also remove code duplication in write and read regs functions. This also fixes potential missing unlock in amdgpu_debugfs_regs_write in case get_user would fail. v2: Add SRBM mutex locking. v3: Fix TO counter and fix comment location. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-07drm/amdgpu: add amdgpu_evict_gtt debugfs entryChristian König1-1/+12
Allow evicting all BOs from the GTT domain. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexdeucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: move debugfs functions to their own fileAlex Deucher1-0/+792
amdgpu_device.c was getting pretty cluttered. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>