summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/scheduler
AgeCommit message (Collapse)AuthorFilesLines
2016-10-25dma-buf: Rename struct fence to dma_fenceChris Wilson4-70/+75
I plan to usurp the short name of struct fence for a core kernel struct, and so I need to rename the specialised fence/timeline for DMA operations to make room. A consensus was reached in https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html that making clear this fence applies to DMA operations was a good thing. Since then the patch has grown a bit as usage increases, so hopefully it remains a good thing! (v2...: rebase, rerun spatch) v3: Compile on msm, spotted a manual fixup that I broke. v4: Try again for msm, sorry Daniel coccinelle script: @@ @@ - struct fence + struct dma_fence @@ @@ - struct fence_ops + struct dma_fence_ops @@ @@ - struct fence_cb + struct dma_fence_cb @@ @@ - struct fence_array + struct dma_fence_array @@ @@ - enum fence_flag_bits + enum dma_fence_flag_bits @@ @@ ( - fence_init + dma_fence_init | - fence_release + dma_fence_release | - fence_free + dma_fence_free | - fence_get + dma_fence_get | - fence_get_rcu + dma_fence_get_rcu | - fence_put + dma_fence_put | - fence_signal + dma_fence_signal | - fence_signal_locked + dma_fence_signal_locked | - fence_default_wait + dma_fence_default_wait | - fence_add_callback + dma_fence_add_callback | - fence_remove_callback + dma_fence_remove_callback | - fence_enable_sw_signaling + dma_fence_enable_sw_signaling | - fence_is_signaled_locked + dma_fence_is_signaled_locked | - fence_is_signaled + dma_fence_is_signaled | - fence_is_later + dma_fence_is_later | - fence_later + dma_fence_later | - fence_wait_timeout + dma_fence_wait_timeout | - fence_wait_any_timeout + dma_fence_wait_any_timeout | - fence_wait + dma_fence_wait | - fence_context_alloc + dma_fence_context_alloc | - fence_array_create + dma_fence_array_create | - to_fence_array + to_dma_fence_array | - fence_is_array + dma_fence_is_array | - trace_fence_emit + trace_dma_fence_emit | - FENCE_TRACE + DMA_FENCE_TRACE | - FENCE_WARN + DMA_FENCE_WARN | - FENCE_ERR + DMA_FENCE_ERR ) ( ... ) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk
2016-08-19drm/amdgpu: fix timeout value check in amd_sched_job_recoveryChristian König1-1/+1
Could be that we don't actually have a timeout set. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-29drm/amd: fix deadlock of job_list_lock V2Chunming Zhou1-3/+6
run_job involves mutex, which could sleep. V2: use list_for_each_entry_safe, since the job might complete while we dropped the lock. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-29drm/amd: reset hw count when reset jobChunming Zhou1-0/+3
Means the hw ring is empty after gpu reset. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: add amd_sched_job_recoveryChunming Zhou2-0/+34
Which is to recover hw jobs when gpu reset. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amd: add amd_sched_hw_job_resetChunming Zhou2-0/+15
amd_sched_hw_job_reset will remove callback from hw fence. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amd: add parent for sched fenceChunming Zhou3-0/+3
Parent of sched fence is hw fence which is to signal sched fence. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: remove fence parameter from amd_sched_job_initChristian König2-4/+2
We return the fence as part of the job structur anyway, no need to do this twice. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: stop disabling irqs when it isn't neccessaryChristian König1-8/+6
A regular spin_lock/unlock should do here as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: block scheduler when gpu resetChunming Zhou1-3/+14
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: stop trying to schedule() with a spin heldChristian König1-0/+2
Drop the lock before calling cancel_delayed_work_sync(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96445 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: generalize the scheduler fenceChristian König4-45/+75
Make it two events, one for the job being scheduled and one when it is finished. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: fix and cleanup job destructionChristian König2-47/+24
Remove the job reference counting and just properly destroy it from a work item which blocks on any potential running timeout handler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: move locking into the functions who need itChristian König1-9/+9
Otherwise the locking becomes rather confusing. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: properly abstract scheduler timeout handlingChristian König2-3/+10
The driver shouldn't mess with the scheduler internals. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: remove use_shed hack in job cleanupChristian König2-2/+0
Remembering the code path in a variable to cleanup differently is usually not a good idea at all. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: remove duplicated timeout callbackChristian König2-5/+1
No need for double housekeeping here. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: remove begin_job/finish_jobChristian König3-20/+9
Completely pointless and confusing to use a callback to call into the same code file. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: fix coding style in the scheduler v2Christian König3-23/+32
v2: fix even more Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-11drm/amd: cleanup remaining spaces and tabs v2Christian König1-1/+1
This is the result of running the following commands: find drivers/gpu/drm/amd/ -name "*.h" -exec sed -i 's/[ \t]\+$//' {} \; find drivers/gpu/drm/amd/ -name "*.c" -exec sed -i 's/[ \t]\+$//' {} \; find drivers/gpu/drm/amd/ -name "*.h" -exec sed -i 's/ \+\t/\t/' {} \; find drivers/gpu/drm/amd/ -name "*.c" -exec sed -i 's/ \+\t/\t/' {} \; v2: drop changes to DAL and internal headers Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-04drm/amd/scheduler: Mark amdgpu_sched_ops constNils Wallménius2-3/+3
This marks the struct amdgpu_sched_ops const and adjusts amd_sched_init to take a const pointer for the ops param. The ops member of struct amd_gpu_scheduler is also changed to const. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: use ref to keep job aliveMonk Liu2-1/+20
this is to fix fatal page fault error that occured if: job is signaled/released after its timeout work is already put to the global queue (in this case the cancel_delayed_work will return false), which will lead to NX-protection error page fault during job_timeout_func. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: rework TDR in scheduler (v2)Monk Liu3-0/+45
Add two callbacks to scheduler to maintain jobs, and invoked for job timeout calculations. Now TDR measures time gap from job is processed by hw. v2: fix typo Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: get rid of incorrect TDRMonk Liu2-42/+1
original time out detect routine is incorrect, cuz it measures the gap from job scheduled, but we should only measure the gap from processed by hw. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: put job to list before doneMonk Liu3-0/+27
the mirror_list will be used for later time out detect feature. This is needed to properly detect a GPU timeout with the scheduler. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: delay job free to when it's finished (v2)Monk Liu2-1/+12
for those jobs submitted through scheduler, do not free it immediately after scheduled, instead free it in global workqueue by its sched fence signaling callback function. v2: call uf's bo_undef after job_run() call job's sync free after job_run() no static inline __amdgpu_job_free() anymore, just use kfree(job) to replace it. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: use sched_job_init to initialize sched_jobMonk Liu2-1/+21
Consolidate job initialization in one place rather than duplicating it in multiple places. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-03-16drm/amdgpu: RCU protected amd_sched_fence_releaseChristian König1-1/+22
Fences must be freed RCU protected, otherwise the reservation_object_*_rcu() functions can run into problems. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-02-10drm/amdgpu: drop a dummy wakeup schedulerMonk Liu1-1/+9
since the dependency job is also scheduled by the same scheduler with the job depended on it, no need to call wake up scheduler when the dep is scheduled. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-14drm/amdgpu: add entity only when first job comeChunming Zhou1-5/+8
umd somtimes will create a context for every ring, that means some entities wouldn't be used at all. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04drm/amdgpu: fix race condition in amd_sched_entity_push_jobNicolai Hähnle1-2/+3
As soon as we leave the spinlock after the job has been added to the job queue, we can no longer rely on the job's data to be available. I have seen a null-pointer dereference due to sched == NULL in amd_sched_wakeup via amd_sched_entity_push_job and amd_sched_ib_submit_kernel_helper. Since the latter initializes sched_job->sched with the address of the ring scheduler, which is guaranteed to be non-NULL, this race appears to be a likely culprit. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Bugzilla: https://bugs.freedesktop.org/attachment.cgi?bugid=93079 Reviewed-by: Christian König <christian.koenig@amd.com>
2015-12-02drm/amd: abstract kernel rq and normal rq to priority of run queueChunming Zhou2-7/+16
Allows us to set priorities in the scheduler. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
2015-11-23drm/amdgpu: move dependency handling out of atomic section v2Christian König1-27/+44
This way the driver isn't limited in the dependency handling callback. v2: remove extra check in amd_sched_entity_pop_job() Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-11-23drm/amdgpu: optimize scheduler fence handlingChristian König3-14/+55
We only need to wait for jobs to be scheduled when the dependency is from the same scheduler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-11-16drm/amdgpu: fix incorrect mutex usage v3Christian König2-11/+2
Before this patch the scheduler fence was created when we push the job into the queue, so we could only get the fence after pushing it. The mutex now was necessary to prevent the thread pushing the jobs to the hardware from running faster than the thread pushing the jobs into the queue. Otherwise the thread pushing jobs into the queue would have accessed possible freed up memory when it tries to get a reference to the fence. So what you get in the end is thread A: mutex_lock(&job->lock); ... Kick of thread B. ... mutex_unlock(&job->lock); And thread B: mutex_lock(&job->lock); .... mutex_unlock(&job->lock); kfree(job); I'm actually not sure if I'm still up to date on this, but this usage pattern used to be not allowed with mutexes. See here as well https://lwn.net/Articles/575460/. v2: remove unrelated changes, fix missing owner v3: rebased, add more commit message Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-11-16drm/amdgpu: cleanup scheduler fence get/put danceChristian König1-1/+0
The code was correct, but getting two references when the ownership is linearly moved on is a bit awkward and just overhead. Signed: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-11-16drm/amdgpu: add command submission workflow tracepointChunming Zhou2-3/+22
OGL needs these tracepoints to investigate performance issue. Change-Id: I5e58187d061253f7d665dfce8e4e163ba91d3e2b Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
2015-11-16drm/amd: add kmem cache for sched fenceChunming Zhou3-2/+23
Change-Id: I45bb8ff10ef05dc3b15e31a77fbcf31117705f11 Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
2015-11-04drm/amdgpu: fix stoping the scheduler timeoutChristian König1-1/+1
cancel_delayed_work_sync is forbidden in interrupt context. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-11-03drm/amd/scheduler: don't oops on failure to loadDave Airlie1-1/+2
In two places amdgpu tries to tear down something it hasn't initalised when failing. This is what happens when you enable experimental support on topaz which then fails in ring init. This patch allows it to fail cleanly. agd: Split out from from the original patch since the scheduler is a driver independent. Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2015-10-28drm/amdgpu: ignore scheduler fences from the same entityChristian König1-0/+6
We are going to submit them before the job anyway. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-10-14drm/amdgpu: fix lockup when clean pending fencesJunwei Zhang1-3/+3
The first lockup fence will lock the fence list of scheduler. Then cancel the delayed workqueues for all clean pending fences without waiting the workqueues to finish. Change-Id: I9bec826de1aa49d587b0662f3fb4a95333979429 Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
2015-10-14drm/amdgpu: add timer to fence to detect scheduler lockupJunwei Zhang2-2/+48
Change-Id: I67e987db0efdca28faa80b332b75571192130d33 Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: David Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
2015-09-23drm/amdgpu: more scheduler cleanups v2Christian König3-39/+26
Embed the scheduler into the ring structure instead of allocating it. Use the ring name directly instead of the id. v2: rebased, whitespace cleanup Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com>
2015-09-23drm/amdgpu: rename fence->scheduler to sched v2Christian König3-4/+4
Just to be consistent with the other members. v2: rename the ring member as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> (v1) Reviewed-by: Chunming Zhou<david1.zhou@amd.com>
2015-09-23drm/amdgpu: cleanup entity initChristian König3-19/+25
Reorder the fields and properly return the kfifo_alloc error code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com>
2015-09-23drm/amdgpu: refine the job naming for amdgpu_job and amdgpu_sched_jobJunwei Zhang2-33/+35
Use consistent naming across functions. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: David Zhou <david1.zhou@amd.com> Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
2015-09-23drm/amdgpu: remove process_job callback from the schedulerChristian König2-2/+0
Just free the resources immediately after submitting the job. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-09-23drm/amdgpu: move scheduler fence callback into fence v2Christian König2-11/+12
And call the processed callback directly after submitting the job. v2: split adding error handling into separate patch. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-09-23drm/amdgpu: signal scheduler fence when hw submission fails v3Christian König1-0/+3
Otherwise the resource blocked by it will never be reclaimed. v2: add DRM_ERROR. v3: fix typo in commit message Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>