path: root/drivers/gpu/drm/etnaviv/etnaviv_sched.c
Age | Commit message | Author | Files | Lines
2019-05-02 | drm/scheduler: rework job destruction | Christian König | 1 | -1/+1
We now destroy finished jobs from the worker thread to make sure that we never destroy a job currently in timeout processing. By this we avoid holding lock around ring mirror list in drm_sched_stop which should solve a deadlock reported by a user. v2: Remove unused variable. v4: Move guilty job free into sched code. v5: Move sched->hw_rq_count to drm_sched_start to account for counter decrement in drm_sched_stop even when we don't call resubmit jobs if guilty job did signal. v6: remove unused variable Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692 Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-3-git-send-email-andrey.grodzovsky@amd.com
2019-03-12 | Merge branch 'etnaviv/next' of https://git.pengutronix.de/git/lst/linux into drm-next | Dave Airlie | 1 | -1/+1
"small fixes and a change to not restrict etnaviv to certain architectures." Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas Stach <l.stach@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/4bc1a4c8447bb947d2fe8facd0ff09c5b8753087.camel@pengutronix.de
2019-01-25 | drm/sched: Refactor ring mirror list handling. | Andrey Grodzovsky | 1 | -4/+7
Decouple sched threads stop and start and ring mirror list handling from the policy of what to do about the guilty jobs. When stopping the sched thread and detaching sched fences from non signaled HW fences wait for all signaled HW fences to complete before rerunning the jobs. v2: Fix resubmission of guilty job into HW after refactoring. v4: Full restart for all the jobs, not only from guilty ring. Extract karma increase into standalone function. v5: Rework waiting for signaled jobs without relying on the job struct itself as those might already be freed for non 'guilty' job's schedulers. Expose karma increase to drivers. v6: Use list_for_each_entry_safe_continue and drm_sched_process_job in case fence already signaled. Call drm_sched_increase_karma only once for amdgpu and add documentation. v7: Wait only for the latest job's fence. Suggested-by: Christian Koenig <Christian.Koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-07 | drm/etnaviv: move job context pointer to etnaviv_gem_submit | Lucas Stach | 1 | -1/+1
The context isn't really related to the cmdbuf, but is a property of the job. This has been missed when moving to a properly refcounted etnaviv_gem_submit. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-11-19 | Merge branch 'drm-next-4.21' of git://people.freedesktop.org/~agd5f/linux into drm-next | Dave Airlie | 1 | -2/+3
New features for 4.21:
amdgpu:
- Support for SDMA paging queue on vega
- Put compute EOP buffers into vram for better performance
- Share more code with amdkfd
- Support for scanout with DCC on gfx9
- Initial kerneldoc for DC
- Updated SMU firmware support for gfx8 chips
- Rework CSA handling for eventual support for preemption
- XGMI PSP support
- Clean up RLC handling
- Enable GPU reset by default on VI, SOC15 dGPUs
- Ring and IB test cleanups
amdkfd:
- Share more code with amdgpu
ttm:
- Move global init out of the drivers
scheduler:
- Track if schedulers are ready for work
- Timeout/fault handling changes to facilitate GPU recovery
Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181114165113.3751-1-alexander.deucher@amd.com
2018-11-07 | Merge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux into drm-fixes | Dave Airlie | 1 | -1/+1
Single etnaviv fence fix for GPU recovery. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas Stach <l.stach@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/1541522424.2508.26.camel@pengutronix.de
2018-11-05 | drm/scheduler: Add drm_sched_job_cleanup | Sharat Masetty | 1 | -0/+3
This patch adds a new API to clean up the scheduler job resources. This is primarily needed in cases where the job was created but was not queued to the scheduler queue. Additionally with this change, the layer which creates the scheduler job also gets to free up the job's resources and this entails moving the dma_fence_put(finished_fence) to the drivers ops free handler routines. Signed-off-by: Sharat Masetty <smasetty@codeaurora.org> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 | drm/sched: make sure timer is restarted | Christian König | 1 | -2/+0
Make sure we always restart the timer after a timeout and remove the device specific workarounds. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 | drm/etnaviv: fix bogus fence complete check in timeout handler | Lucas Stach | 1 | -1/+1
The GPU hardware fences and the job out-fences are on different timelines so it's wrong to compare them. Fix this by only looking at the out-fence. Cc: <stable@vger.kernel.org> Fixes: 2c83a726d6fb (drm/etnaviv: bring back progress check in job timeout handler) Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
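The point about timelines can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct tl_fence`, `tl_fence_comparable()` and `tl_job_completed()` are hypothetical names made up for illustration and are not the etnaviv or dma_fence code. It shows that sequence numbers are only ordered against each other within one context, so the completion check looks at the job's own out-fence alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a fence: a seqno only has meaning within its context. */
struct tl_fence {
	uint64_t context; /* timeline the seqno belongs to */
	uint32_t seqno;   /* position on that timeline */
	bool signaled;    /* has this fence completed? */
};

/* Seqnos can only be ordered against each other on the same timeline. */
static bool tl_fence_comparable(const struct tl_fence *a,
				const struct tl_fence *b)
{
	assert(a != NULL && b != NULL);
	return a->context == b->context;
}

/*
 * Completion check in the spirit of the fix: ask the job's own out-fence
 * whether it has signaled and never consult a fence from another timeline.
 */
static bool tl_job_completed(const struct tl_fence *out_fence)
{
	assert(out_fence != NULL);
	return out_fence->signaled;
}
```

In this model a hardware fence with seqno 900 says nothing about an out-fence with seqno 7 on another context; only the out-fence's own signaled state answers the question.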
2018-09-27 | drm/scheduler: remove timeout work_struct from drm_sched_job (v3) | Nayan Deshmukh | 1 | -1/+1
Having a delayed work item per job is redundant as we only need one per scheduler to track the timeout of the currently executing job. v2: the first element of the ring mirror list is the currently executing job so we don't need an additional variable for it v3: squash in fixes for v3d and etnaviv Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
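The idea of one timer per scheduler, always covering the oldest pending job, can be sketched as a userspace model. Everything here is an assumption for illustration: `struct model_sched` and the `model_*` helpers are hypothetical, time is a plain millisecond counter passed in by the caller, and the pending array merely stands in for the ring mirror list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* Hypothetical scheduler model: one deadline shared by all pending jobs. */
struct model_sched {
	int pending[MAX_PENDING]; /* job ids, oldest (executing) first */
	size_t count;
	uint64_t timeout_ms;      /* allowed run time per job */
	bool timer_armed;
	uint64_t deadline_ms;     /* single deadline for pending[0] */
};

static void model_arm(struct model_sched *s, uint64_t now_ms)
{
	s->timer_armed = true;
	s->deadline_ms = now_ms + s->timeout_ms;
}

/* Queue a job; the timer is armed only if nothing else was pending. */
static bool model_job_begin(struct model_sched *s, int job_id, uint64_t now_ms)
{
	assert(s != NULL);
	if (s->count == MAX_PENDING)
		return false;
	s->pending[s->count++] = job_id;
	if (s->count == 1)
		model_arm(s, now_ms);
	return true;
}

/* Retire the oldest job and re-arm the shared timer for the next one. */
static void model_job_done(struct model_sched *s, uint64_t now_ms)
{
	assert(s != NULL);
	if (s->count == 0)
		return;
	for (size_t i = 1; i < s->count; i++)
		s->pending[i - 1] = s->pending[i];
	s->count--;
	if (s->count > 0)
		model_arm(s, now_ms);
	else
		s->timer_armed = false;
}

/* Returns the id of the job blamed for a timeout, or -1 if none expired. */
static int model_timed_out_job(const struct model_sched *s, uint64_t now_ms)
{
	assert(s != NULL);
	if (!s->timer_armed || s->count == 0 || now_ms < s->deadline_ms)
		return -1;
	return s->pending[0];
}
```

Because the head of the pending list is by construction the job currently executing, the model needs no per-job timer and no extra "current job" variable.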
2018-08-08 | Merge branch 'etnaviv/next' of https://git.pengutronix.de/git/lst/linux into drm-next | Dave Airlie | 1 | -7/+17
From: Lucas Stach <l.stach@pengutronix.de> "not much to de-stage this time. Changes from Philipp and Souptick to use memset32 more and switch the fault handler to the new vm_fault_t and two small fixes for issues that can be hit in rare corner cases from me." Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/1533563808.2809.7.camel@pengutronix.de
2018-08-06 | drm/etnaviv: protect sched job submission with fence mutex | Lucas Stach | 1 | -7/+17
The documentation of drm_sched_job_init and drm_sched_entity_push_job has been clarified. Both functions should be called under a shared lock, to avoid jobs getting pushed into the scheduler queue in a different order than their sched_fence seqnos, which will confuse checks that are looking at the seqnos to infer information about completion order. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
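The ordering requirement can be sketched in userspace C. This is a minimal model, not the driver code: `struct model_queue` and `model_submit()` are hypothetical names, the pthread mutex plays the role of the shared lock, and the two commented steps stand in for job init (which hands out the seqno) and the push into the scheduler queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_DEPTH 16

/* Hypothetical stand-in for a scheduler entity's queue. */
struct model_queue {
	pthread_mutex_t lock;         /* plays the role of the shared lock */
	uint32_t next_seqno;          /* last fence seqno handed out */
	uint32_t pushed[QUEUE_DEPTH]; /* seqnos in the order they were queued */
	size_t count;
};

/*
 * Allocate the seqno and enqueue the job in one critical section, so that
 * no other submitter can slip in between the two steps and push a job with
 * a later seqno first. Returns the seqno, or 0 if the queue is full.
 */
static uint32_t model_submit(struct model_queue *q)
{
	uint32_t seqno = 0;

	assert(q != NULL);
	pthread_mutex_lock(&q->lock);
	if (q->count < QUEUE_DEPTH) {
		seqno = ++q->next_seqno;       /* "job init": take a seqno */
		q->pushed[q->count++] = seqno; /* "push job": enqueue it */
	}
	pthread_mutex_unlock(&q->lock);
	return seqno;
}
```

If the two steps were taken under separate locks, or under none, two submitters could interleave so that seqno 2 is queued before seqno 1, which is exactly the inversion the commit message warns about.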
2018-07-30 | BackMerge v4.18-rc7 into drm-next | Dave Airlie | 1 | -0/+24
rmk requested this for armada and I think we've had a few conflicts build up. Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-07-25 | drm/scheduler: modify API to avoid redundancy | Nayan Deshmukh | 1 | -2/+2
entity has a scheduler field and we don't need the sched argument in any of the functions where entity is provided. Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-05 | drm/etnaviv: bring back progress check in job timeout handler | Lucas Stach | 1 | -0/+24
When the hangcheck handler was replaced by the DRM scheduler timeout handling we dropped the forward progress check, as this might allow clients to hog the GPU for a long time with a big job. It turns out that even reasonably well behaved clients like the Armada Xorg driver occasionally trip over the 500ms timeout. Bring back the forward progress check to get rid of the userspace regression. We would still like to fix userspace to submit smaller batches if possible, but that is for another day. Cc: <stable@vger.kernel.org> Fixes: 6d7a20c07760 (drm/etnaviv: replace hangcheck with scheduler timeout) Reported-by: Russell King <linux@armlinux.org.uk> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Eric Anholt <eric@anholt.net>
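A forward progress check of this kind can be sketched as follows. This is a hedged userspace model: `struct model_gpu` and `model_timeout_should_wait()` are hypothetical, and the "progress" value is an assumed stand-in for whatever position indicator the hardware exposes; the commit message does not say which one the driver reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-GPU state kept between timeout invocations. */
struct model_gpu {
	uint32_t last_progress; /* position observed at the previous timeout */
};

/*
 * Called when the job timeout fires. If the observed position moved since
 * the last timeout, the job is still making progress: remember the new
 * position and tell the caller to wait for another timeout period.
 * If nothing moved, report a hang so that recovery can start.
 */
static bool model_timeout_should_wait(struct model_gpu *gpu, uint32_t current)
{
	assert(gpu != NULL);
	if (current != gpu->last_progress) {
		gpu->last_progress = current;
		return true;
	}
	return false;
}
```

With this shape a long but advancing job survives repeated timeouts, while a job that sits at the same position across two consecutive timeouts is treated as hung.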
2018-05-18 | drm/etnaviv: replace license text with SPDX tags | Lucas Stach | 1 | -12/+1
This replaces the repetitive GPL-2.0 license text in code and header files with the SPDX tags. Generated hardware headers aren't changed, as any changes there need to be done in the upstream rnndb repository. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-03-22 | drm/etnaviv: bump HW job limit to 4 | Lucas Stach | 1 | -1/+1
The current limit of 2 leads to some GPU idle times, as the usual IRQ latency leads to up to 3 jobs getting signaled at once with some standard workloads. A larger HW job limit might lead to slightly worse QoS, but we accept that to not sacrifice GPU throughput in the common case. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
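The effect of the limit can be shown with a tiny model of the throttle. This is a sketch only: `hw_fill()` is a hypothetical helper, not a scheduler function. It computes how many queued jobs may be handed to the hardware given how many are already in flight.

```c
#include <assert.h>

/*
 * Hypothetical throttle: return how many of the queued jobs can be pushed
 * to the hardware right now without exceeding the in-flight limit.
 */
static unsigned int hw_fill(unsigned int queued, unsigned int in_flight,
			    unsigned int limit)
{
	unsigned int room;

	assert(limit > 0);
	room = in_flight < limit ? limit - in_flight : 0;
	return queued < room ? queued : room;
}
```

With a limit of 2 the hardware can never hold more than two jobs, so once both finish it sits idle until the completion interrupt is serviced; a limit of 4 leaves more work queued on the hardware to cover that latency.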
2018-03-09 | drm/etnaviv: etnaviv_sched: Staticize functions when possible | Fabio Estevam | 1 | -3/+4
etnaviv_sched_dependency() and etnaviv_sched_run_job() are only used in this file, so make them static. This fixes the following sparse warnings: drivers/gpu/drm/etnaviv/etnaviv_sched.c:30:18: warning: symbol 'etnaviv_sched_dependency' was not declared. Should it be static? drivers/gpu/drm/etnaviv/etnaviv_sched.c:81:18: warning: symbol 'etnaviv_sched_run_job' was not declared. Should it be static? Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-12 | drm/etnaviv: replace hangcheck with scheduler timeout | Lucas Stach | 1 | -22/+21
This replaces the etnaviv internal hangcheck logic with the job timeout handling provided by the DRM scheduler. This simplifies the driver further and allows replaying jobs after a GPU reset, so only minimal state is lost. This introduces a user-visible change in that we don't allow jobs to run indefinitely as long as they make progress anymore, as this introduces quality of service issues when multiple processes are using the GPU. Userspace is now responsible for flushing jobs in a way that they finish in a reasonable time, where reasonable is currently defined as less than 500ms. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-12 | drm/etnaviv: move dependency handling to scheduler | Lucas Stach | 1 | -0/+45
Move the fence dependency handling to the scheduler where it belongs. Jobs with unsignaled dependencies just get to sit in the scheduler queue without holding any locks. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
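Dependency handling of this kind can be modelled as a callback that hands the scheduler the next fence a job still has to wait for. The sketch below is a userspace illustration under stated assumptions: `struct dep_fence`, `struct dep_job` and `dep_next_unsignaled()` are hypothetical names, not the driver's types or its scheduler callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEP_MAX 4

/* Hypothetical fence model: only the signaled state matters here. */
struct dep_fence {
	bool signaled;
};

/* Hypothetical job with a fixed-size table of dependency fences. */
struct dep_job {
	struct dep_fence *deps[DEP_MAX]; /* NULL entries are unused */
};

/*
 * Return the next dependency that has not signaled yet, or NULL when the
 * job is ready to run. Dependencies that already signaled are dropped so
 * that they are not examined again on the next call.
 */
static struct dep_fence *dep_next_unsignaled(struct dep_job *job)
{
	assert(job != NULL);
	for (size_t i = 0; i < DEP_MAX; i++) {
		struct dep_fence *f = job->deps[i];

		if (f == NULL)
			continue;
		if (!f->signaled)
			return f;
		job->deps[i] = NULL; /* already signaled: forget it */
	}
	return NULL;
}
```

A scheduler built this way keeps the job parked in its queue while the callback returns a fence and only runs it once the callback returns NULL, so the submitter never has to block or hold a lock while waiting.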
2018-02-12 | drm/etnaviv: hook up DRM GPU scheduler | Lucas Stach | 1 | -0/+125
This hooks in the DRM GPU scheduler. No improvement yet, as all the dependency handling is still done in etnaviv_gem_submit. This just replaces the actual GPU submit by passing through the scheduler. It also allows getting rid of the retire worker, as this is now driven by the scheduler. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>