drm/sched: fix the bug of time out calculation(v4) - linux - Linux Kernel (branches are rebased on master from time to time)


author	Monk Liu <Monk.Liu@amd.com>	2021-09-01 08:46:46 +0800
committer	Andrey Grodzovsky <andrey.grodzovsky@amd.com>	2021-09-15 10:21:30 -0400
commit	bcf26654a38f8e55ecac4635dac2e72c161d0063 (patch)
tree	20eb90439123abf55821c744dd05e7974206a743 /drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
parent	282abb5a1f381d0ec10b20893961563be174a1c3 (diff)
download	linux-bcf26654a38f8e55ecac4635dac2e72c161d0063.tar.bz2

drm/sched: fix the bug of time out calculation(v4)

issue:
in cleanup_job, cancel_delayed_work cancels the TO timer even when its corresponding job is still running.

fix:
do not cancel the timer in cleanup_job; instead, cancel it only when the head job is signaled, and if there is a "next" job, call start_timeout again.

v2:
further clean up the logic, and cancel the TDR timer if the signaled job is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work at the beginning of cleanup_job
restore the implementation of drm_sched_job_begin.

v4:
remove the kthread_should_park() check in the cleanup_job routine; the signaled job should be cleaned up ASAP

TODO:
1) introduce pause/resume of the scheduler in job_timeout to serialize the handling of the scheduler and job_timeout.
2) drop the bad job's del and insert in the scheduler thanks to the above serialization (no race issue anymore with the serialization)

Tested-by: jingwen <jingwen.chen@@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1630457207-13107-1-git-send-email-Monk.Liu@amd.com
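
The timer handling described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `model_job`, `model_sched`, `model_push`, and `model_cleanup_job` are hypothetical names, the pending list is a fixed-size FIFO, and the delayed work is reduced to a boolean flag. It shows the intended behavior: the timer is left alone while the head job is still running, it is cancelled once the head job has signaled, and it is re-armed only if another job is waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_JOBS 8

/* Hypothetical stand-in for a scheduler job (not the kernel's struct). */
struct model_job {
    bool signaled;              /* true once the hardware finished the job */
};

/* Hypothetical stand-in for a scheduler: FIFO of pending jobs plus one timer. */
struct model_sched {
    struct model_job *pending[MAX_JOBS];
    size_t head;                /* index of the oldest pending job */
    size_t count;               /* number of pending jobs */
    bool timer_armed;           /* models the timeout (TO/TDR) timer */
    unsigned timer_restarts;    /* how often cleanup re-armed the timer */
};

/* Queue a job; arm the timer if it is not already running. */
static void model_push(struct model_sched *s, struct model_job *j)
{
    assert(s->count < MAX_JOBS);
    s->pending[(s->head + s->count) % MAX_JOBS] = j;
    s->count++;
    if (!s->timer_armed)
        s->timer_armed = true;
}

/*
 * Return the head job if it has signaled, otherwise NULL.
 * The timer is only touched when the head job is actually done.
 */
static struct model_job *model_cleanup_job(struct model_sched *s)
{
    struct model_job *job;

    if (s->count == 0)
        return NULL;

    job = s->pending[s->head];
    if (!job->signaled)
        return NULL;            /* head still running: keep its timer */

    /* Remove the finished job from the pending list. */
    s->head = (s->head + 1) % MAX_JOBS;
    s->count--;

    /* Cancel the timer that belonged to the finished job. */
    s->timer_armed = false;

    /* If a next job exists, start a fresh timeout for it. */
    if (s->count > 0) {
        s->timer_armed = true;
        s->timer_restarts++;
    }
    return job;
}
```

In this model, calling `model_cleanup_job` while the head job is unsignaled returns `NULL` and leaves `timer_armed` set, which is the case the original code handled incorrectly according to the issue description above.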

Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c')

0 files changed, 0 insertions, 0 deletions

