From 36bf93216ecbe399c40c5e0486f0f0e3a4afa69e Mon Sep 17 00:00:00 2001 From: Lang Yu Date: Fri, 15 Apr 2022 15:35:44 +0800 Subject: drm/amdkfd: only allow heavy-weight TLB flush on some ASICs for SVM too The idea is from commit a50fe7078035 ("drm/amdkfd: Only apply heavy-weight TLB flush on Aldebaran") and commit f61c40c0757a ("drm/amdkfd: enable heavy-weight TLB flush on Arcturus"). At the moment, heavy-weight TLB could cause problems on ASICs except Aldebaran and Arcturus. A simple hipMallocManaged/hipFree program could trigger this issue. [ 97.787657] amdgpu 0000:01:00.0: amdgpu: wait for kiq fence error: 0. [ 106.868758] amdgpu: qcm fence wait loop timeout expired [ 106.868966] amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption [ 106.869203] amdgpu: Failed to evict process queues [ 106.869261] amdgpu: Failed to quiesce KFD Signed-off-by: Lang Yu Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'drivers') diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 459fa07a3bcc..5afe216cf099 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -1229,7 +1229,9 @@ svm_range_unmap_from_gpus(struct svm_range *prange, unsigned long start, if (r) break; } - kfd_flush_tlb(pdd, TLB_FLUSH_HEAVYWEIGHT); + + if (kfd_flush_tlb_after_unmap(pdd->dev)) + kfd_flush_tlb(pdd, TLB_FLUSH_HEAVYWEIGHT); } return r; -- cgit v1.2.3