author: Stanley.Yang <Stanley.Yang@amd.com> 2021-12-02 13:06:05 +0800
committer: Alex Deucher <alexander.deucher@amd.com> 2021-12-02 12:43:06 -0500
commit: bab73f092da654d149bb4771c418bf585c06044a
tree: 2d23df6a4be535f00c70bde634f2fb84aec93250
parent: ddb267b66af9d49d54e3d3ce8a6b4e4e7ad9af0a
drm/amdgpu: skip query ecc info in gpu recovery

This is a workaround for the failure to get ECC info during GPU recovery:

[ 700.236122] amdgpu 0000:09:00.0: amdgpu: Failed to export SMU ecc table!
[ 700.236128] amdgpu 0000:09:00.0: amdgpu: GPU reset begin!
[ 704.331171] amdgpu: qcm fence wait loop timeout expired
[ 704.331194] amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
[ 704.332445] amdgpu 0000:09:00.0: amdgpu: GPU reset begin!
[ 704.332448] amdgpu 0000:09:00.0: amdgpu: Bailing on TDR for s_job:ffffffffffffffff, as another already in progress
[ 704.332456] amdgpu: Pasid 0x8000 destroy queue 0 failed, ret -62
[ 710.360924] amdgpu 0000:09:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000013 SMN_C2PMSG_82:0x00000007
[ 710.360964] amdgpu 0000:09:00.0: amdgpu: Failed to disable smu features.
[ 710.361002] amdgpu 0000:09:00.0: amdgpu: Fail to disable dpm features!
[ 710.361014] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Diffstat (limited to 'drivers/gpu/drm')
-rw-r--r-- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4
1 file changed, 4 insertions, 0 deletions

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 3c623e589b79..28678c8f4eb2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -897,6 +897,10 @@ static void amdgpu_ras_get_ecc_info(struct amdgpu_device *adev, struct ras_err_d
 	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
 	int ret = 0;
 
+	/* skip get ecc info during gpu recovery */
+	if (atomic_read(&ras->in_recovery) == 1)
+		return;
+
 	/*
 	 * choosing right query method according to
 	 * whether smu support query error information
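
The guard added by this patch can be illustrated outside the kernel with a minimal userspace sketch. It assumes C11 `<stdatomic.h>` in place of the kernel's `atomic_read()`, and the names `demo_ras` and `demo_get_ecc_info` are hypothetical stand-ins, not driver APIs. It only shows the early-return pattern: while the recovery flag is set, the query is skipped so nothing is sent to the SMU mid-reset.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's RAS context; only the flag matters. */
struct demo_ras {
	atomic_int in_recovery;	/* 1 while a GPU reset is in flight */
	int ecc_queries;	/* counts queries that actually ran */
};

/*
 * Hypothetical query helper. Returns true if the ECC query ran and
 * false if it was skipped because recovery is in progress.
 */
static bool demo_get_ecc_info(struct demo_ras *ras)
{
	/* Same shape as the guard in the patch: bail out early during recovery. */
	if (atomic_load(&ras->in_recovery) == 1)
		return false;

	ras->ecc_queries++;	/* placeholder for the real ECC query */
	return true;
}
```

In this sketch a caller would set `in_recovery` to 1 before starting a reset and clear it afterwards; any call to `demo_get_ecc_info()` in between returns immediately without touching the (simulated) hardware.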