summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h
AgeCommit message (Collapse)AuthorFilesLines
2021-02-26drm/amdgpu: remove unnecessary reading for epprom headerDennis Li1-3/+1
If the number of badpage records exceed the threshold, driver has updated both epprom header and control->tbl_hdr.header before gpu reset, therefore GPU recovery thread no need to read epprom header directly. v2: merge amdgpu_ras_check_err_threshold into amdgpu_ras_eeprom_check_err_threshold Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04drm/amdgpu: break GPU recovery once it's in bad state(v4)Guchun Chen1-0/+4
When GPU executes recovery and retriving bad GPU tag from external eerpom device, the recovery will be broken and error message is printed as well for user's awareness. v2: Refine warning message in threshold reaching case, and fix spelling typo. v3: Fix explicit calling of bad gpu. v4: Rename function names. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04drm/amdgpu: break driver init process when it's bad GPU(v5)Guchun Chen1-1/+2
When retrieving bad gpu tag from eeprom, GPU init should fail as the GPU needs to be retired for further check. v2: Fix spelling typo, correct the condition to detect bad gpu tag and refine error message. v3: Refine function argument name. v4: Fix missing check of returning value of i2c initialization error case. v5: Use dev_err to print PCI information in dmesg instead of DRM_ERROR. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04drm/amdgpu: validate bad page threshold in ras(v3)Guchun Chen1-0/+2
Bad page threshold value should be valid in the range between -1 and max records length of eeprom. It could determine when saved bad pages exceed threshold value, and proceed corresponding actions. v2: When using the default typical value, it should be min value between typical value and eeprom max records length. v3: drop the case of setting bad_page_cnt_threshold to be 0xFFFFFFFF, as it confuses user. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amdgpu: move i2c bus lock out of ras structureAlex Deucher1-1/+0
It's not really ras related. It's just a lock for the bus in general. This removes the ras dependency from the smu i2c bus. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-16drm/amdgpu: Move EEPROM I2C adapter to amdgpu_deviceAndrey Grodzovsky1-2/+0
Puts the i2c adapter in common place for sharing by RAS and upcoming data read from FRU EEPROM feature. v2: Move i2c adapter to amdgpu_pm and rename it. v3: Move i2c adapter init to ASIC specific code and get rid of the switch case in amdgpu_device Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26drm/amdgpu: Resolved offchip EEPROM I/O issueJohn Clements1-0/+1
Updated target I2C address Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16drm/amdgpu: Add amdgpu_ras_eeprom_reset_tableAndrey Grodzovsky1-0/+1
This will allow to reset the table on the fly. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-27drm/amdgpu: Add RAS EEPROM table.Andrey Grodzovsky1-0/+90
Add RAS EEPROM table manager to eanble RAS errors to be stored upon appearance and retrived on driver load. v2: Fix some prints. v3: Fix checksum calculation. Make table record and header structs packed to do correct byte value sum. Fix record crossing EEPROM page boundry. v4: Fix byte sum val calculation for record - look at sizeof(record). Fix some style comments. v5: Add description to EEPROM_TABLE_RECORD_SIZE and syntax fixes. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Luben Tuikov <Luben.Tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>