diff options
author | Chris Wilson <chris@chris-wilson.co.uk> | 2016-04-13 17:35:08 +0100 |
---|---|---|
committer | Chris Wilson <chris@chris-wilson.co.uk> | 2016-04-14 10:45:40 +0100 |
commit | f4457ae71fd6c3bc9ab979839258411a34baa7f6 (patch) | |
tree | 86563215da7b74fa26067b5ef33b64b662a14e93 /drivers/gpu/drm/i915/intel_lrc.c | |
parent | f7e5838bb37d970e103b1d9af3928f0f471c8a48 (diff) | |
download | linux-f4457ae71fd6c3bc9ab979839258411a34baa7f6.tar.bz2 |
drm/i915: Prevent leaking of -EIO from i915_wait_request()
Reporting -EIO from i915_wait_request() has proven very troublematic
over the years, with numerous hard-to-reproduce bugs cropping up in the
corner case of where a reset occurs and the code wasn't expecting such
an error.
If the we reset the GPU or have detected a hang and wish to reset the
GPU, the request is forcibly complete and the wait broken. Currently, we
report either -EAGAIN or -EIO in order for the caller to retreat and
restart the wait (if appropriate) after dropping and then reacquiring
the struct_mutex (essential to allow the GPU reset to proceed). However,
if we take the view that the request is complete (no further work will
be done on it by the GPU because it is dead and soon to be reset), then
we can proceed with the task at hand and then drop the struct_mutex
allowing the reset to occur. This transfers the burden of checking
whether it is safe to proceed to the caller, which in all but one
instance it is safe - completely eliminating the source of all spurious
-EIO.
Of note, we only have two API entry points where we expect that
userspace can observe an EIO. First is when submitting an execbuf, if
the GPU is terminally wedged, then the operation cannot succeed and an
-EIO is reported. Secondly, existing userspace uses the throttle ioctl
to detect an already wedged GPU before starting using HW acceleration
(or to confirm that the GPU is wedged after an error condition). So if
the GPU is wedged when the user calls throttle, also report -EIO.
v2: Split more carefully the change to i915_wait_request() and assorted
ABI from the reset handling.
v3: Add a couple of WARN_ON(EIO) to the interruptible modesetting code
so that we don't start to leak EIO there in future (and break our hang
resistant modesetting).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-9-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk
Diffstat (limited to 'drivers/gpu/drm/i915/intel_lrc.c')
-rw-r--r-- | drivers/gpu/drm/i915/intel_lrc.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 6b0915eefe33..c5dba687f288 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1048,7 +1048,7 @@ void intel_logical_ring_stop(struct intel_engine_cs *engine) return; ret = intel_engine_idle(engine); - if (ret && !i915_reset_in_progress_or_wedged(&dev_priv->gpu_error)) + if (ret) DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n", engine->name, ret); |