summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/i915_request.c
AgeCommit message (Collapse)AuthorFilesLines
2019-11-27Merge tag 'drm-next-2019-11-27' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-135/+100
Pull drm updates from Dave Airlie: "Lots of stuff in here, though it hasn't been too insane this merge apart from dealing with the security fun. uapi: - export different colorspace properties on DP vs HDMI - new fourcc for ARM 16x16 block format - syncobj: allow querying last submitted timeline value - DRM_FORMAT_BIG_ENDIAN defined as unsigned core: - allow using gem vma manager in ttm - connector/encoder/bridge doc fixes - allow more than 3 encoders for a connector - displayport mst suspend/resume reprobing support - vram lazy unmapping, uniform vram mm and gem vram - edid cleanups + AVI informframe bar info - displayport helpers - dpcd parser added dp_cec: - Allow a connector to be associated with a cec device ttm: - pipelining with no_gpu_wait fix - always keep BOs on the LRU sched: - allow free_job routine to sleep i915: - Block userptr from mappable GTT - i915 perf uapi versioning - OA stream dynamic reconfiguration - make context persistence optional - introduce DRM_I915_UNSTABLE Kconfig - add fake lmem testing under unstable - BT.2020 support for DP MSA - struct mutex elimination - Tigerlake display/PLL/power management improvements - Jasper Lake PCH support - refactor PMU for multiple GPUs - Icelake firmware update - Split out vga + switcheroo code amdgpu: - implement dma-buf import/export without helpers - vega20 RAS enablement - DC i2c over aux fixes - renoir GPU reset - DC HDCP support - BACO support for CI/VI asics - MSI-X support - Arcturus EEPROM support - Arcturus VCN encode support - VCN dynamic powergating on RV/RV2 amdkfd: - add navi12/14/renoir support to kfd radeon: - SI dpm fix ported from amdgpu - fix bad DMA on ppc platforms gma500: - memory leak fixes qxl: - convert to new gem mmap exynos: - build warning fix komeda: - add aclk sysfs attribute v3d: - userspace cleanup uapi change i810: - fix for underflow in dispatch ioctls ast: - refactor show_cursor mgag200: - refactor show_cursor arcgpu: - encoder finding improvements mediatek: - mipi_tx, dsi and partial crtc support for MT8183 SoC - rotation support meson: - add suspend/resume support omap: - misc refactors tegra: - DisplayPort support for Tegra 210, 186 and 194. - IOMMU-backed DMA API fixes panfrost: - fix lockdep issue - simplify devfreq integration rcar-du: - R8A774B1 SoC support - fixes for H2 ES2.0 sun4i: - vcc-dsi regulator support virtio-gpu: - vmexit vs spinlock fix - move to gem shmem helpers - handle large command buffers with cma" * tag 'drm-next-2019-11-27' of git://anongit.freedesktop.org/drm/drm: (1855 commits) drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10 drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF. drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF drm/amdgpu/gfx10: re-init clear state buffer after gpu reset merge fix for "ftrace: Rework event_create_dir()" drm/amdgpu: Update Arcturus golden registers drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt Revert "drm/amd/display: enable S/G for RAVEN chip" drm/amdgpu: disable gfxoff on original raven drm/amdgpu: remove experimental flag for Navi14 drm/amdgpu: disable gfxoff when using register read interface drm/amdgpu/powerplay: properly set PP_GFXOFF_MASK (v2) drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2 drm/radeon: fix bad DMA from INTERRUPT_CNTL2 drm/amd/display: Fix debugfs on MST connectors drm/amdgpu/nv: add asic func for fetching vbios from rom directly drm/amdgpu: put flush_delayed_work at first drm/amdgpu/vcn2.5: fix the enc loop with hw fini ...
2019-11-26Merge branch 'locking-core-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - A comprehensive rewrite of the robust/PI futex code's exit handling to fix various exit races. (Thomas Gleixner et al) - Rework the generic REFCOUNT_FULL implementation using atomic_fetch_* operations so that the performance impact of the cmpxchg() loops is mitigated for common refcount operations. With these performance improvements the generic implementation of refcount_t should be good enough for everybody - and this got confirmed by performance testing, so remove ARCH_HAS_REFCOUNT and REFCOUNT_FULL entirely, leaving the generic implementation enabled unconditionally. (Will Deacon) - Other misc changes, fixes, cleanups" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) lkdtm: Remove references to CONFIG_REFCOUNT_FULL locking/refcount: Remove unused 'refcount_error_report()' function locking/refcount: Consolidate implementations of refcount_t locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions locking/refcount: Move saturation warnings out of line locking/refcount: Improve performance of generic REFCOUNT_FULL code locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the <linux/refcount.h> header locking/refcount: Remove unused refcount_*_checked() variants locking/refcount: Ensure integer operands are treated as signed locking/refcount: Define constants for saturation and max refcount values futex: Prevent exit livelock futex: Provide distinct return value when owner is exiting futex: Add mutex around futex exit futex: Provide state handling for exec() as well futex: Sanitize exit state handling futex: Mark the begin of futex exit explicitly futex: Set task::futex_state to DEAD right after handling futex exit futex: Split futex_mm_release() for exit/exec exit/exec: Seperate mm_release() futex: Replace PF_EXITPIDONE with a state ...
2019-10-26drm/i915: Extract GT render power state managementAndi Shyti1-3/+4
i915_irq.c is large. One reason for this is that has a large chunk of the GT render power management stashed away in it. Extract that logic out of i915_irq.c and intel_pm.c and put it under one roof. Based on a patch by Chris Wilson. Signed-off-by: Andi Shyti <andi.shyti@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191024211642.7688-1-chris@chris-wilson.co.uk
2019-10-26drm/i915: Encapsulate kconfig constant values inside boolean predicatesChris Wilson1-1/+1
Avoid angering clang and smatch by using a constant value in a '&&' test, by forcing that constant value into a boolean. E.g., drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c:159:13: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand] if (!delay && CONFIG_DRM_I915_PREEMPT_TIMEOUT) { ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191025135943.12524-1-chris@chris-wilson.co.uk
2019-10-24drm/i915/gt: Split intel_ring_submissionChris Wilson1-0/+1
Split the legacy submission backend from the common CS ring buffer handling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191024100344.5041-1-chris@chris-wilson.co.uk
2019-10-17drm/i915: Don't disable interrupts independently of the lockSebastian Andrzej Siewior1-8/+4
The locks (active.lock and rq->lock) need to be taken with disabled interrupts. This is done in i915_request_retire() by disabling the interrupts independently of the locks itself. While local_irq_disable()+spin_lock() equals spin_lock_irq() on vanilla it does not on PREEMPT_RT. Chris Wilson confirmed that local_irq_disable() was just introduced as an optimisation to avoid enabling/disabling interrupts during lock/unlock combo. Enable/disable interrupts as part of the locking instruction. Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191017161352.e5z3ugse7gxl5ari@linutronix.de
2019-10-15drm/i915: Flush tasklet submission before sleeping on i915_request_waitChris Wilson1-0/+1
If the system is being slow and userspace is racing ahead of the GPU and finds itself waiting for the GPU to catch up, before the process sleeps give the tasklet a kick, bypassing ksoftirqd. If the system is overloaded, then ksoftirqd may be delayed incurring additional latency to our user. This should not be a frequent problem, but in the past we have observed several hundred millisecond delays before ksoftirqd services an interrupt, so burn a few cycles to lend a helping hand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191015132606.14349-1-chris@chris-wilson.co.uk
2019-10-14drm/i915/execlists: Tweak virtual unsubmissionChris Wilson1-1/+1
Since commit e2144503bf3b ("drm/i915: Prevent bonded requests from overtaking each other on preemption") we have restricted requests to run on their chosen engine across preemption events. We can take this restriction into account to know that we will want to resubmit those requests onto the same physical engine, and so can shortcircuit the virtual engine selection process and keep the request on the same engine during unwind. References: e2144503bf3b ("drm/i915: Prevent bonded requests from overtaking each other on preemption") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Ramlingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191013203012.25208-1-chris@chris-wilson.co.uk
2019-10-09drm/i915: Only enqueue already completed requestsChris Wilson1-12/+32
If we are asked to submit a completed request, just move it onto the active-list without modifying it's payload. If we try to emit the modified payload of a completed request, we risk racing with the ring->head update during retirement which may advance the head past our breadcrumb and so we generate a warning for the emission being behind the RING_HEAD. v2: Commentary for the sneaky, shared responsibility between functions. v3: Spelling mistakes and bonus assertion Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923110056.15176-3-chris@chris-wilson.co.uk (cherry picked from commit c0bb487dc19fc45dbeede7dcf8f513df51a3cd33) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-10-09locking/lockdep: Remove unused @nested argument from lock_release()Qian Cai1-1/+1
Since the following commit: b4adfe8e05f1 ("locking/lockdep: Remove unused argument in __lock_release") @nested is no longer used in lock_release(), so remove it from all lock_release() calls and friends. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: airlied@linux.ie Cc: akpm@linux-foundation.org Cc: alexander.levin@microsoft.com Cc: daniel@iogearbox.net Cc: davem@davemloft.net Cc: dri-devel@lists.freedesktop.org Cc: duyuyang@gmail.com Cc: gregkh@linuxfoundation.org Cc: hannes@cmpxchg.org Cc: intel-gfx@lists.freedesktop.org Cc: jack@suse.com Cc: jlbec@evilplan.or Cc: joonas.lahtinen@linux.intel.com Cc: joseph.qi@linux.alibaba.com Cc: jslaby@suse.com Cc: juri.lelli@redhat.com Cc: maarten.lankhorst@linux.intel.com Cc: mark@fasheh.com Cc: mhocko@kernel.org Cc: mripard@kernel.org Cc: ocfs2-devel@oss.oracle.com Cc: rodrigo.vivi@intel.com Cc: sean@poorly.run Cc: st@kernel.org Cc: tj@kernel.org Cc: tytso@mit.edu Cc: vdavydov.dev@gmail.com Cc: vincent.guittot@linaro.org Cc: viro@zeniv.linux.org.uk Link: https://lkml.kernel.org/r/1568909380-32199-1-git-send-email-cai@lca.pw Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-07drm/i915: Verify the engine after acquiring the active.lockChris Wilson1-3/+22
When using virtual engines, the rq->engine is not stable until we hold the engine->active.lock (as the virtual engine may be exchanged with the sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") we may retire a request concurrently with resubmitting it to HW, we need to be extra careful to verify we are holding the correct lock for the request's active list. This is similar to the issue we saw with rescheduling the virtual requests, see sched_lock_engine(). Or else: <4> [876.736126] list_add corruption. prev->next should be next (ffff8883f931a1f8), but was dead000000000100. (prev=ffff888361ffa610). <4> [876.736136] WARNING: CPU: 2 PID: 21 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70 <4> [876.736137] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel e1000e cdc_ether usbnet mii snd_pcm ptp pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915] <4> [876.736154] CPU: 2 PID: 21 Comm: ksoftirqd/2 Tainted: G U 5.3.0-CI-CI_DRM_6898+ #1 <4> [876.736156] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019 <4> [876.736157] RIP: 0010:__list_add_valid+0x4d/0x70 <4> [876.736159] Code: c3 48 89 d1 48 c7 c7 20 33 0e 82 48 89 c2 e8 4a 4a bc ff 0f 0b 31 c0 c3 48 89 c1 4c 89 c6 48 c7 c7 70 33 0e 82 e8 33 4a bc ff <0f> 0b 31 c0 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 c0 33 0e 82 e8 <4> [876.736160] RSP: 0018:ffffc9000018bd30 EFLAGS: 00010082 <4> [876.736162] RAX: 0000000000000000 RBX: ffff888361ffc840 RCX: 0000000000000104 <4> [876.736163] RDX: 0000000080000104 RSI: 0000000000000000 RDI: 00000000ffffffff <4> [876.736164] RBP: ffffc9000018bd68 R08: 0000000000000000 R09: 0000000000000001 <4> [876.736165] R10: 00000000aed95de3 R11: 000000007fe927eb R12: ffff888361ffca10 <4> [876.736166] R13: ffff888361ffa610 R14: ffff888361ffc880 R15: ffff8883f931a1f8 <4> [876.736168] FS: 0000000000000000(0000) GS:ffff88849fd00000(0000) knlGS:0000000000000000 <4> [876.736169] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [876.736170] CR2: 00007f093a9173c0 CR3: 00000003bba08005 CR4: 0000000000760ee0 <4> [876.736171] PKRU: 55555554 <4> [876.736172] Call Trace: <4> [876.736226] __i915_request_submit+0x152/0x370 [i915] <4> [876.736263] __execlists_submission_tasklet+0x6da/0x1f50 [i915] <4> [876.736293] ? execlists_submission_tasklet+0x29/0x50 [i915] <4> [876.736321] execlists_submission_tasklet+0x34/0x50 [i915] <4> [876.736325] tasklet_action_common.isra.5+0x47/0xb0 <4> [876.736328] __do_softirq+0xd8/0x4ae <4> [876.736332] ? smpboot_thread_fn+0x23/0x280 <4> [876.736334] ? smpboot_thread_fn+0x6b/0x280 <4> [876.736336] run_ksoftirqd+0x2b/0x50 <4> [876.736338] smpboot_thread_fn+0x1d3/0x280 <4> [876.736341] ? sort_range+0x20/0x20 <4> [876.736343] kthread+0x119/0x130 <4> [876.736345] ? kthread_park+0xa0/0xa0 <4> [876.736347] ret_from_fork+0x24/0x50 <4> [876.736353] irq event stamp: 2290145 <4> [876.736356] hardirqs last enabled at (2290144): [<ffffffff8123cde8>] __slab_free+0x3e8/0x500 <4> [876.736358] hardirqs last disabled at (2290145): [<ffffffff819cfb4d>] _raw_spin_lock_irqsave+0xd/0x50 <4> [876.736360] softirqs last enabled at (2290114): [<ffffffff81c0033e>] __do_softirq+0x33e/0x4ae <4> [876.736361] softirqs last disabled at (2290119): [<ffffffff810b815b>] run_ksoftirqd+0x2b/0x50 <4> [876.736363] WARNING: CPU: 2 PID: 21 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70 <4> [876.736364] ---[ end trace 3e58d6c7356c65bf ]--- <4> [876.736406] ------------[ cut here ]------------ <4> [876.736415] list_del corruption. prev->next should be ffff888361ffca10, but was ffff88840ac2c730 <4> [876.736421] WARNING: CPU: 2 PID: 5490 at lib/list_debug.c:53 __list_del_entry_valid+0x79/0x90 <4> [876.736422] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel e1000e cdc_ether usbnet mii snd_pcm ptp pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915] <4> [876.736433] CPU: 2 PID: 5490 Comm: i915_selftest Tainted: G U W 5.3.0-CI-CI_DRM_6898+ #1 <4> [876.736435] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019 <4> [876.736436] RIP: 0010:__list_del_entry_valid+0x79/0x90 <4> [876.736438] Code: 0b 31 c0 c3 48 89 fe 48 c7 c7 30 34 0e 82 e8 ae 49 bc ff 0f 0b 31 c0 c3 48 89 f2 48 89 fe 48 c7 c7 68 34 0e 82 e8 97 49 bc ff <0f> 0b 31 c0 c3 48 c7 c7 a8 34 0e 82 e8 86 49 bc ff 0f 0b 31 c0 c3 <4> [876.736439] RSP: 0018:ffffc900003ef758 EFLAGS: 00010086 <4> [876.736440] RAX: 0000000000000000 RBX: ffff888361ffc840 RCX: 0000000000000002 <4> [876.736442] RDX: 0000000080000002 RSI: 0000000000000000 RDI: 00000000ffffffff <4> [876.736443] RBP: ffffc900003ef780 R08: 0000000000000000 R09: 0000000000000001 <4> [876.736444] R10: 000000001418e4b7 R11: 000000007f0ea93b R12: ffff888361ffcab8 <4> [876.736445] R13: ffff88843b6d0000 R14: 000000000000217c R15: 0000000000000001 <4> [876.736447] FS: 00007f4e6f255240(0000) GS:ffff88849fd00000(0000) knlGS:0000000000000000 <4> [876.736448] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [876.736449] CR2: 00007f093a9173c0 CR3: 00000003bba08005 CR4: 0000000000760ee0 <4> [876.736450] PKRU: 55555554 <4> [876.736451] Call Trace: <4> [876.736488] i915_request_retire+0x224/0x8e0 [i915] <4> [876.736521] i915_request_create+0x4b/0x1b0 [i915] <4> [876.736550] nop_virtual_engine+0x230/0x4d0 [i915] Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111695 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190918145453.8800-1-chris@chris-wilson.co.uk (cherry picked from commit 37fa0de3c137d5f54f7e64f53495c9d501d42a4d) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-10-04drm/i915: Move request runtime management onto gtChris Wilson1-63/+1
Requests are run from the gt and are tided into the gt runtime power management, so pull the runtime request management under gt/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-12-chris@chris-wilson.co.uk
2019-10-04drm/i915: Merge wait_for_timelines with retire_requestChris Wilson1-3/+23
wait_for_timelines is essentially the same loop as retiring requests (with an extra timeout), so merge the two into one routine. v2: i915_retire_requests_timeout and keep VT'd w/a as !interruptible Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-10-chris@chris-wilson.co.uk
2019-10-04drm/i915: Coordinate i915_active with its own mutexChris Wilson1-36/+2
Forgo the struct_mutex serialisation for i915_active, and interpose its own mutex handling for active/retire. This is a multi-layered sleight-of-hand. First, we had to ensure that no active/retire callbacks accidentally inverted the mutex ordering rules, nor assumed that they were themselves serialised by struct_mutex. More challenging though, is the rule over updating elements of the active rbtree. Instead of the whole i915_active now being serialised by struct_mutex, allocations/rotations of the tree are serialised by the i915_active.mutex and individual nodes are serialised by the caller using the i915_timeline.mutex (we need to use nested spinlocks to interact with the dma_fence callback lists). The pain point here is that instead of a single mutex around execbuf, we now have to take a mutex for active tracker (one for each vma, context, etc) and a couple of spinlocks for each fence update. The improvement in fine grained locking allowing for multiple concurrent clients (eventually!) should be worth it in typical loads. v2: Add some comments that barely elucidate anything :( Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk
2019-09-23drm/i915: Only enqueue already completed requestsChris Wilson1-12/+32
If we are asked to submit a completed request, just move it onto the active-list without modifying it's payload. If we try to emit the modified payload of a completed request, we risk racing with the ring->head update during retirement which may advance the head past our breadcrumb and so we generate a warning for the emission being behind the RING_HEAD. v2: Commentary for the sneaky, shared responsibility between functions. v3: Spelling mistakes and bonus assertion Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923110056.15176-3-chris@chris-wilson.co.uk
2019-09-20drm/i915: Lock signaler timeline while navigatingChris Wilson1-22/+46
As we need to take a walk back along the signaler timeline to find the fence before upon which we want to wait, we need to lock that timeline to prevent it being modified as we walk. Similarly, we also need to acquire a reference to the earlier fence while it still exists! Though we lack the correct locking today, we are saved by the overarching struct_mutex -- but that protection is being removed. v2: Tvrtko made me realise I was being lax and using annotations to ignore the AB-BA deadlock from the timeline overlap. As it would be possible to construct a second request that was using a semaphore from the same timeline as ourselves, we could quite easily end up in a situation where we deadlocked in our mutex waits. Avoid that by using a trylock and falling back to a normal dma-fence await if contended. v3: Eek, the signal->timeline is volatile and must be carefully dereferenced to ensure it is valid. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-2-chris@chris-wilson.co.uk
2019-09-20drm/i915: Mark i915_request.timeline as a volatile, rcu pointerChris Wilson1-11/+17
The request->timeline is only valid until the request is retired (i.e. before it is completed). Upon retiring the request, the context may be unpinned and freed, and along with it the timeline may be freed. We therefore need to be very careful when chasing rq->timeline that the pointer does not disappear beneath us. The vast majority of users are in a protected context, either during request construction or retirement, where the timeline->mutex is held and the timeline cannot disappear. It is those few off the beaten path (where we access a second timeline) that need extra scrutiny -- to be added in the next patch after first adding the warnings about dangerous access. One complication, where we cannot use the timeline->mutex itself, is during request submission onto hardware (under spinlocks). Here, we want to check on the timeline to finalize the breadcrumb, and so we need to impose a second rule to ensure that the request->timeline is indeed valid. As we are submitting the request, it's context and timeline must be pinned, as it will be used by the hardware. Since it is pinned, we know the request->timeline must still be valid, and we cannot submit the idle barrier until after we release the engine->active.lock, ergo while submitting and holding that spinlock, a second thread cannot release the timeline. v2: Don't be lazy inside selftests; hold the timeline->mutex for as long as we need it, and tidy up acquiring the timeline with a bit of refactoring (i915_active_add_request) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
2019-09-19drm/i915: Verify the engine after acquiring the active.lockChris Wilson1-3/+22
When using virtual engines, the rq->engine is not stable until we hold the engine->active.lock (as the virtual engine may be exchanged with the sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") we may retire a request concurrently with resubmitting it to HW, we need to be extra careful to verify we are holding the correct lock for the request's active list. This is similar to the issue we saw with rescheduling the virtual requests, see sched_lock_engine(). Or else: <4> [876.736126] list_add corruption. prev->next should be next (ffff8883f931a1f8), but was dead000000000100. (prev=ffff888361ffa610). <4> [876.736136] WARNING: CPU: 2 PID: 21 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70 <4> [876.736137] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel e1000e cdc_ether usbnet mii snd_pcm ptp pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915] <4> [876.736154] CPU: 2 PID: 21 Comm: ksoftirqd/2 Tainted: G U 5.3.0-CI-CI_DRM_6898+ #1 <4> [876.736156] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019 <4> [876.736157] RIP: 0010:__list_add_valid+0x4d/0x70 <4> [876.736159] Code: c3 48 89 d1 48 c7 c7 20 33 0e 82 48 89 c2 e8 4a 4a bc ff 0f 0b 31 c0 c3 48 89 c1 4c 89 c6 48 c7 c7 70 33 0e 82 e8 33 4a bc ff <0f> 0b 31 c0 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 c0 33 0e 82 e8 <4> [876.736160] RSP: 0018:ffffc9000018bd30 EFLAGS: 00010082 <4> [876.736162] RAX: 0000000000000000 RBX: ffff888361ffc840 RCX: 0000000000000104 <4> [876.736163] RDX: 0000000080000104 RSI: 0000000000000000 RDI: 00000000ffffffff <4> [876.736164] RBP: ffffc9000018bd68 R08: 0000000000000000 R09: 0000000000000001 <4> [876.736165] R10: 00000000aed95de3 R11: 000000007fe927eb R12: ffff888361ffca10 <4> [876.736166] R13: ffff888361ffa610 R14: ffff888361ffc880 R15: ffff8883f931a1f8 <4> [876.736168] FS: 0000000000000000(0000) GS:ffff88849fd00000(0000) knlGS:0000000000000000 <4> [876.736169] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [876.736170] CR2: 00007f093a9173c0 CR3: 00000003bba08005 CR4: 0000000000760ee0 <4> [876.736171] PKRU: 55555554 <4> [876.736172] Call Trace: <4> [876.736226] __i915_request_submit+0x152/0x370 [i915] <4> [876.736263] __execlists_submission_tasklet+0x6da/0x1f50 [i915] <4> [876.736293] ? execlists_submission_tasklet+0x29/0x50 [i915] <4> [876.736321] execlists_submission_tasklet+0x34/0x50 [i915] <4> [876.736325] tasklet_action_common.isra.5+0x47/0xb0 <4> [876.736328] __do_softirq+0xd8/0x4ae <4> [876.736332] ? smpboot_thread_fn+0x23/0x280 <4> [876.736334] ? smpboot_thread_fn+0x6b/0x280 <4> [876.736336] run_ksoftirqd+0x2b/0x50 <4> [876.736338] smpboot_thread_fn+0x1d3/0x280 <4> [876.736341] ? sort_range+0x20/0x20 <4> [876.736343] kthread+0x119/0x130 <4> [876.736345] ? kthread_park+0xa0/0xa0 <4> [876.736347] ret_from_fork+0x24/0x50 <4> [876.736353] irq event stamp: 2290145 <4> [876.736356] hardirqs last enabled at (2290144): [<ffffffff8123cde8>] __slab_free+0x3e8/0x500 <4> [876.736358] hardirqs last disabled at (2290145): [<ffffffff819cfb4d>] _raw_spin_lock_irqsave+0xd/0x50 <4> [876.736360] softirqs last enabled at (2290114): [<ffffffff81c0033e>] __do_softirq+0x33e/0x4ae <4> [876.736361] softirqs last disabled at (2290119): [<ffffffff810b815b>] run_ksoftirqd+0x2b/0x50 <4> [876.736363] WARNING: CPU: 2 PID: 21 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70 <4> [876.736364] ---[ end trace 3e58d6c7356c65bf ]--- <4> [876.736406] ------------[ cut here ]------------ <4> [876.736415] list_del corruption. prev->next should be ffff888361ffca10, but was ffff88840ac2c730 <4> [876.736421] WARNING: CPU: 2 PID: 5490 at lib/list_debug.c:53 __list_del_entry_valid+0x79/0x90 <4> [876.736422] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel e1000e cdc_ether usbnet mii snd_pcm ptp pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915] <4> [876.736433] CPU: 2 PID: 5490 Comm: i915_selftest Tainted: G U W 5.3.0-CI-CI_DRM_6898+ #1 <4> [876.736435] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019 <4> [876.736436] RIP: 0010:__list_del_entry_valid+0x79/0x90 <4> [876.736438] Code: 0b 31 c0 c3 48 89 fe 48 c7 c7 30 34 0e 82 e8 ae 49 bc ff 0f 0b 31 c0 c3 48 89 f2 48 89 fe 48 c7 c7 68 34 0e 82 e8 97 49 bc ff <0f> 0b 31 c0 c3 48 c7 c7 a8 34 0e 82 e8 86 49 bc ff 0f 0b 31 c0 c3 <4> [876.736439] RSP: 0018:ffffc900003ef758 EFLAGS: 00010086 <4> [876.736440] RAX: 0000000000000000 RBX: ffff888361ffc840 RCX: 0000000000000002 <4> [876.736442] RDX: 0000000080000002 RSI: 0000000000000000 RDI: 00000000ffffffff <4> [876.736443] RBP: ffffc900003ef780 R08: 0000000000000000 R09: 0000000000000001 <4> [876.736444] R10: 000000001418e4b7 R11: 000000007f0ea93b R12: ffff888361ffcab8 <4> [876.736445] R13: ffff88843b6d0000 R14: 000000000000217c R15: 0000000000000001 <4> [876.736447] FS: 00007f4e6f255240(0000) GS:ffff88849fd00000(0000) knlGS:0000000000000000 <4> [876.736448] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [876.736449] CR2: 00007f093a9173c0 CR3: 00000003bba08005 CR4: 0000000000760ee0 <4> [876.736450] PKRU: 55555554 <4> [876.736451] Call Trace: <4> [876.736488] i915_request_retire+0x224/0x8e0 [i915] <4> [876.736521] i915_request_create+0x4b/0x1b0 [i915] <4> [876.736550] nop_virtual_engine+0x230/0x4d0 [i915] Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111695 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190918145453.8800-1-chris@chris-wilson.co.uk
2019-09-17drm/i915/tgl: Extend MI_SEMAPHORE_WAITChris Wilson1-5/+16
On Tigerlake, MI_SEMAPHORE_WAIT grew an extra dword, so be sure to update the length field and emit that extra parameter and any padding noop as required. v2: Define the token shift while we are adding the updated MI_SEMAPHORE_WAIT v3: Use int instead of bool in the addition so that readers are not left wondering about the intricacies of the C spec. Now they just have to worry what the integer value of a boolean operation is... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190917123055.28965-1-chris@chris-wilson.co.uk
2019-09-06drm/i915: Hold irq-off for the entire fake lock periodChris Wilson1-4/+5
Sadly lockdep records when the irqs are re-enabled and then marks up the fake lock as being irq-unsafe. Our hand is forced and so we must mark up the entire fake lock critical section as irq-off. Hopefully this is the last tweak required! v2: Not quite, we need to mark the timeline spinlock as irqsafe. That was a genuine bug being hidden by the earlier lockdep splat. Fixes: d67739268cf0 ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190823132700.25286-2-chris@chris-wilson.co.uk (cherry picked from commit 6dcb85a0ad990455ae7c596e3fc966ad9c1ba9c5) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-08-28drm/i915: Protect our local workers against I915_FENCE_TIMEOUTChris Wilson1-1/+1
Trust our own workers to not cause unnecessary delays and disable the automatic timeout on their asynchronous fence waits. (Along the same lines that we trust our own requests to complete eventually, if necessary by force.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190826072149.9447-6-chris@chris-wilson.co.uk
2019-08-23drm/i915: Keep drm_i915_file_private around under RCUChris Wilson1-6/+7
Ensure that the drm_i915_file_private continues to exist as we attempt to remove a request from its list, which may race with the destruction of the file. <6> [38.380714] [IGT] gem_ctx_create: starting subtest basic-files <0> [42.201329] BUG: spinlock bad magic on CPU#0, kworker/u16:0/7 <4> [42.201356] general protection fault: 0000 [#1] PREEMPT SMP PTI <4> [42.201371] CPU: 0 PID: 7 Comm: kworker/u16:0 Tainted: G U 5.3.0-rc5-CI-Patchwork_14169+ #1 <4> [42.201391] Hardware name: Dell Inc. OptiPlex 745 /0GW726, BIOS 2.3.1 05/21/2007 <4> [42.201594] Workqueue: i915 retire_work_handler [i915] <4> [42.201614] RIP: 0010:spin_dump+0x5a/0x90 <4> [42.201625] Code: 00 48 8d 88 c0 06 00 00 48 c7 c7 00 71 09 82 e8 35 ef 00 00 48 85 db 44 8b 4d 08 41 b8 ff ff ff ff 48 c7 c1 0b cd 0f 82 74 0e <44> 8b 83 e0 04 00 00 48 8d 8b c0 06 00 00 8b 55 04 48 89 ee 48 c7 <4> [42.201660] RSP: 0018:ffffc9000004bd80 EFLAGS: 00010202 <4> [42.201673] RAX: 0000000000000031 RBX: 6b6b6b6b6b6b6b6b RCX: ffffffff820fcd0b <4> [42.201688] RDX: 0000000000000000 RSI: ffff88803de266f8 RDI: 00000000ffffffff <4> [42.201703] RBP: ffff888038381ff8 R08: 00000000ffffffff R09: 000000006b6b6b6b <4> [42.201718] R10: 0000000041cb0b89 R11: 646162206b636f6c R12: ffff88802a618500 <4> [42.201733] R13: ffff88802b32c288 R14: ffff888038381ff8 R15: ffff88802b32c250 <4> [42.201748] FS: 0000000000000000(0000) GS:ffff88803de00000(0000) knlGS:0000000000000000 <4> [42.201765] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [42.201778] CR2: 00007f2cefc6d180 CR3: 00000000381ee000 CR4: 00000000000006f0 <4> [42.201793] Call Trace: <4> [42.201805] do_raw_spin_lock+0x66/0xb0 <4> [42.201898] i915_request_retire+0x548/0x7c0 [i915] <4> [42.201989] retire_requests+0x4d/0x60 [i915] <4> [42.202078] i915_retire_requests+0x144/0x2e0 [i915] <4> [42.202169] retire_work_handler+0x10/0x40 [i915] Recently, in commit 44c22f3f1a0a ("drm/i915: Serialize insertion into the file->mm.request_list"), we fixed a race on insertion. Now, it appears we also have a race with destruction! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190823181455.31910-1-chris@chris-wilson.co.uk
2019-08-23drm/i915: Hold irq-off for the entire fake lock periodChris Wilson1-4/+5
Sadly lockdep records when the irqs are re-enabled and then marks up the fake lock as being irq-unsafe. Our hand is forced and so we must mark up the entire fake lock critical section as irq-off. Hopefully this is the last tweak required! v2: Not quite, we need to mark the timeline spinlock as irqsafe. That was a genuine bug being hidden by the earlier lockdep splat. Fixes: d67739268cf0 ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190823132700.25286-2-chris@chris-wilson.co.uk
2019-08-22Merge drm/drm-next into drm-intel-next-queuedRodrigo Vivi1-2/+2
We need the rename of reservation_object to dma_resv. The solution on this merge came from linux-next: From: Stephen Rothwell <sfr@canb.auug.org.au> Date: Wed, 14 Aug 2019 12:48:39 +1000 Subject: [PATCH] drm: fix up fallout from "dma-buf: rename reservation_object to dma_resv" Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> --- drivers/gpu/drm/i915/gt/intel_engine_pool.c | 8 ++++---- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.c b/drivers/gpu/drm/i915/gt/intel_engine_pool.c index 03d90b49584a..4cd54c569911 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pool.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.c @@ -43,12 +43,12 @@ static int pool_active(struct i915_active *ref) { struct intel_engine_pool_node *node = container_of(ref, typeof(*node), active); - struct reservation_object *resv = node->obj->base.resv; + struct dma_resv *resv = node->obj->base.resv; int err; - if (reservation_object_trylock(resv)) { - reservation_object_add_excl_fence(resv, NULL); - reservation_object_unlock(resv); + if (dma_resv_trylock(resv)) { + dma_resv_add_excl_fence(resv, NULL); + dma_resv_unlock(resv); } err = i915_gem_object_pin_pages(node->obj); which is a simplified version from a previous one which had: Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-08-21Merge tag 'drm-misc-next-2019-08-19' of ↵Dave Airlie1-2/+2
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.4: UAPI Changes: Cross-subsystem Changes: Core Changes: - dma-buf: add reservation_object_fences helper, relax reservation_object_add_shared_fence, remove reservation_object seq number (and then restored) - dma-fence: Shrinkage of the dma_fence structure, Merge dma_fence_signal and dma_fence_signal_locked, Store the timestamp in struct dma_fence in a union with cb_list Driver Changes: - More dt-bindings YAML conversions - More removal of drmP.h includes - dw-hdmi: Support get_eld and various i2s improvements - gm12u320: Few fixes - meson: Global cleanup - panfrost: Few refactors, Support for GPU heap allocations - sun4i: Support for DDC enable GPIO - New panels: TI nspire, NEC NL8048HL11, LG Philips LB035Q02, Sharp LS037V7DW01, Sony ACX565AKM, Toppoly TD028TTEC1 Toppoly TD043MTEA1 Signed-off-by: Dave Airlie <airlied@redhat.com> [airlied: fixup dma_resv rename fallout] From: Maxime Ripard <maxime.ripard@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819141923.7l2adietcr2pioct@flea
2019-08-20drm/i915: Serialize insertion into the file->mm.request_listChris Wilson1-3/+3
Currently, we remove the from per-file request list for throttling and retirement under a dedicated spinlock, but insertion is governed by struct_mutex. This needs to be the same lock so that the retirement/insertion of neighbouring requests (at the tail) doesn't break the list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190820080907.4665-1-chris@chris-wilson.co.uk
2019-08-19drm/i915: Use 0 for the unordered contextChris Wilson1-2/+2
Since commit 078dec3326e2 ("dma-buf: add dma_fence_get_stub") the 0 fence context became an impossible match as it is used for an always signaled fence. We can simplify our timeline tracking by knowing that 0 always means no match. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819184404.24200-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20190819175109.5241-1-chris@chris-wilson.co.uk
2019-08-18drm/i915: Propagate fence errorsChris Wilson1-0/+8
Errors spread like wildfire, and must eventually be returned to the user. They need to be captured and passed along the flow of fences, infecting each in turn with the existing error, until finally they fall out of a user visible result. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190817232511.11391-1-chris@chris-wilson.co.uk
2019-08-16drm/i915/gt: Mark context->active_count as protected by timeline->mutexChris Wilson1-1/+2
We use timeline->mutex to protect modifications to context->active_count, and the associated enable/disable callbacks. Due to complications with engine-pm barrier there is a path where we used a "superlock" to provide serialised protect and so could not unconditionally assert with lockdep that it was always held. However, we can mark the mutex as taken (noting that we may be nested underneath ourselves) which means we can be reassured the right timeline->mutex is always treated as held and let lockdep roam free. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190816121000.8507-1-chris@chris-wilson.co.uk
2019-08-15drm/i915: Protect request retirement with timeline->mutexChris Wilson1-83/+73
Forgo the struct_mutex requirement for request retirement as we have been transitioning over to only using the timeline->mutex for controlling the lifetime of a request on that timeline. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190815205709.24285-4-chris@chris-wilson.co.uk
2019-08-15drm/i915: Move tasklet kicking to __i915_request_queue callerChris Wilson1-2/+2
Since __i915_request_queue() may be called from hardirq (timer) context, we cannot use local_bh_disable/enable at the lower level. As we do want to kick the tasklet to speed up initial submission or preemption for normal client submission, lift it to the normal process context callpath. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190815042031.27750-1-chris@chris-wilson.co.uk
2019-08-13drm/i915: Push the wakeref->count deferral to the backendChris Wilson1-31/+35
If the backend wishes to defer the wakeref parking, make it responsible for unlocking the wakeref (i.e. bumping the counter). This allows it to time the unlock much more carefully in case it happens to needs the wakeref to be active during its deferral. For instance, during engine parking we may choose to emit an idle barrier (a request). To do so, we borrow the engine->kernel_context timeline and to ensure exclusive access we keep the engine->wakeref.count as 0. However, to submit that request to HW may require a intel_engine_pm_get() (e.g. to keep the submission tasklet alive) and before we allow that we have to rewake our wakeref to avoid a recursive deadlock. <4> [257.742916] IRQs not enabled as expected <4> [257.742930] WARNING: CPU: 0 PID: 0 at kernel/softirq.c:169 __local_bh_enable_ip+0xa9/0x100 <4> [257.742936] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 btusb btrtl btbcm btintel snd_hda_intel snd_intel_nhlt bluetooth snd_hda_codec coretemp snd_hwdep crct10dif_pclmul snd_hda_core crc32_pclmul ecdh_generic ecc ghash_clmulni_intel snd_pcm r8169 realtek lpc_ich prime_numbers i2c_hid <4> [257.742991] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G U W 5.3.0-rc3-g5d0a06cd532c-drmtip_340+ #1 <4> [257.742998] Hardware name: GIGABYTE GB-BXBT-1900/MZBAYAB-00, BIOS F6 02/17/2015 <4> [257.743008] RIP: 0010:__local_bh_enable_ip+0xa9/0x100 <4> [257.743017] Code: 37 5b 5d c3 8b 80 50 08 00 00 85 c0 75 a9 80 3d 0b be 25 01 00 75 a0 48 c7 c7 f3 0c 06 ac c6 05 fb bd 25 01 01 e8 77 84 ff ff <0f> 0b eb 89 48 89 ef e8 3b 41 06 00 eb 98 e8 e4 5c f4 ff 5b 5d c3 <4> [257.743025] RSP: 0018:ffffa78600003cb8 EFLAGS: 00010086 <4> [257.743035] RAX: 0000000000000000 RBX: 0000000000000200 RCX: 0000000000010302 <4> [257.743042] RDX: 0000000080010302 RSI: 0000000000000000 RDI: 00000000ffffffff <4> [257.743050] RBP: ffffffffc0494bb3 R08: 0000000000000000 R09: 0000000000000001 <4> [257.743058] R10: 0000000014c8f0e9 R11: 00000000fee2ff8e R12: ffffa23ba8c38008 <4> [257.743065] R13: ffffa23bacc579c0 R14: ffffa23bb7db0f60 R15: ffffa23b9cc8c430 <4> [257.743074] FS: 0000000000000000(0000) GS:ffffa23bbba00000(0000) knlGS:0000000000000000 <4> [257.743082] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [257.743089] CR2: 00007fe477b20778 CR3: 000000011f72a000 CR4: 00000000001006f0 <4> [257.743096] Call Trace: <4> [257.743104] <IRQ> <4> [257.743265] __i915_request_commit+0x240/0x5d0 [i915] <4> [257.743427] ? __i915_request_create+0x228/0x4c0 [i915] <4> [257.743584] __engine_park+0x64/0x250 [i915] <4> [257.743730] ____intel_wakeref_put_last+0x1c/0x70 [i915] <4> [257.743878] i915_sample+0x2ee/0x310 [i915] <4> [257.744030] ? i915_pmu_cpu_offline+0xb0/0xb0 [i915] <4> [257.744040] __hrtimer_run_queues+0x11e/0x4b0 <4> [257.744068] hrtimer_interrupt+0xea/0x250 <4> [257.744079] ? lockdep_hardirqs_off+0x79/0xd0 <4> [257.744101] smp_apic_timer_interrupt+0x96/0x280 <4> [257.744114] apic_timer_interrupt+0xf/0x20 <4> [257.744125] RIP: 0010:__do_softirq+0xb3/0x4ae v2: Keep the priority_hint assert v3: That assert was desperately trying to point out my bug. Sorry, little assert. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111378 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190813190705.23869-1-chris@chris-wilson.co.uk
2019-08-13dma-buf: rename reservation_object to dma_resvChristian König1-2/+2
Be more consistent with the naming of the other DMA-buf objects. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/323401/
2019-08-09drm/i915: Lift timeline into intel_contextChris Wilson1-5/+5
Move the timeline from being inside the intel_ring to intel_context itself. This saves much pointer dancing and makes the relations of the context to its timeline much clearer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-4-chris@chris-wilson.co.uk
2019-08-07drm/i915: avoid including intel_drv.h via i915_drv.h->i915_trace.hJani Nikula1-0/+1
Disentangle i915_drv.h from intel_drv.h, which gets included via i915_trace.h. This necessitates including i915_trace.h wherever it's needed. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ed82bf259d3b725a1a1a3c3e9d6fb5c08bc4d489.1565085691.git.jani.nikula@intel.com
2019-08-06drm/i915/gem: Make caps.scheduler staticChris Wilson1-2/+0
We do not notify userspace when the scheduler capabilities are changed (due to wedging the driver) and as such userspace will expect the caps to be static and unchanging. Make it so, and so we only need to compute our caps once during driver registration. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-1-chris@chris-wilson.co.uk
2019-07-12drm/i915/gt: Use intel_gt as the primary object for handling resetsChris Wilson1-3/+2
Having taken the first step in encapsulating the functionality by moving the related files under gt/, the next step is to start encapsulating by passing around the relevant structs rather than the global drm_i915_private. In this step, we pass intel_gt to intel_reset.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk
2019-07-09drm/i915: add infrastructure to hold off preemption on a requestLionel Landwerlin1-2/+2
We want to set this flag in the next commit on requests containing perf queries so that the result of the perf query can just be a delta of global counters, rather than doing post processing of the OA buffer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [ickle: add basic selftest for nopreempt] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190709164227.25859-1-chris@chris-wilson.co.uk
2019-06-21drm/i915: Rename i915_timeline to intel_timeline and move under gtTvrtko Ursulin1-7/+7
Move all timeline code under gt and rename to intel_gt prefix. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-32-tvrtko.ursulin@linux.intel.com
2019-06-20drm/i915/execlists: Preempt-to-busyChris Wilson1-0/+6
When using a global seqno, we required a precise stop-the-workd event to handle preemption and unwind the global seqno counter. To accomplish this, we would preempt to a special out-of-band context and wait for the machine to report that it was idle. Given an idle machine, we could very precisely see which requests had completed and which we needed to feed back into the run queue. However, now that we have scrapped the global seqno, we no longer need to precisely unwind the global counter and only track requests by their per-context seqno. This allows us to loosely unwind inflight requests while scheduling a preemption, with the enormous caveat that the requests we put back on the run queue are still _inflight_ (until the preemption request is complete). This makes request tracking much more messy, as at any point then we can see a completed request that we believe is not currently scheduled for execution. We also have to be careful not to rewind RING_TAIL past RING_HEAD on preempting to the running context, and for this we use a semaphore to prevent completion of the request before continuing. To accomplish this feat, we change how we track requests scheduled to the HW. Instead of appending our requests onto a single list as we submit, we track each submission to ELSP as its own block. Then upon receiving the CS preemption event, we promote the pending block to the inflight block (discarding what was previously being tracked). As normal CS completion events arrive, we then remove stale entries from the inflight tracker. v2: Be a tinge paranoid and ensure we flush the write into the HWS page for the GPU semaphore to pick in a timely fashion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk
2019-06-19drm/i915: Flush the execution-callbacks on retiringChris Wilson1-44/+49
In the unlikely case the request completes while we regard it as not even executing on the GPU (see the next patch!), we have to flush any pending execution callbacks at retirement and ensure that we do not add any more. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-4-chris@chris-wilson.co.uk
2019-06-19drm/i915: Signal fence completion from i915_request_waitChris Wilson1-1/+3
With the upcoming change to automanaged i915_active, the intent is that whenever we wait on the set of active fences, they are signaled and collected. The requirement is that all successful returns from i915_request_wait() signal the fence, so fixup the one remaining path where we may return before the interrupt has been run. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190619112341.9082-3-chris@chris-wilson.co.uk
2019-06-19drm/i915: Stop passing I915_WAIT_LOCKED to i915_request_wait()Chris Wilson1-4/+0
Since commit eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex"), the I915_WAIT_LOCKED flags passed to i915_request_wait() has been defunct. Now go ahead and remove it from all callers. References: eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-3-chris@chris-wilson.co.uk
2019-06-19drm/i915: Make the semaphore saturation mask globalChris Wilson1-2/+2
The idea behind keeping the saturation mask local to a context backfired spectacularly. The premise with the local mask was that we would be more proactive in attempting to use semaphores after each time the context idled, and that all new contexts would attempt to use semaphores ignoring the current state of the system. This turns out to be horribly optimistic. If the system state is still oversaturated and the existing workloads have all stopped using semaphores, the new workloads would attempt to use semaphores and be deprioritised behind real work. The new contexts would not switch off using semaphores until their initial batch of low priority work had completed. Given sufficient backload load of equal user priority, this would completely starve the new work of any GPU time. To compensate, remove the local tracking in favour of keeping it as global state on the engine -- once the system is saturated and semaphores are disabled, everyone stops attempting to use semaphores until the system is idle again. One of the reason for preferring local context tracking was that it worked with virtual engines, so for switching to global state we could either do a complete check of all the virtual siblings or simply disable semaphores for those requests. This takes the simpler approach of disabling semaphores on virtual engines. The downside is that the decision that the engine is saturated is a local measure -- we are only checking whether or not this context was scheduled in a timely fashion, it may be legitimately delayed due to user priorities. We still have the same dilemma though, that we do not want to employ the semaphore poll unless it will be used. v2: Explain why we need to assume the worst wrt virtual engines. Fixes: ca6e56f654e7 ("drm/i915: Disable semaphore busywaits on saturated systems") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Dmitry Ermilov <dmitry.ermilov@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-8-chris@chris-wilson.co.uk
2019-06-18drm/i915: Use drm_gem_object.resvChris Wilson1-2/+2
Since commit 1ba627148ef5 ("drm: Add reservation_object to drm_gem_object"), struct drm_gem_object grew its own builtin reservation_object rendering our own private one bloat. Remove our redundant reservation_object and point into obj->base.resv instead. References: 1ba627148ef5 ("drm: Add reservation_object to drm_gem_object") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618125858.7295-1-chris@chris-wilson.co.uk
2019-06-14drm/i915: Replace engine->timeline with a plain listChris Wilson1-30/+13
To continue the onslaught of removing the assumption of a global execution ordering, another casualty is the engine->timeline. Without an actual timeline to track, it is overkill and we can replace it with a much less grand plain list. We still need a list of requests inflight, for the simple purpose of finding inflight requests (for retiring, resetting, preemption etc). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-3-chris@chris-wilson.co.uk
2019-06-14drm/i915: Stop retiring along engineChris Wilson1-78/+53
We no longer track the execution order along the engine and so no longer need to enforce ordering of retire along the engine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-2-chris@chris-wilson.co.uk
2019-06-14drm/i915: Keep contexts pinned until after the next kernel context switchChris Wilson1-15/+0
We need to keep the context image pinned in memory until after the GPU has finished writing into it. Since it continues to write as we signal the final breadcrumb, we need to keep it pinned until the request after it is complete. Currently we know the order in which requests execute on each engine, and so to remove that presumption we need to identify a request/context-switch we know must occur after our completion. Any request queued after the signal must imply a context switch, for simplicity we use a fresh request from the kernel context. The sequence of operations for keeping the context pinned until saved is: - On context activation, we preallocate a node for each physical engine the context may operate on. This is to avoid allocations during unpinning, which may be from inside FS_RECLAIM context (aka the shrinker) - On context deactivation on retirement of the last active request (which is before we know the context has been saved), we add the preallocated node onto a barrier list on each engine - On engine idling, we emit a switch to kernel context. When this switch completes, we know that all previous contexts must have been saved, and so on retiring this request we can finally unpin all the contexts that were marked as deactivated prior to the switch. We can enhance this in future by flushing all the idle contexts on a regular heartbeat pulse of a switch to kernel context, which will also be used to check for hung engines. v2: intel_context_active_acquire/_release Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-1-chris@chris-wilson.co.uk
2019-06-14drm/i915: Refine i915_reset.lock_mapChris Wilson1-2/+10
We already use a mutex to serialise i915_reset() and wedging, so all we need it to link that into i915_request_wait() and we have our lock cycle detection. v2.5: Take error mutex for selftests Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614071023.17929-3-chris@chris-wilson.co.uk
2019-06-14drm/i915: Execute signal callbacks from no-op i915_request_waitChris Wilson1-2/+4
If we enter i915_request_wait() with an already completed request, but unsignaled dma-fence, signal the fence before returning. This allows us to execute any of the signal callbacks at the earliest opportunity. v2: Also signal after busyspin success Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614111053.25615-2-chris@chris-wilson.co.uk