summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/i915_gem.h
AgeCommit message (Collapse)AuthorFilesLines
2021-06-17drm/i915: Break out dma_resv ww locking utilities to separate filesThomas Hellström1-12/+0
As we're about to add more ww-related functionality, break out the dma_resv ww locking utilities to their own files Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-3-thomas.hellstrom@linux.intel.com
2021-03-25drm/i915: Use tasklet_unlock_spin_wait() in __tasklet_disable_sync_once()Sebastian Andrzej Siewior1-1/+1
The i915 driver has its own tasklet interface which was overseen in the tasklet rework. __tasklet_disable_sync_once() is a wrapper around tasklet_unlock_wait(). tasklet_unlock_wait() might sleep, but the i915 wrappers invokes it from non-preemtible contexts with bottom halves disabled. Use tasklet_unlock_spin_wait() instead which can be invoked from non-preemptible contexts. Fixes: da044747401fc ("tasklets: Replace spin wait in tasklet_unlock_wait()") Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210323092221.awq7g5b2muzypjw3@flow
2021-01-19drm/i915: Make GEM errors non-fatal by defaultChris Wilson1-1/+8
While immensely convenient for developing to only tackle the first error, and not be flooded by repeated or secondiary issues, many more casual testers are not setup to remotely capture debug traces. For those testers, it is more beneficial to keep the system running in the remote chance that they are able to extract the original debug logs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210114113434.8229-2-chris@chris-wilson.co.uk
2020-09-07drm/i915: Use per object locking in execbuf, v12.Maarten Lankhorst1-0/+1
Now that we changed execbuf submission slightly to allow us to do all pinning in one place, we can now simply add ww versions on top of struct_mutex. All we have to do is a separate path for -EDEADLK handling, which needs to unpin all gem bo's before dropping the lock, then starting over. This finally allows us to do parallel submission, but because not all of the pinning code uses the ww ctx yet, we cannot completely drop struct_mutex yet. Changes since v1: - Keep struct_mutex for now. :( Changes since v2: - Make sure we always lock the ww context in slowpath. Changes since v3: - Don't call __eb_unreserve_vma in eb_move_to_gpu now; this can be done on normal unlock path. - Unconditionally release vmas and context. Changes since v4: - Rebased on top of struct_mutex reduction. Changes since v5: - Remove training wheels. Changes since v6: - Fix accidentally broken -ENOSPC handling. Changes since v7: - Handle gt buffer pool better. Changes since v8: - Properly clear variables, to make -EDEADLK handling not BUG. Change since v9: - Fix unpinning fence on pnv and below. Changes since v10: - Make relocation gpu chaining working again. Changes since v11: - Remove relocation chaining, pain to make it work. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-9-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.Maarten Lankhorst1-0/+11
i915_gem_ww_ctx is used to lock all gem bo's for pinning and memory eviction. We don't use it yet, but lets start adding the definition first. To use it, we have to pass a non-NULL ww to gem_object_lock, and don't unlock directly. It is done in i915_gem_ww_ctx_fini. Changes since v1: - Change ww_ctx and obj order in locking functions (Jonas Lahtinen) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-6-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-07-06drm/i915: Print caller when tainting for CIMichał Winiarski1-1/+1
We can add taint from multiple places, printing the caller allows us to have a better overview of what exactly caused us to do the tainting. v2: Tweak format and print the device (Chris) v3: Move things around (Chris) Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200706144107.204821-2-michal@hardline.pl
2019-12-27Merge tag 'drm-intel-next-2019-12-23' of ↵Dave Airlie1-2/+6
git://anongit.freedesktop.org/drm/drm-intel into drm-next i915 features for v5.6: - Separate hardware and uapi state (Maarten) - Expose a number of sprite and plane formats (Ville) - DDC symlink in HDMI connector sysfs directory (Andrzej Pietrasiewicz) - Improve obj->mm.lock nesting lock annotation (Daniel) (Includes lockdep changes) - Selftest improvements across the board (Chris) - ICL/TGL VDSC support on DSI (Jani, Vandita) - TGL DSB fixes (Animesh, Lucas, Tvrtko) - VBT parsing improvements and fixes (Lucas, Matt, José, Jani, Dan Carpenter) - Fix LPSS vs. PMIC PWM backlight use on BYT/CHT (Hans) (Includes ACPI+MFD changes) - Display state, crtc, plane code refactoring (Ville) - Set opregion chpd value to indicate the driver handles hotplug (Hans de Goede) - DSI updates and fixes, TGL pipe D support, port mapping (José, Jani, Vandita) - Make HDCP 2.2 support cover CFL (Juston Li) - Fix CML PCI IDs and ULT (Shawn Lee) - CMP-V PCH fix (Imre) - TGL: Add another TGL PCH ID (James) - EHL/JSL: Add new PCI IDs (James) - Rename pipe update tracepoints (Ville) - Fix FBC on GLK+ (Ville) - GuC fixes and improvements (Daniele, Don Hiatt, Stuart Summers, Matthew Brost) - Display debugfs improvements (Ville) - Hotplug/irq fixes (Matt) - PSR fixes and improvements (José) - DRM_I915_GEM_MMAP_OFFSET ioctl (Abdiel) - Static analysis fixes (Colin Ian King) - Register sysctl path globally (Venkata Sandeep Dhanalakota) - Introduce new macros for tracing (Venkata Sandeep Dhanalakota) - Migrate gt towards intel_uncore_read/write (Andi) - Add rps frequency translation helpers (Andi) - Fix TGL transcoder clock off sequence (José) - Fix TGL port A audio (Kai Vehmanen) - TGL render decompression (DK) - GEM/GT improvements and fixes across the board (Chris) - Couple of backmerges (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> # gpg: Signature made Tue 24 Dec 2019 03:20:48 AM AEST # gpg: using RSA key D398079D26ABEE6F # gpg: Good signature from "Jani Nikula <jani.nikula@intel.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 1565 A65B 77B0 632E 1124 E59C D398 079D 26AB EE6F # Conflicts: # drivers/gpu/drm/i915/display/intel_fbc.c # drivers/gpu/drm/i915/gt/intel_lrc.c # drivers/gpu/drm/i915/i915_gem.c From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87lfr3rkry.fsf@intel.com
2019-12-17Merge tag 'drm-misc-next-2019-12-16' of ↵Daniel Vetter1-1/+1
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.6: UAPI Changes: - Add support for DMA-BUF HEAPS. Cross-subsystem Changes: - mipi dsi definition updates, pulled into drm-intel as well. - Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim. - Remove support for dma-buf kmap/kunmap. - Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well. Core Changes: - Small cleanups to ttm. - Fix SCDC definition. - Assorted cleanups to core. - Add todo to remove load/unload hooks, and use generic fbdev emulation. - Assorted documentation updates. - Use blocking ww lock in ttm fault handler. - Remove drm_fb_helper_fbdev_setup/teardown. - Warning fixes with W=1 for atomic. - Use drm_debug_enabled() instead of drm_debug flag testing in various drivers. - Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted) - Various kconfig indentation fixes in core and drivers. - Fix freeing transactions in dp-mst correctly. - Sean Paul is steping down as core maintainer. :-( - Add lockdep annotations for atomic locks vs dma-resv. - Prevent use-after-free for a bad job in drm_scheduler. - Fill out all block sizes in the P01x and P210 definitions. - Avoid division by zero in drm/rect, and fix bounds. - Add drm/rect selftests. - Add aspect ratio and alternate clocks for HDMI 4k modes. - Add todo for drm_framebuffer_funcs and fb_create cleanup. - Drop DRM_AUTH for prime import/export ioctls. - Clear DP-MST payload id tables downstream when initializating. - Fix for DSC throughput definition. - Add extra FEC definitions. - Fix fake offset in drm_gem_object_funs.mmap. - Stop using encoder->bridge in core directly - Handle bridge chaining slightly better. - Add backlight support to drm/panel, and use it in many panel drivers. - Increase max number of y420 modes from 128 to 256, as preparation to add the new modes. Driver Changes: - Small fixes all over. - Fix documentation in vkms. - Fix mmap_sem vs dma_resv in nouveau. - Small cleanup in komeda. - Add page flip support in gma500 for psb/cdv. - Add ddc symlink in the connector sysfs directory for many drivers. - Add support for analogic an6345, and fix small bugs in it. - Add atomic modesetting support to ast. - Fix radeon fault handler VMA race. - Switch udl to use generic shmem helpers. - Unconditional vblank handling for mcde. - Miscellaneous fixes to mcde. - Tweak debug output from komeda using debugfs. - Add gamma and color transform support to komeda for DOU-IPS. - Add support for sony acx424AKP panel. - Various small cleanups to gma500. - Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation. - Add support for Logic PD Type 28 panel. - Use drm_panel_* wrapper functions in exynos/tegra/msm. - Add devicetree bindings for generic DSI panels. - Don't include drm_pci.h directly in many drivers. - Add support for begin/end_cpu_access in udmabuf. - Stop using drm_get_pci_dev in gma500 and mga200. - Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access. - Add devfreq thermal support to panfrost. - Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager. - meson: Add support for OSD1 plane AFBC commit. - Stop displaying garbage when toggling ast primary plane on/off. - More cleanups and fixes to UDL. - Add D32 suport to komeda. - Remove globle copy of drm_dev in gma500. - Add support for Boe Himax8279d MIPI-DSI LCD panel. - Add support for ingenic JZ4770 panel. - Small null pointer deference fix in ingenic. - Remove support for the special tfp420 driver, as there is a generic way to do it. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ba73535a-9334-5302-2e1f-5208bd7390bd@linux.intel.com
2019-11-19drm/i915/gem: Manually dump the debug trace on GEM_BUG_ONChris Wilson1-0/+3
Since igt now defaults to not enabling ftrace-on-oops, we need to manually invoke GEM_TRACE_DUMP() to see the debug log prior to a GEM_BUG_ON panicking. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191119100929.2628356-1-chris@chris-wilson.co.uk
2019-11-14drm/i915: use drm_debug_enabled() to check for debug categoriesJani Nikula1-1/+1
Allow better abstraction of the drm_debug global variable in the future. No functional changes. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Sean Paul <sean@poorly.run> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e94fe4977c5b8cac68556318be81f8e422e973fd.1572258936.git.jani.nikula@intel.com
2019-11-11drm/i915: Taint the kernel on dumping the GEM ftrace bufferChris Wilson1-2/+3
As the ftrace buffer is single shot, once dumped it will not update. As such, it only provides information for the first bug and all subsequent bugs are noise. The goal of CI is to have zero bugs, so taint the kernel causing CI to reboot the machine; fix the bug and move on. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191110185806.17413-13-chris@chris-wilson.co.uk
2019-10-23drm/i915/execlists: Force preemptionChris Wilson1-14/+0
If the preempted context takes too long to relinquish control, e.g. it is stuck inside a shader with arbitration disabled, evict that context with an engine reset. This ensures that preemptions are reasonably responsive, providing a tighter QoS for the more important context at the cost of flagging unresponsive contexts more frequently (i.e. instead of using an ~10s hangcheck, we now evict at ~100ms). The challenge of lies in picking a timeout that can be reasonably serviced by HW for typical workloads, balancing the existing clients against the needs for responsiveness. Note that coupled with timeslicing, this will lead to rapid GPU "hang" detection with multiple active contexts vying for GPU time. The forced preemption mechanism can be compiled out with ./scripts/config --set-val DRM_I915_PREEMPT_TIMEOUT 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-2-chris@chris-wilson.co.uk
2019-10-16drm/i915/execlist: Trim immediate timeslice expiryChris Wilson1-0/+14
We perform timeslicing immediately upon receipt of a request that may be put into the second ELSP slot. The idea behind this was that since we didn't install the timer if the second ELSP slot was empty, we would not have any idea of how long ELSP[0] had been running and so giving the newcomer a chance on the GPU was fair. However, this causes us extra busy work that we may be able to avoid if we wait a jiffie for the first timeslice as normal. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191016100851.4979-1-chris@chris-wilson.co.uk
2019-10-14drm/i915/execlists: Assert tasklet is locked for process_csb()Chris Wilson1-0/+5
We rely on only the tasklet being allowed to call into process_csb(), so assert that is locked when we do. As the tasklet uses a simple bitlock, there is no strong lockdep checking so we must make do with a plain assertion that the tasklet is running and assume that we are the tasklet! v2: Fixup intel_gt_sanitize() to prepare each engine for the reset so that the locks are marked as held during the reset v3: Check for existent function pointers for very early sanitisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191014121336.30137-1-chris@chris-wilson.co.uk
2019-10-11drm/i915/execlists: Leave tell-tales as to why pending[] is badChris Wilson1-4/+7
Before we BUG out with bad pending state, leave a telltale as to which test failed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191010071434.31195-2-chris@chris-wilson.co.uk
2019-10-09drm/i915/gt: execlists->active is serialised by the taskletChris Wilson1-0/+6
The active/pending execlists is no longer protected by the engine->active.lock, but is serialised by the tasklet instead. Update the locking around the debug and stats to follow suit. v2: local_bh_disable() to prevent recursing into the tasklet in case we trigger a softirq (Tvrtko) Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191009160906.16195-1-chris@chris-wilson.co.uk
2019-08-07drm/i915: avoid including intel_drv.h via i915_drv.h->i915_trace.hJani Nikula1-0/+2
Disentangle i915_drv.h from intel_drv.h, which gets included via i915_trace.h. This necessitates including i915_trace.h wherever it's needed. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ed82bf259d3b725a1a1a3c3e9d6fb5c08bc4d489.1565085691.git.jani.nikula@intel.com
2019-05-22drm/i915: Load balancing across a virtual engineChris Wilson1-0/+5
Having allowed the user to define a set of engines that they will want to only use, we go one step further and allow them to bind those engines into a single virtual instance. Submitting a batch to the virtual engine will then forward it to any one of the set in a manner as best to distribute load. The virtual engine has a single timeline across all engines (it operates as a single queue), so it is not able to concurrently run batches across multiple engines by itself; that is left up to the user to submit multiple concurrent batches to multiple queues. Multiple users will be load balanced across the system. The mechanism used for load balancing in this patch is a late greedy balancer. When a request is ready for execution, it is added to each engine's queue, and when an engine is ready for its next request it claims it from the virtual engine. The first engine to do so, wins, i.e. the request is executed at the earliest opportunity (idle moment) in the system. As not all HW is created equal, the user is still able to skip the virtual engine and execute the batch on a specific engine, all within the same queue. It will then be executed in order on the correct engine, with execution on other virtual engines being moved away due to the load detection. A couple of areas for potential improvement left! - The virtual engine always take priority over equal-priority tasks. Mostly broken up by applying FQ_CODEL rules for prioritising new clients, and hopefully the virtual and real engines are not then congested (i.e. all work is via virtual engines, or all work is to the real engine). - We require the breadcrumb irq around every virtual engine request. For normal engines, we eliminate the need for the slow round trip via interrupt by using the submit fence and queueing in order. For virtual engines, we have to allow any job to transfer to a new ring, and cannot coalesce the submissions, so require the completion fence instead, forcing the persistent use of interrupts. - We only drip feed single requests through each virtual engine and onto the physical engines, even if there was enough work to fill all ELSP, leaving small stalls with an idle CS event at the end of every request. Could we be greedy and fill both slots? Being lazy is virtuous for load distribution on less-than-full workloads though. Other areas of improvement are more general, such as reducing lock contention, reducing dispatch overhead, looking at direct submission rather than bouncing around tasklets etc. sseu: Lift the restriction to allow sseu to be reconfigured on virtual engines composed of RENDER_CLASS (rcs). v2: macroize check_user_mbz() v3: Cancel virtual engines on wedging v4: Commence commenting v5: Replace 64b sibling_mask with a list of class:instance v6: Drop the one-element array in the uabi v7: Assert it is an virtual engine in to_virtual_engine() v8: Skip over holes in [class][inst] so we can selftest with (vcs0, vcs2) Link: https://github.com/intel/media-driver/pull/283 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-6-chris@chris-wilson.co.uk
2019-04-24drm/i915: Invert the GEM wakeref hierarchyChris Wilson1-3/+0
In the current scheme, on submitting a request we take a single global GEM wakeref, which trickles down to wake up all GT power domains. This is undesirable as we would like to be able to localise our power management to the available power domains and to remove the global GEM operations from the heart of the driver. (The intent there is to push global GEM decisions to the boundary as used by the GEM user interface.) Now during request construction, each request is responsible via its logical context to acquire a wakeref on each power domain it intends to utilize. Currently, each request takes a wakeref on the engine(s) and the engines themselves take a chipset wakeref. This gives us a transition on each engine which we can extend if we want to insert more powermangement control (such as soft rc6). The global GEM operations that currently require a struct_mutex are reduced to listening to pm events from the chipset GT wakeref. As we reduce the struct_mutex requirement, these listeners should evaporate. Perhaps the biggest immediate change is that this removes the struct_mutex requirement around GT power management, allowing us greater flexibility in request construction. Another important knock-on effect, is that by tracking engine usage, we can insert a switch back to the kernel context on that engine immediately, avoiding any extra delay or inserting global synchronisation barriers. This makes tracking when an engine and its associated contexts are idle much easier -- important for when we forgo our assumed execution ordering and need idle barriers to unpin used contexts. In the process, it means we remove a large chunk of code whose only purpose was to switch back to the kernel context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk
2019-04-02drm/i915: Move intel_engine_mask_t around for use by i915_request_types.hChris Wilson1-2/+0
We want to use intel_engine_mask_t inside i915_request.h, which means extracting it from the general header file mess and placing it inside a types.h. A knock on effect is that the compiler wants to warn about type-contraction of ALL_ENGINES into intel_engine_maskt_t, so prepare for the worst. v2: Use intel_engine_mask_t consistently v3: Move I915_NUM_ENGINES to its natural home at the end of the enum Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190401162641.10963-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2019-03-15drm/i915: Always kick the execlists tasklet after resetChris Wilson1-1/+6
With direct submission being disabled while the reset in progress, we have a small window where we may forgo the submission of a new request and not notice its addition during execlists_reset_finish. To close this window, always schedule the submission tasklet on coming out of reset to catch any residual work. <6> [333.144082] i915: Running intel_hangcheck_live_selftests/igt_reset_engines <3> [333.296927] i915_reset_engine(rcs0:idle): failed to idle after reset <6> [333.296932] i915 0000:00:02.0: [drm] rcs0 <6> [333.296934] i915 0000:00:02.0: [drm] Hangcheck 0:a9ddf7a5 [4157 ms] <6> [333.296936] i915 0000:00:02.0: [drm] Reset count: 36048 (global 754) <6> [333.296938] i915 0000:00:02.0: [drm] Requests: <6> [333.296997] i915 0000:00:02.0: [drm] RING_START: 0x00000000 <6> [333.296999] i915 0000:00:02.0: [drm] RING_HEAD: 0x00000000 <6> [333.297001] i915 0000:00:02.0: [drm] RING_TAIL: 0x00000000 <6> [333.297003] i915 0000:00:02.0: [drm] RING_CTL: 0x00000000 <6> [333.297005] i915 0000:00:02.0: [drm] RING_MODE: 0x00000200 [idle] <6> [333.297007] i915 0000:00:02.0: [drm] RING_IMR: fffffeff <6> [333.297010] i915 0000:00:02.0: [drm] ACTHD: 0x00000000_00000000 <6> [333.297012] i915 0000:00:02.0: [drm] BBADDR: 0x00000000_00000000 <6> [333.297015] i915 0000:00:02.0: [drm] DMA_FADDR: 0x00000000_00000000 <6> [333.297017] i915 0000:00:02.0: [drm] IPEIR: 0x00000000 <6> [333.297019] i915 0000:00:02.0: [drm] IPEHR: 0x00000000 <6> [333.297021] i915 0000:00:02.0: [drm] Execlist status: 0x00000001 00000000 <6> [333.297023] i915 0000:00:02.0: [drm] Execlist CSB read 5, write 5 [mmio:7], tasklet queued? no (enabled) <6> [333.297025] i915 0000:00:02.0: [drm] ELSP[0] idle <6> [333.297027] i915 0000:00:02.0: [drm] ELSP[1] idle <6> [333.297028] i915 0000:00:02.0: [drm] HW active? 0x0 <6> [333.297044] i915 0000:00:02.0: [drm] Queue priority hint: -8186 <6> [333.297067] i915 0000:00:02.0: [drm] Q 2afac:5f2+ prio=-8186 @ 50ms: (null) <6> [333.297068] i915 0000:00:02.0: [drm] HWSP: <6> [333.297071] i915 0000:00:02.0: [drm] [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <6> [333.297073] i915 0000:00:02.0: [drm] * <6> [333.297075] i915 0000:00:02.0: [drm] [0040] 00000001 00000000 00000018 00000002 00000001 00000000 00000018 00000000 <6> [333.297077] i915 0000:00:02.0: [drm] [0060] 00000001 00000000 00008002 00000002 00000000 00000000 00000000 00000005 <6> [333.297079] i915 0000:00:02.0: [drm] [0080] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <6> [333.297081] i915 0000:00:02.0: [drm] * <6> [333.297083] i915 0000:00:02.0: [drm] [00c0] 00000000 00000000 00000000 00000000 a9ddf7a5 00000000 00000000 00000000 <6> [333.297085] i915 0000:00:02.0: [drm] [00e0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <6> [333.297087] i915 0000:00:02.0: [drm] * <6> [333.297089] i915 0000:00:02.0: [drm] Idle? no <6> [333.297090] i915_reset_engine(rcs0:idle): 3000 resets <3> [333.297092] i915/intel_hangcheck_live_selftests: igt_reset_engines failed with error -5 <3> [333.455460] i915 0000:00:02.0: Failed to idle engines, declaring wedged! ... <0> [333.491294] i915_sel-4916 1.... 333262143us : i915_reset_engine: rcs0 flags=4 <0> [333.491328] i915_sel-4916 1.... 333262143us : execlists_reset_prepare: rcs0: depth<-0 <0> [333.491362] i915_sel-4916 1.... 333262143us : intel_engine_stop_cs: rcs0 <0> [333.491396] i915_sel-4916 1d..1 333262144us : process_csb: rcs0 cs-irq head=5, tail=5 <0> [333.491424] i915_sel-4916 1.... 333262145us : intel_gpu_reset: engine_mask=1 <0> [333.491454] kworker/-214 5.... 333262184us : i915_gem_switch_to_kernel_context: awake?=yes <0> [333.491487] kworker/-214 5.... 333262192us : i915_request_add: rcs0 fence 2afac:1522 <0> [333.491520] kworker/-214 5.... 333262193us : i915_request_add: marking (null) as active <0> [333.491553] i915_sel-4916 1.... 333262199us : intel_engine_cancel_stop_cs: rcs0 <0> [333.491587] i915_sel-4916 1.... 333262199us : execlists_reset_finish: rcs0: depth->0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190313162835.30228-1-chris@chris-wilson.co.uk
2019-03-07drm/i915: Make I915_GEM_IDLE_TIMEOUT into a macroChris Wilson1-0/+2
Currently we use HZ/5 for detecting a dead gpu on startup, and we will wish to reuse this value for detecting a dead gpu on suspend, so convert it into a macro for later convenience. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190307104530.21745-1-chris@chris-wilson.co.uk
2018-10-18drm/i915: GEM_WARN_ON considered harmfulTvrtko Ursulin1-1/+3
GEM_WARN_ON currently has dangerous semantics where it is completely compiled out on !GEM_DEBUG builds. This can leave users who expect it to be more like a WARN_ON, just without a warning in non-debug builds, in complete ignorance. Another gotcha with it is that it cannot be used as a statement. Which is again different from a standard kernel WARN_ON. This patch fixes both problems by making it behave as one would expect. It can now be used both as an expression and as statement, and also the condition evaluates properly in all builds - code under the conditional will therefore not unexpectedly disappear. To satisfy call sites which really want the code under the conditional to completely disappear, we add GEM_DEBUG_WARN_ON and convert some of the callers to it. This one can also be used as both expression and statement. >From the above it follows GEM_DEBUG_WARN_ON should be used in situations where we are certain the condition will be hit during development, but at a place in code where error can be handled to the benefit of not crashing the machine. GEM_WARN_ON on the other hand should be used where condition may happen in production and we just want to distinguish the level of debugging output emitted between the production and debug build. v2: * Dropped BUG_ON hunk. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181012063142.16080-1-tvrtko.ursulin@linux.intel.com
2018-08-29drm/i915/execlists: Flush tasklet directly from reset-finishChris Wilson1-6/+0
On finishing the reset, the intention is to restart the GPU before we relinquish the forcewake taken to handle the reset - the goal being the GPU reloads a context before it is allowed to sleep. For this purpose, we used tasklet_flush() which although it accomplished the goal of restarting the GPU, carried with it a sting in its tail: it cleared the TASKLET_STATE_SCHED bit. This meant that if another CPU queued a new request to this engine, we would clear the flag and later attempt to requeue the tasklet on the local CPU, breaking the per-cpu softirq lists. Remove the dangerous tasklet_kill() and just run the tasklet func directly as we know it is safe to do so (the tasklets are internally locked to allow mixed usage from direct submission). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180828152702.27536-1-chris@chris-wilson.co.uk
2018-06-28drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd)Chris Wilson1-0/+5
Back in commit 27af5eea54d1 ("drm/i915: Move execlists irq handler to a bottom half"), we came to the conclusion that running our CSB processing and ELSP submission from inside the irq handler was a bad idea. A really bad idea as we could impose nearly 1s latency on other users of the system, on average! Deferring our work to a tasklet allowed us to do the processing with irqs enabled, reducing the impact to an average of about 50us. We have since eradicated the use of forcewaked mmio from inside the CSB processing and ELSP submission, bringing the impact down to around 5us (on Kabylake); an order of magnitude better than our measurements 2 years ago on Broadwell and only about 2x worse on average than the gem_syslatency on an unladen system. In this iteration of the tasklet-vs-direct submission debate, we seek a compromise where by we submit new requests immediately to the HW but defer processing the CS interrupt onto a tasklet. We gain the advantage of low-latency and ksoftirqd avoidance when waking up the HW, while avoiding the system-wide starvation of our CS irq-storms. Comparing the impact on the maximum latency observed (that is the time stolen from an RT process) over a 120s interval, repeated several times (using gem_syslatency, similar to RT's cyclictest) while the system is fully laden with i915 nops, we see that direct submission an actually improve the worse case. Maximum latency in microseconds of a third party RT thread (gem_syslatency -t 120 -f 2) x Always using tasklets (a couple of >1000us outliers removed) + Only using tasklets from CS irq, direct submission of requests +------------------------------------------------------------------------+ | + | | + | | + | | + + | | + + + | | + + + + x x x | | +++ + + + x x x x x x | | +++ + ++ + + *x x x x x x | | +++ + ++ + * *x x * x x x | | + +++ + ++ * * +*xxx * x x xx | | * +++ + ++++* *x+**xx+ * x x xxxx x | | **x++++*++**+*x*x****x+ * +x xx xxxx x x | |x* ******+***************++*+***xxxxxx* xx*x xxx + x+| | |__________MA___________| | | |______M__A________| | +------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 118 91 186 124 125.28814 16.279137 + 120 92 187 109 112.00833 13.458617 Difference at 95.0% confidence -13.2798 +/- 3.79219 -10.5994% +/- 3.02677% (Student's t, pooled s = 14.9237) However the mean latency is adversely affected: Mean latency in microseconds of a third party RT thread (gem_syslatency -t 120 -f 1) x Always using tasklets + Only using tasklets from CS irq, direct submission of requests +------------------------------------------------------------------------+ | xxxxxx + ++ | | xxxxxx + ++ | | xxxxxx + +++ ++ | | xxxxxxx +++++ ++ | | xxxxxxx +++++ ++ | | xxxxxxx +++++ +++ | | xxxxxxx + ++++++++++ | | xxxxxxxx ++ ++++++++++ | | xxxxxxxx ++ ++++++++++ | | xxxxxxxxxx +++++++++++++++ | | xxxxxxxxxxx x +++++++++++++++ | |x xxxxxxxxxxxxx x + + ++++++++++++++++++ +| | |__A__| | | |____A___| | +------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 120 3.506 3.727 3.631 3.6321417 0.02773109 + 120 3.834 4.149 4.039 4.0375167 0.041221676 Difference at 95.0% confidence 0.405375 +/- 0.00888913 11.1608% +/- 0.244735% (Student's t, pooled s = 0.03513) However, since the mean latency corresponds to the amount of irqsoff processing we have to do for a CS interrupt, we only need to speed that up to benefit not just system latency but our own throughput. v2: Remember to defer submissions when under reset. v4: Only use direct submission for new requests v5: Be aware that with mixing direct tasklet evaluation and deferred tasklets, we may end up idling before running the deferred tasklet. v6: Remove the redudant likely() from tasklet_is_enabled(), restrict the annotation to reset_in_progress(). v7: Take the full timeline.lock when enabling perf_pmu stats as the tasklet is no longer a valid guard. A consequence is that the stats are now only valid for engines also using the timeline.lock to process state. Testcase: igt/gem_exec_latency/*rthog* References: 27af5eea54d1 ("drm/i915: Move execlists irq handler to a bottom half") Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-9-chris@chris-wilson.co.uk
2018-05-25drm/i915/execlists: Wait for ELSP submission on restartChris Wilson1-0/+6
After a reset, we will ensure that there is at least one request submitted to HW to ensure that a context is loaded for powersaving. Let's wait for this submission via a tasklet to complete before we drop our forcewake, ensuring the system is ready for rc6 before we let it possibly sleep. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180522101937.7738-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2018-05-24drm/i915: Look for an active kernel context before switchingChris Wilson1-0/+3
We were not very carefully checking to see if an older request on the engine was an earlier switch-to-kernel-context before deciding to emit a new switch. The end result would be that we could get into a permanent loop of trying to emit a new request to perform the switch simply to flush the existing switch. What we need is a means of tracking the completion of each timeline versus the kernel context, that is to detect if a more recent request has been submitted that would result in a switch away from the kernel context. To realise this, we need only to look in our syncmap on the kernel context and check that we have synchronized against all active rings. v2: Since all ringbuffer clients currently share the same timeline, we do have to use the gem_context to distinguish clients. As a bonus, include all the tracing used to debug the death inside suspend. v3: Test, test, test. Construct a selftest to exercise and assert the expected behaviour that multiple switch-to-contexts do not emit redundant requests. Reported-by: Mika Kuoppala <mika.kuoppala@intel.com> Fixes: a89d1f921c15 ("drm/i915: Split i915_gem_timeline into individual timelines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180524081135.15278-1-chris@chris-wilson.co.uk
2018-05-16drm/i915: Only sync tasklets once for recursive reset preparationChris Wilson1-0/+7
When setting up reset, we may need to recursively prepare an engine. In which case we should only synchronously flush the tasklets on the outer most call, the inner calls will then be inside an atomic section where the tasklet will never be run (and so the sync will never complete). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180516183355.10553-2-chris@chris-wilson.co.uk
2018-04-26drm/i915: Compile out engine debug for releaseChris Wilson1-0/+6
The majority of the engine state dumping is too voluminous to be useful outside of a controlled setup, though a few do accompany severe errors. Keep the debug dumps next to the errors, but hide the others behind a CI compile flag. This becomes more useful when adding more dumps to latency sensitive paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180426103219.22181-1-chris@chris-wilson.co.uk
2018-04-06drm/i915: Split out parking from the idle worker for reuseChris Wilson1-0/+5
We will want to park GEM before disengaging the drive^W^W^W unwedging. Since we already do the work for idling, expose the guts as a new function that we can then reuse. v2: Just skip if already parked; makes it more forgiving to use by future callers. v3: Extract mark_busy, rename it to i915_gem_unpark and place it next to i915_gem_park so that we can evaluate it for symmetry more easily. Calling GEM from inside i915_request looks to be a bit of a layering violation, for the moment I am imaging them as being notify_cb. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> #v1 Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180406155144.27791-1-chris@chris-wilson.co.uk
2018-03-13drm/i915: Show GEM_TRACE when detecting a failed GPU idleChris Wilson1-0/+2
If we timeout waiting for the GPU to idle, something went seriously wrong. We currently dump the engine state, but we can also dump the ftrace buffer showing our last operations (when available). In passing, note that since commit 559e040f1f08 ("drm/i915: Show the GPU state when declaring wedged", we now show the engine state twice, once in detecting the failed idle and then again on declaring wedged. v2: ftrace_dump() takes a parameter specifying whether to dump all cpu buffers or the local cpu's. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180309101114.1138-1-chris@chris-wilson.co.uk
2018-03-01drm/i915/icl: Prepare for more ringsTvrtko Ursulin1-1/+1
Gen11 will add more VCS and VECS rings so prepare the infrastructure to support that. Bspec: 7021 v2: Rebase. v3: Rebase. v4: Rebase. v5: Rebase. v6: - Update for POR changes. (Daniele Ceraolo Spurio) - Add provisional guc engine ids - to be checked and confirmed. v7: - Rebased. - Added the new ring masks. - Added the new HW ids. v8: - Introduce I915_MAX_VCS/VECS to avoid magic numbers (Michal) v9: increase MAX_ENGINE_INSTANCE to 3 Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180228101153.7224-1-mika.kuoppala@linux.intel.com
2018-02-28drm/i915: Repeat the GEM_BUG_ON message in the ftrace logChris Wilson1-1/+4
As the ftrace log is overflowing the pstore capture, we lose the last gasps from dmesg which includes the GEM_BUG_ON function:line and condition that failed. Vital information for tracking down the bug, so append it to the frace log as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180227211816.5546-1-chris@chris-wilson.co.uk
2017-11-16drm/i915: Print the condition causing GEM_BUG_ONMika Kuoppala1-1/+5
It is easier to categorize and debug bugs if the failed condition is in plain sight in the actual dmesg output. Make it so. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171116083954.3357-1-mika.kuoppala@linux.intel.com
2017-11-09drm/i915: Use trace_printk to provide a death rattle for GEMChris Wilson1-0/+6
Trying to enable printk debugging for GEM is fraught with the issue of spam; interactions with HW are very frequent and often boring. However, one instance where they are not so boring is just before a BUG; here ftrace provides a facility to dump its ringbuffer on an oops. So for CI let's enable trace_printk() to capture the last exchanges with HW as a death rattle. For example, [ 79.234110] ------------[ cut here ]------------ [ 79.234137] kernel BUG at drivers/gpu/drm/i915/intel_lrc.c:907! [ 79.234145] invalid opcode: 0000 [#1] SMP [ 79.234153] Dumping ftrace buffer: [ 79.234158] --------------------------------- ... [ 79.314044] gem_conc-1059 1..s1 79203443us : intel_lrc_irq_handler: bcs0 out[0]: ctx=5.2, seqno=145 [ 79.314089] gem_conc-1059 1..s. 79220800us : intel_lrc_irq_handler: bcs0 csb[1/1]: status=0x00000018:0x00000005 [ 79.314133] gem_conc-1059 1..s. 79220803us : intel_lrc_irq_handler: bcs0 out[0]: ctx=5.1, seqno=145 [ 79.314177] gem_conc-1062 2..s1 79230458us : intel_lrc_irq_handler: bcs0 in[0]: ctx=8.1, seqno=146 [ 79.314220] gem_conc-1062 2..s1 79230515us : intel_lrc_irq_handler: bcs0 in[0]: ctx=8.2, seqno=147 [ 79.314265] gem_conc-1059 1..s1 79230951us : intel_lrc_irq_handler: bcs0 csb[2/3]: status=0x00000012:0x00000008 [ 79.314309] gem_conc-1059 1..s1 79230954us : intel_lrc_irq_handler: bcs0 out[0]: ctx=8.2, seqno=147 [ 79.314353] gem_conc-1059 1..s1 79230954us : intel_lrc_irq_handler: bcs0 csb[3/3]: status=0x00008002:0x00000008 [ 79.314396] gem_conc-1059 1..s1 79230955us : intel_lrc_irq_handler: bcs0 out[0]: ctx=8.1, seqno=147 [ 79.314402] --------------------------------- v2: Tweak the formatting to be more consistent between in/out. v3: do {} while (0) stub macro protection Suggested-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171109143019.16568-1-chris@chris-wilson.co.uk
2017-05-03drm/i915: Squash repeated awaits on the same fenceChris Wilson1-0/+2
Track the latest fence waited upon on each context, and only add a new asynchronous wait if the new fence is more recent than the recorded fence for that context. This requires us to filter out unordered timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the absence of a universal identifier, we have to use our own i915->mm.unordered_timeline token. v2: Throw around the debug crutches v3: Inline the likely case of the pre-allocation cache being full. v4: Drop the pre-allocation support, we can lose the most recent fence in case of allocation failure -- it just means we may emit more awaits than strictly necessary but will not break. v5: Trim allocation size for leaf nodes, they only need an array of u32 not pointers. v6: Create mock_timeline to tidy selftest writing v7: s/intel_timeline_sync_get/intel_timeline_sync_is_later/ (Tvrtko) v8: Prune the stale sync points when we idle. v9: Include a small benchmark in the kselftests v10: Separate the idr implementation into its own compartment. (Tvrkto) v11: Refactor igt_sync kselftests to avoid deep nesting (Tvrkto) v12: __sync_leaf_idx() to assert that p->height is 0 when checking leaves v13: kselftests to investigate struct i915_syncmap itself (Tvrtko) v14: Foray into ascii art graphs v15: Take into account that the random lookup/insert does 2 prng calls, not 1, when benchmarking, and use for_each_set_bit() (Tvrtko) v16: Improved ascii art Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170503093924.5320-4-chris@chris-wilson.co.uk
2017-02-10drm/i915: Rename conditional GEM execution macrosChris Wilson1-6/+6
After a brief discussion, we settled on a naming convention for the conditional GEM debugging data that should be clearer to the casual user: GEM_DEBUG Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170207102319.10910-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-02-06drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance()Chris Wilson1-0/+9
It is required that the caller declare the exact number of dwords they wish to write into the ring. This is required for two reasons, we need to allocate sufficient space for the entire command packet and we need to be sure that the contents are completely written to avoid executing stale data. The current interface requires for any bug to be caught in review, the reader has to carefully count the number of intel_ring_emit() between intel_ring_begin() and intel_ring_advance(). If we record the end of the packet of each intel_ring_begin() we can also have CI check for us. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170206170502.30944-1-chris@chris-wilson.co.uk
2016-12-16drm/i915: introduce GEM_WARN_ONMatthew Auld1-0/+2
In a similar spirit to GEM_BUG_ON we now also have GEM_WARN_ON, with the simple goal of expressing warnings which are truly insane, and so are only really useful for CI where we have some abusive tests. v2: - use BUILD_BUG_ON_INVALID for !DEBUG_GEM - clarify commit message Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-1-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-12-05drm/i915: allow GEM_BUG_ON expr checking with !DEBUG_GEMMatthew Auld1-1/+1
Use BUILD_BUG_ON_INVALID(expr) in GEM_BUG_ON when building without DEBUG_GEM. This means the compiler can now check the validity of expr without generating any code, in turn preventing us from inadvertently breaking the build when DEBUG_GEM is not enabled. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161202184750.3843-1-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-08drm/i915: avoid harmless empty-body warningArnd Bergmann1-1/+1
The newly added assert_kernel_context_is_current introduces a warning when built with W=1: drivers/gpu/drm/i915/i915_gem.c: In function ‘assert_kernel_context_is_current’: drivers/gpu/drm/i915/i915_gem.c:4417:63: error: suggest braces around empty body in an ‘else’ statement [-Werror=empty-body] Changing the GEM_BUG_ON() macro from an empty definition to "do { } while (0)" makes the macro more robust to use and avoids the warning. Fixes: 3033acab07f9 ("drm/i915: Queue the idling context switch after all other timelines") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161108135834.2166677-1-arnd@arndb.de
2016-10-28drm/i915: Combine seqno + tracking into a global timeline structChris Wilson1-0/+2
Our timelines are more than just a seqno. They also provide an ordered list of requests to be executed. Due to the restriction of handling individual address spaces, we are limited to a timeline per address space but we use a fence context per engine within. Our first step to introducing independent timelines per context (i.e. to allow each context to have a queue of requests to execute that have a defined set of dependencies on other requests) is to provide a timeline abstraction for the global execution queue. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
2016-04-14drm/i915: Add GEM debugging Kconfig optionChris Wilson1-0/+34
Currently there is a #define to enable extra BUG_ON for debugging requests and associated activities. I want to expand its use to cover all of GEM internals (so that we can saturate the code with asserts). We can add a Kconfig option to make it easier to enable - with the usual caveats of not enabling unless explicitly requested. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-3-git-send-email-chris@chris-wilson.co.uk