summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-04-29drm/i915/gt: Keep a no-frills swappable copy of the default context stateChris Wilson14-123/+291
We need to keep the default context state around to instantiate new contexts (aka golden rendercontext), and we also keep it pinned while the engine is active so that we can quickly reset a hanging context. However, the default contexts are large enough to merit keeping in swappable memory as opposed to kernel memory, so we store them inside shmemfs. Currently, we use the normal GEM objects to create the default context image, but we can throw away all but the shmemfs file. This greatly simplifies the tricky power management code which wants to run underneath the normal GT locking, and we definitely do not want to use any high level objects that may appear to recurse back into the GT. Though perhaps the primary advantage of the complex GEM object is that we aggressively cache the mapping, but here we are recreating the vm_area everytime time we unpark. At the worst, we add a lightweight cache, but first find a microbenchmark that is impacted. Having started to create some utility functions to make working with shmemfs objects easier, we can start putting them to wider use, where GEM objects are overkill, such as storing persistent error state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk
2020-04-29drm/i915/selftests: fix error handling in __live_lrc_indirect_ctx_bb()Dan Carpenter1-12/+18
If intel_context_create() fails then it leads to an error pointer dereference. I shuffled things around to make error handling easier. Fixes: 1dd47b54baea ("drm/i915: Add live selftests for indirect ctx batchbuffers") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200429132425.GE815283@mwanda
2020-04-29drm/i915: Avoid dereferencing a dead contextChris Wilson1-5/+7
Once the intel_context is closed, the GEM context may be freed and so the link from intel_context.gem_context is invalid. <3>[ 219.782944] BUG: KASAN: use-after-free in intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <3>[ 219.782996] Read of size 8 at addr ffff8881d7dff0b8 by task kworker/0:1/12 <4>[ 219.783052] CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G U 5.7.0-rc2-g1f3ffd7683d54-kasan_118+ #1 <4>[ 219.783055] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017 <4>[ 219.783105] Workqueue: events heartbeat [i915] <4>[ 219.783109] Call Trace: <4>[ 219.783113] <IRQ> <4>[ 219.783119] dump_stack+0x96/0xdb <4>[ 219.783177] ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <4>[ 219.783182] print_address_description.constprop.6+0x16/0x310 <4>[ 219.783239] ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <4>[ 219.783295] ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <4>[ 219.783300] __kasan_report+0x137/0x190 <4>[ 219.783359] ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <4>[ 219.783366] kasan_report+0x32/0x50 <4>[ 219.783426] intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <4>[ 219.783481] execlists_reset+0x39c/0x13d0 [i915] <4>[ 219.783494] ? mark_held_locks+0x9e/0xe0 <4>[ 219.783546] ? execlists_hold+0xfc0/0xfc0 [i915] <4>[ 219.783551] ? lockdep_hardirqs_on+0x348/0x5f0 <4>[ 219.783557] ? _raw_spin_unlock_irqrestore+0x34/0x60 <4>[ 219.783606] ? execlists_submission_tasklet+0x118/0x3a0 [i915] <4>[ 219.783615] tasklet_action_common.isra.14+0x13b/0x410 <4>[ 219.783623] ? __do_softirq+0x1e4/0x9a7 <4>[ 219.783630] __do_softirq+0x226/0x9a7 <4>[ 219.783643] do_softirq_own_stack+0x2a/0x40 <4>[ 219.783647] </IRQ> <4>[ 219.783692] ? heartbeat+0x3e2/0x10f0 [i915] <4>[ 219.783696] do_softirq.part.13+0x49/0x50 <4>[ 219.783700] __local_bh_enable_ip+0x1a2/0x1e0 <4>[ 219.783748] heartbeat+0x409/0x10f0 [i915] <4>[ 219.783801] ? __live_idle_pulse+0x9f0/0x9f0 [i915] <4>[ 219.783806] ? lock_acquire+0x1ac/0x8a0 <4>[ 219.783811] ? process_one_work+0x811/0x1870 <4>[ 219.783827] ? rcu_read_lock_sched_held+0x9c/0xd0 <4>[ 219.783832] ? rcu_read_lock_bh_held+0xb0/0xb0 <4>[ 219.783836] ? _raw_spin_unlock_irq+0x1f/0x40 <4>[ 219.783845] process_one_work+0x8ca/0x1870 <4>[ 219.783848] ? lock_acquire+0x1ac/0x8a0 <4>[ 219.783852] ? worker_thread+0x1d0/0xb80 <4>[ 219.783864] ? pwq_dec_nr_in_flight+0x2c0/0x2c0 <4>[ 219.783870] ? do_raw_spin_lock+0x129/0x290 <4>[ 219.783886] worker_thread+0x82/0xb80 <4>[ 219.783895] ? __kthread_parkme+0xaf/0x1b0 <4>[ 219.783902] ? process_one_work+0x1870/0x1870 <4>[ 219.783906] kthread+0x34e/0x420 <4>[ 219.783911] ? kthread_create_on_node+0xc0/0xc0 <4>[ 219.783918] ret_from_fork+0x3a/0x50 <3>[ 219.783950] Allocated by task 1264: <4>[ 219.783975] save_stack+0x19/0x40 <4>[ 219.783978] __kasan_kmalloc.constprop.3+0xa0/0xd0 <4>[ 219.784029] i915_gem_create_context+0xa2/0xab8 [i915] <4>[ 219.784081] i915_gem_context_create_ioctl+0x1fa/0x450 [i915] <4>[ 219.784085] drm_ioctl_kernel+0x1d8/0x270 <4>[ 219.784088] drm_ioctl+0x676/0x930 <4>[ 219.784092] ksys_ioctl+0xb7/0xe0 <4>[ 219.784096] __x64_sys_ioctl+0x6a/0xb0 <4>[ 219.784100] do_syscall_64+0x94/0x530 <4>[ 219.784103] entry_SYSCALL_64_after_hwframe+0x49/0xb3 <3>[ 219.784120] Freed by task 12: <4>[ 219.784141] save_stack+0x19/0x40 <4>[ 219.784145] __kasan_slab_free+0x130/0x180 <4>[ 219.784148] kmem_cache_free_bulk+0x1bd/0x500 <4>[ 219.784152] kfree_rcu_work+0x1d8/0x890 <4>[ 219.784155] process_one_work+0x8ca/0x1870 <4>[ 219.784158] worker_thread+0x82/0xb80 <4>[ 219.784162] kthread+0x34e/0x420 <4>[ 219.784165] ret_from_fork+0x3a/0x50 Fixes: 2e46a2a0b014 ("drm/i915: Use explicit flag to mark unreachable intel_context") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428090255.10035-1-chris@chris-wilson.co.uk
2020-04-29drm/i915/gt: Avoid uninitialized use of rpcurupei in frequency_showNathan Chancellor1-1/+1
When building with clang + -Wuninitialized: drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:407:7: warning: variable 'rpcurupei' is uninitialized when used here [-Wuninitialized] rpcurupei, ^~~~~~~~~ drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:304:16: note: initialize the variable 'rpcurupei' to silence this warning u32 rpcurupei, rpcurup, rpprevup; ^ = 0 1 warning generated. rpupei is assigned twice; based on the second argument to intel_uncore_read, it seems this one should have been assigned to rpcurupei. Fixes: 9c878557b1eb ("drm/i915/gt: Use the RPM config register to determine clk frequencies") Link: https://github.com/ClangBuiltLinux/linux/issues/1016 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200429030051.920203-1-natechancellor@gmail.com
2020-04-28drm/i915/execlists: Verify we don't submit two identical CCIDsChris Wilson1-9/+28
Check that we do not submit two contexts into ELSP with the same CCID [upper portion of the descriptor]. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1793 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-3-chris@chris-wilson.co.uk
2020-04-28drm/i915/execlists: Track inflight CCIDChris Wilson4-12/+25
The presumption is that by using a circular counter that is twice as large as the maximum ELSP submission, we would never reuse the same CCID for two inflight contexts. However, if we continually preempt an active context such that it always remains inflight, it can be resubmitted with an arbitrary number of paired contexts. As each of its paired contexts will use a new CCID, eventually it will wrap and submit two ELSP with the same CCID. Rather than use a simple circular counter, switch over to a small bitmap of inflight ids so we can avoid reusing one that is still potentially active. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: 2935ed5339c4 ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-2-chris@chris-wilson.co.uk
2020-04-28drm/i915/execlists: Avoid reusing the same logical CCIDChris Wilson7-40/+43
The bspec is confusing on the nature of the upper 32bits of the LRC descriptor. Once upon a time, it said that it uses the upper 32b to decide if it should perform a lite-restore, and so we must ensure that each unique context submitted to HW is given a unique CCID [for the duration of it being on the HW]. Currently, this is achieved by using a small circular tag, and assigning every context submitted to HW a new id. However, this tag is being cleared on repinning an inflight context such that we end up re-using the 0 tag for multiple contexts. To avoid accidentally clearing the CCID in the upper 32bits of the LRC descriptor, split the descriptor into two dwords so we can update the GGTT address separately from the CCID. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: 2935ed5339c4 ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk
2020-04-28drm/i915/tgl: Wa_14011059788Matt Atwood1-0/+4
Reflect recent Bspec changes v2: fix whitespace, typo Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Radhakrishna Sripada <Radhakrishna.sripada@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200415193535.14597-1-matthew.s.atwood@intel.com
2020-04-28drm/i915/selftests: Tweak the tolerance for clock ticks to 12.5%Chris Wilson1-4/+4
Give a small bump for our tolerance on comparing the expected vs measured clock ticks/time from 10% to 12.5% to accommodate a bad result on Sandybridge that was off by 10.3%. Hopefully, that is the worst we will see. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1802 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428114307.5153-1-chris@chris-wilson.co.uk
2020-04-28drm/i915/gt: fix spelling mistake "evalution" -> "evaluation"Colin Ian King1-1/+1
There is a spelling mistaking in a pr_notice message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200428084920.1035125-1-colin.king@canonical.com
2020-04-27drm/i915: Use proper fault mask in interrupt postinstall tooMatt Roper1-4/+2
The IRQ postinstall handling had open-coded pipe fault mask selection that never got updated for gen11. Switch it to use gen8_de_pipe_fault_mask() to ensure we don't miss updates for new platforms. Cc: José Roberto de Souza <jose.souza@intel.com> Fixes: d506a65d56fd ("drm/i915: Catch GTT fault errors for gen11+ planes") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424231423.4065231-1-matthew.d.roper@intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2020-04-27drm/i915/gt: Fix up clock frequencyChris Wilson4-6/+147
The bspec lists both the clock frequency and the effective interval. The interval corresponds to observed behaviour, so adjust the frequency to match. v2: Mika rightfully asked if we could measure the clock frequency from a selftest. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200427154554.12736-1-chris@chris-wilson.co.uk
2020-04-27drm/i915/gt: Sanitize GT firstChris Wilson2-1/+5
We see that if the HW doesn't actually sleep, the HW may eat the poison we set in its write-only HWSP during sanitize: intel_gt_resume.part.8: 0000:00:02.0 __gt_unpark: 0000:00:02.0 gt_sanitize: 0000:00:02.0 force:yes process_csb: 0000:00:02.0 vcs0: cs-irq head=5, tail=90 process_csb: 0000:00:02.0 vcs0: csb[0]: status=0x5a5a5a5a:0x5a5a5a5a assert_pending_valid: Nothing pending for promotion! The CS TAIL pointer should have been reset by reset_csb_pointers(), so in this case it is likely that we have read back from the CPU cache and so we must clflush our control over that page. In doing so, push the sanitisation to the start of the GT sequence so that our poisoning is assuredly before we start talking to the HW. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1794 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200427084000.10999-1-chris@chris-wilson.co.uk
2020-04-27drm/i915/gt: Check cacheline is valid before acquiringChris Wilson1-0/+2
The hwsp_cacheline pointer from i915_request is very, very flimsy. The i915_request.timeline (and the hwsp_cacheline) are lost upon retiring (after an RCU grace). Therefore we need to confirm that once we have the right pointer for the cacheline, it is not in the process of being retired and disposed of before we attempt to acquire a reference to the cacheline. <3>[ 547.208237] BUG: KASAN: use-after-free in active_debug_hint+0x6a/0x70 [i915] <3>[ 547.208366] Read of size 8 at addr ffff88822a0d2710 by task gem_exec_parall/2536 <4>[ 547.208547] CPU: 3 PID: 2536 Comm: gem_exec_parall Tainted: G U 5.7.0-rc2-ged7a286b5d02d-kasan_117+ #1 <4>[ 547.208556] Hardware name: Dell Inc. XPS 13 9350/, BIOS 1.4.12 11/30/2016 <4>[ 547.208564] Call Trace: <4>[ 547.208579] dump_stack+0x96/0xdb <4>[ 547.208707] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208719] print_address_description.constprop.6+0x16/0x310 <4>[ 547.208841] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208963] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208975] __kasan_report+0x137/0x190 <4>[ 547.209106] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.209127] kasan_report+0x32/0x50 <4>[ 547.209257] ? i915_gemfs_fini+0x40/0x40 [i915] <4>[ 547.209376] active_debug_hint+0x6a/0x70 [i915] <4>[ 547.209389] debug_print_object+0xa7/0x220 <4>[ 547.209405] ? lockdep_hardirqs_on+0x348/0x5f0 <4>[ 547.209426] debug_object_assert_init+0x297/0x430 <4>[ 547.209449] ? debug_object_free+0x360/0x360 <4>[ 547.209472] ? lock_acquire+0x1ac/0x8a0 <4>[ 547.209592] ? intel_timeline_read_hwsp+0x4f/0x840 [i915] <4>[ 547.209737] ? i915_active_acquire_if_busy+0x66/0x120 [i915] <4>[ 547.209861] i915_active_acquire_if_busy+0x66/0x120 [i915] <4>[ 547.209990] ? __live_alloc.isra.15+0xc0/0xc0 [i915] <4>[ 547.210005] ? rcu_read_lock_sched_held+0xd0/0xd0 <4>[ 547.210017] ? print_usage_bug+0x580/0x580 <4>[ 547.210153] intel_timeline_read_hwsp+0xbc/0x840 [i915] <4>[ 547.210284] __emit_semaphore_wait+0xd5/0x480 [i915] <4>[ 547.210415] ? i915_fence_get_timeline_name+0x110/0x110 [i915] <4>[ 547.210428] ? lockdep_hardirqs_on+0x348/0x5f0 <4>[ 547.210442] ? _raw_spin_unlock_irq+0x2a/0x40 <4>[ 547.210567] ? __await_execution.constprop.51+0x2e0/0x570 [i915] <4>[ 547.210706] i915_request_await_dma_fence+0x8f7/0xc70 [i915] Fixes: 85bedbf191e8 ("drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.6+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200427093038.29219-1-chris@chris-wilson.co.uk
2020-04-27drm/i915/execlists: Check preempt-timeout target before submit_portsChris Wilson1-1/+1
We evaluate *active, which is a pointer into execlists->inflight[] during dequeue to decide how long a preempt-timeout we need to apply. However, as soon as we do the submit_ports, the HW may send its ACK interrupt causing us to promote execlists->pending[] tp execlists->inflight[], overwriting the value of *active. We know *active is only stable until we submit (as we only submit when there is no pending promotion). [ 16.102328] BUG: KCSAN: data-race in execlists_dequeue+0x1449/0x1600 [i915] [ 16.102356] [ 16.102375] race at unknown origin, with read to 0xffff8881e9500488 of 8 bytes by task 429 on cpu 1: [ 16.102780] execlists_dequeue+0x1449/0x1600 [i915] [ 16.103160] __execlists_submission_tasklet+0x48/0x60 [i915] [ 16.103540] execlists_submit_request+0x38e/0x3c0 [i915] [ 16.103940] submit_notify+0x8f/0xc0 [i915] [ 16.104308] __i915_sw_fence_complete+0x61/0x420 [i915] [ 16.104683] i915_sw_fence_complete+0x58/0x80 [i915] [ 16.105054] i915_sw_fence_commit+0x16/0x20 [i915] [ 16.105457] __i915_request_queue+0x60/0x70 [i915] [ 16.105843] i915_gem_do_execbuffer+0x2d6b/0x4230 [i915] [ 16.106227] i915_gem_execbuffer2_ioctl+0x2b0/0x580 [i915] [ 16.106257] drm_ioctl_kernel+0xe9/0x130 [ 16.106279] drm_ioctl+0x27d/0x45e [ 16.106311] ksys_ioctl+0x89/0xb0 [ 16.106336] __x64_sys_ioctl+0x42/0x60 [ 16.106370] do_syscall_64+0x6e/0x2c0 [ 16.106397] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200426094231.21995-1-chris@chris-wilson.co.uk
2020-04-27drm/i915: re-disable -Wframe-addressNick Desaulniers1-0/+1
The top level Makefile disables this warning. When building an i386_defconfig with Clang, this warning is triggered a whole bunch via includes of headers from perf. Link: https://github.com/ClangBuiltLinux/continuous-integration/pull/182 Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200426214215.139435-1-ndesaulniers@google.com
2020-04-25drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTLMika Kuoppala3-3/+115
Use indirect ctx bb to load cmd buffer control value from context image to avoid corruption. v2: add to lrc layout (Chris) v3: end to a cacheline (Chris) v4: add to lrc fixed (Chris) v5: value in offset+1 Testcase: igt/i915_selftest/gt_lrc Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424230632.30333-1-mika.kuoppala@linux.intel.com
2020-04-25drm/i915: Add live selftests for indirect ctx batchbuffersMika Kuoppala1-0/+150
Indirect ctx batchbuffers are a hw feature of which batch can be run, by hardware, during context restoration stage. Driver can setup a batchbuffer and also an offset into the context image. When context image is marshalled from memory to registers, and when the offset from the start of context register state is equal of what driver pre-determined, batch will run. So one can manipulate context restoration process at cacheline granularity, given some limitations, as you need to have rudimentaries in place before you can run a batch. Add selftest which will write the ring start register to a canary spot. This will test that hardware will run a batchbuffer for the context in question. v2: request wait fix, naming (Chris) v3: test order (Chris) v4: rebase Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-3-mika.kuoppala@linux.intel.com
2020-04-25drm/i915: Add per ctx batchbuffer wa for timestampMika Kuoppala3-13/+125
Restoration of a previous timestamp can collide with updating the timestamp, causing a value corruption. Combat this issue by using indirect ctx bb to modify the context image during restoring process. We can preload value into scratch register. From which we then do the actual write with LRR. LRR is faster and thus less error prone as probability of race drops. v2: tidying (Chris) v3: lrr for all engines v4: grp v5: reg bit v6: wa_bb_offset, virtual engines (Chris) References: HSDES#16010904313 Testcase: igt/i915_selftest/gt_lrc Suggested-by: Joseph Koston <joseph.koston@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424230546.30271-1-mika.kuoppala@linux.intel.com
2020-04-25drm/i915: Add engine scratch register to live_lrc_fixedMika Kuoppala2-0/+17
General purpose registers are per engine and in a fixed location. Add to live_lrc_fixed. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-1-mika.kuoppala@linux.intel.com
2020-04-24drm/i915: Drop rq->ring->vma peeking from error captureChris Wilson2-4/+2
We only hold the active spinlock while dumping the error state, and this does not prevent another thread from retiring the request -- as it is quite possible that despite us capturing the current state, the GPU has completed the request. As such, it is dangerous to dereference state below the request as it may already be freed, and the simplest way to avoid the danger is not include it in the error state. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1788 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424191410.27570-1-chris@chris-wilson.co.uk
2020-04-24drm/i915/gt: Use the RPM config register to determine clk frequenciesChris Wilson11-75/+229
For many configuration details within RC6 and RPS we are programming intervals for the internal clocks. From gen11, these clocks are configuration via the RPM_CONFIG and so for convenience, we would like to convert to/from more natural units (ns). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-2-chris@chris-wilson.co.uk
2020-04-24drm/i915/gt: Trace RPS eventsChris Wilson1-4/+44
Add tracek to the RPS events (interrupts, worker, enabling, threshold selection, frequency setting), so that if we have to debug reticent HW we have some traces to start from. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-1-chris@chris-wilson.co.uk
2020-04-24drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUTChris Wilson1-23/+18
The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and upon receipt of that interrupt we reprogram the GPU clocks down to the next idle notch [to help convserve power during rc6]. However, on execlists, we benefit from soft-rc6 immediately parking the GPU and setting idle frequencies upon idling [within a jiffie], and here the interrupt prevents us from restarting from our last frequency. In the process, we can simply opt for a static pm_events mask and rely on the enable/disable interrupts to flush the worker on parking. This will reduce the amount of oscillation observed during steady workloads with microsleeps, as each time the rc6 timeout occurs we immediately follow with a waitboost for a dropped frame. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422001703.1697-1-chris@chris-wilson.co.uk
2020-04-24drm/i915: Split some long linesVille Syrjälä1-2/+8
Split some overly long lines. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-4-ville.syrjala@linux.intel.com Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-04-24drm/i915: Introduce .set_idle_link_train() vfuncVille Syrjälä3-27/+32
Relocate a bunch of DDI specific code from intel_dp.c to intel_ddi.c by introducing a .set_idle_link_train() vfunc. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-3-ville.syrjala@linux.intel.com Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-04-24drm/i915: Introduce .set_signal_levels() vfuncVille Syrjälä3-77/+149
Sort out some of the mess between intel_ddi.c intel_dp.c by introducing a .set_signal_levels() vfunc. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-2-ville.syrjala@linux.intel.com Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-04-24drm/i915: Introduce .set_link_train() vfuncVille Syrjälä3-81/+104
Sort out some of the mess between intel_ddi.c intel_dp.c by introducing a .set_link_train() vfunc. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-1-ville.syrjala@linux.intel.com Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-04-24drm/i915: Have pfit calculations return an error codeVille Syrjälä7-38/+49
Change intel_{gmch,pch}_panel_fitting() to return a normal error vs. success int. We'll need this later to validate that the margin properties aren't misconfigured. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-6-ville.syrjala@linux.intel.com Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
2020-04-24drm/i915: Pass connector state to pfit calculationsVille Syrjälä7-33/+31
Pass the entire connector state to intel_{gmch,pch}_panel_fitting(). For now we just need to get at .scaling_mode but in the future we'll want access to the margin properties as well. v2: Deal with intel_dp_ycbcr420_config() Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-5-ville.syrjala@linux.intel.com Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
2020-04-24drm/i915: s/pipe_config/crtc_state/ in pfit functionsVille Syrjälä7-65/+58
Follow the new naming convention and call the crtc state "crtc_state", and while at it drop the redundant crtc argument. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-4-ville.syrjala@linux.intel.com Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
2020-04-24drm/i915: Use drm_rect to store the pfit window pos/sizeVille Syrjälä3-50/+67
Make things a bit more abstract by replacing the pch_pfit.pos/size raw register values with a drm_rect. Makes it slighly more convenient to eg. compute the scaling factors. v2: Use drm_rect_init() Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-3-ville.syrjala@linux.intel.com Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
2020-04-24drm/i915: Flatten a bunch of the pfit functionsVille Syrjälä1-120/+117
Most of the pfit functions are of the form: func() { if (pfit_enabled) { ... } } Flip the pfit_enabled check around to flatten the functions. And while we're touching all this let's do the usual s/pipe_config/crtc_state/ replacement. Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-2-ville.syrjala@linux.intel.com
2020-04-24drm/i915: Fix skl+ non-scaled pfit modesVille Syrjälä4-46/+19
Fix skl_update_scaler_crtc() to deal with different scaling modes correctly. The current implementation assumes DRM_MODE_SCALE_FULLSCREEN. Fortunately we don't expose any border properties currently so the code does actually end up doing the right thing (assigning a scaler for pfit). The code does need to be fixed before any borders are exposed. Also we have redundant calls to skl_update_scaler_crtc() in dp/hdmi .compute_config() which can be nuked. They were anyway called before we had even computed the pfit state so were basically nonsense. The real call we need to keep is in intel_crtc_atomic_check(). v2: Deal witrh skl_update_scaler_crtc() in intel_dp_ycbcr420_config() Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-1-ville.syrjala@linux.intel.com
2020-04-24drm/i915: Only close vma we openChris Wilson9-107/+56
The history of i915_vma_close() is confusing, as is its use. As the lifetime of the i915_vma is currently bounded by the object it is attached to, we needed a means of identify when a vma was no longer in use by userspace (via the user's fd). This is further complicated by that only ppgtt vma should be closed at the user's behest, as the ggtt were always shared. Now that we attach the vma to a lut on the user's context, the open count does indicate how many unique and open context/vm are referencing this vma from the user. As such, we can and should just use the open_count to track when the vma is still in use by userspace. It's a poor man's replacement for reference counting. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk
2020-04-24drm/i915: Make define for lrc state offsetMika Kuoppala5-14/+14
More often than not, we need a byte offset into lrc register state from the start of the hw state. Make it so. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200423182355.21837-3-mika.kuoppala@linux.intel.com
2020-04-24drm/i915/selftests: Add context batchbuffers registers to live_lrc_fixedMika Kuoppala3-60/+89
Add per ctx bb and indirect ctx bb register locations to live_lrc_fixed for verification. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200423224159.22078-1-chris@chris-wilson.co.uk
2020-04-23drm/i915/selftests: Add request throughput measurement to perfChris Wilson2-1/+580
Under ideal circumstances, the driver should be able to keep the GPU fully saturated with work. Measure how close to ideal we get under the harshest of conditions with no user payload. v2: Also measure throughput using only one thread. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422074203.9799-1-chris@chris-wilson.co.uk
2020-04-23drm/i915/gt: Check carefully for an idle engine in wait-for-idleChris Wilson1-3/+12
intel_gt_wait_for_idle() tries to wait until all the outstanding requests are retired and the GPU is idle. As a side effect of retiring requests, we may submit more work to flush any pm barriers, and so the wait-for-idle tries to flush the background pm work and catch the new requests. However, if the work completed in the background before we were able to flush, it would queue the extra barrier request without us noticing -- and so we would return from wait-for-idle with one request remaining. (This breaks e.g. record_default_state where we need to wait until that barrier is retired, and it may slow suspend down by causing us to wait on the background retirement worker as opposed to immediately retiring the barrier.) However, since we track if there has been a submission since the engine pm barrier, we can very quickly detect if the idle barrier is still outstanding. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1763 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423085940.28168-1-chris@chris-wilson.co.uk
2020-04-23drm/i915/gt: Carefully order virtual_submission_taskletChris Wilson1-2/+5
During the virtual engine's submission tasklet, we take the request and insert into the submission queue on each of our siblings. This seems quite simply, and so no problems with ordering. However, the sibling execlists' submission tasklets may run concurrently with the virtual engine's tasklet, submitting the request to HW before the virtual finishes its task of telling all the siblings. If this happens, the sibling tasklet may *reorder* the ve->sibling[] array that the virtual engine tasklet is processing. This can *only* reorder within the elements already processed by the virtual engine, nevertheless the race is detected by KCSAN: [ 185.580014] BUG: KCSAN: data-race in execlists_dequeue [i915] / virtual_submission_tasklet [i915] [ 185.580054] [ 185.580076] write to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 2: [ 185.580553] execlists_dequeue+0x6ad/0x1600 [i915] [ 185.581044] __execlists_submission_tasklet+0x48/0x60 [i915] [ 185.581517] execlists_submission_tasklet+0xd3/0x170 [i915] [ 185.581554] tasklet_action_common.isra.0+0x42/0x90 [ 185.581585] __do_softirq+0xc8/0x206 [ 185.581613] run_ksoftirqd+0x15/0x20 [ 185.581641] smpboot_thread_fn+0x15a/0x270 [ 185.581669] kthread+0x19a/0x1e0 [ 185.581695] ret_from_fork+0x1f/0x30 [ 185.581717] [ 185.581736] read to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 0: [ 185.582231] virtual_submission_tasklet+0x10e/0x5c0 [i915] [ 185.582265] tasklet_action_common.isra.0+0x42/0x90 [ 185.582291] __do_softirq+0xc8/0x206 [ 185.582315] run_ksoftirqd+0x15/0x20 [ 185.582340] smpboot_thread_fn+0x15a/0x270 [ 185.582368] kthread+0x19a/0x1e0 [ 185.582395] ret_from_fork+0x1f/0x30 [ 185.582417] We can prevent this race by checking for the ve->request after looking up the sibling array. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423115315.26825-1-chris@chris-wilson.co.uk
2020-04-23drm/i915/icl: Fix timeout handling during TypeC AUX power well enablingImre Deak1-49/+25
Fix the check for when an AUX power well enabling timeout is expected on a legacy TypeC port. Fixes: 89e01caac641 ("drm/i915: Use single set of AUX powerwell ops for gen11+") Cc: Matt Roper <matthew.d.roper@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422123440.19522-1-imre.deak@intel.com
2020-04-22drm/i915/execlists: Drop request-before-CS assertionChris Wilson1-26/+7
When we migrated to execlists, one of the conditions we wanted to test for was whether the breadcrumb seqno was being written before the breadcumb interrupt was delivered. This was following on from issues observed on previous generations which were not so strongly ordered. With the removal of the missed interrupt detection, we have not reliable means of detecting the out-of-order seqno/interrupt but instead tried to assert that the relationship between the CS event interrupt and the breadwrite should be strongly ordered. However, Icelake proves it is possible for the HW implementation to forget about minor little details such as write ordering and so the order between *processing* the CS event and the breadcrumb is unreliable. Remove the unreliable assertion, but leave a debug telltale in case we have reason to suspect. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1658 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422141749.28709-1-chris@chris-wilson.co.uk
2020-04-22drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma()Chris Wilson2-6/+24
While the ggtt vma are protected by their object lifetime, the list continues until it hits a non-ggtt vma, and that vma is not protected and may be freed as we inspect it. Hence, we require the obj->vma.lock to protect the list as we iterate. An example of forgetting to hold the obj->vma.lock is [1642834.464973] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI [1642834.464977] CPU: 3 PID: 1954 Comm: Xorg Not tainted 5.6.0-300.fc32.x86_64 #1 [1642834.464979] Hardware name: LENOVO 20ARS25701/20ARS25701, BIOS GJET94WW (2.44 ) 09/14/2017 [1642834.465021] RIP: 0010:i915_gem_object_set_tiling+0x2c0/0x3e0 [i915] [1642834.465024] Code: 8b 84 24 18 01 00 00 f6 c4 80 74 59 49 8b 94 24 a0 00 00 00 49 8b 84 24 e0 00 00 00 49 8b 74 24 10 48 8b 92 30 01 00 00 89 c7 <80> ba 0a 06 00 00 03 0f 87 86 00 00 00 ba 00 00 08 00 b9 00 00 10 [1642834.465025] RSP: 0018:ffffa98780c77d60 EFLAGS: 00010282 [1642834.465028] RAX: ffff8d232bfb2578 RBX: 0000000000000002 RCX: ffff8d25873a0000 [1642834.465029] RDX: dead000000000122 RSI: fffff0af8ac6e408 RDI: 000000002bfb2578 [1642834.465030] RBP: ffff8d25873a0000 R08: ffff8d252bfb5638 R09: 0000000000000000 [1642834.465031] R10: 0000000000000000 R11: ffff8d252bfb5640 R12: ffffa987801cb8f8 [1642834.465032] R13: 0000000000001000 R14: ffff8d233e972e50 R15: ffff8d233e972d00 [1642834.465034] FS: 00007f6a3d327f00(0000) GS:ffff8d25926c0000(0000) knlGS:0000000000000000 [1642834.465036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1642834.465037] CR2: 00007f6a2064d000 CR3: 00000002fb57c001 CR4: 00000000001606e0 [1642834.465038] Call Trace: [1642834.465083] i915_gem_set_tiling_ioctl+0x122/0x230 [i915] [1642834.465121] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915] [1642834.465151] drm_ioctl_kernel+0x86/0xd0 [drm] [1642834.465156] ? avc_has_perm+0x3b/0x160 [1642834.465178] drm_ioctl+0x206/0x390 [drm] [1642834.465216] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915] [1642834.465221] ? selinux_file_ioctl+0x122/0x1c0 [1642834.465226] ? __do_munmap+0x24b/0x4d0 [1642834.465231] ksys_ioctl+0x82/0xc0 [1642834.465235] __x64_sys_ioctl+0x16/0x20 [1642834.465238] do_syscall_64+0x5b/0xf0 [1642834.465243] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [1642834.465245] RIP: 0033:0x7f6a3d7b047b [1642834.465247] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48 [1642834.465249] RSP: 002b:00007ffe71adba28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [1642834.465251] RAX: ffffffffffffffda RBX: 000055f99048fa40 RCX: 00007f6a3d7b047b [1642834.465253] RDX: 00007ffe71adba30 RSI: 00000000c0106461 RDI: 000000000000000e [1642834.465254] RBP: 0000000000000002 R08: 000055f98f3f1798 R09: 0000000000000002 [1642834.465255] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000080 [1642834.465257] R13: 000055f98f3f1690 R14: 00000000c0106461 R15: 00007ffe71adba30 Now to take the spinlock during the list iteration, we need to break it down into two phases. In the first phase under the lock, we cannot sleep and so must defer the actual work to a second list, protected by the ggtt->mutex. We also need to hold the spinlock during creation of a new vma to serialise with updates of the tiling on the object. Reported-by: Dave Airlie <airlied@redhat.com> Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422072805.17340-1-chris@chris-wilson.co.uk
2020-04-22drm/i915/selftests: Try to detect rollback during batchbuffer preemptionChris Wilson1-1/+328
Since batch buffers dominant execution time, most preemption requests should naturally occur during execution of a batch buffer. We wish to verify that should a preemption occur within a batch buffer, when we come to restart that batch buffer, it occurs at the interrupted instruction and most importantly does not rollback to an earlier point. v2: Do not clear the GPR at the start of the batch, but rely on them being clear for new contexts. Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422100903.25216-1-chris@chris-wilson.co.uk
2020-04-22drm/i915/selftests: Disable heartbeat around RPS interrupt testingChris Wilson1-4/+55
For verifying reciving the EI interrupts, we need to hold the GPU in very precise conditions (in terms of C0 cycles during the EI). If we preempt the busy load to handle the heartbeat, this may perturb the busy load causing us to miss the interrupt. The other tests, while not as time sensitive, may also be slightly perturbed, so apply the heartbeat protection across all the measurements. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422083855.26842-1-chris@chris-wilson.co.uk
2020-04-21drm/i915/selftests: Unroll the CS frequency loopChris Wilson1-13/+20
Having noticed that MI_BB_START is incurring a memory stall (see the correlation with uncore frequency), we have to unroll the loop in order to diminish the impact of the MI_BB_START on the instruction throughput. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421171351.19575-1-chris@chris-wilson.co.uk
2020-04-21drm/i915/gt: Poison residual state [HWSP] across resume.Chris Wilson3-2/+40
Since we may lose the content of any buffer when we relinquish control of the system (e.g. suspend/resume), we have to be careful not to rely on regaining control. A good method to detect when we might be using garbage is by always injecting that garbage prior to first use on load/resume/etc. v2: Drop sanitize callback on cleanup v3: Move seqno reset to timeline enter, so we reset all timelines. However, this is done on every activation during runtime and not reset. The similar level of paranoia we apply to correcting context state after a period of inactivity. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421092504.7416-1-chris@chris-wilson.co.uk
2020-04-21drm/i915/selftests: Disable C-states when measuring RPS frequency responseChris Wilson1-0/+18
Let's isolate the impact of cpu frequency selection on determing the GPU throughput in response to selection of RPS frequencies. For real systems, we do have to be concerned with the impact of integrating c-states, p-states and rp-states, but for the sake of proving whether or not RPS works, one baby step at a time. For the record, as one would hope, it does not seem to impact on the measured performance, but we do it anyway to reduce the number of variables. Later, we can extend the testing to encourage the the cpu/pkg to try and sleep while the GPU is busy. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421142236.8614-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20200421142236.8614-1-chris@chris-wilson.co.uk
2020-04-21drm/i915/selftests: Show the full scaling curve on failureChris Wilson1-0/+40
If we detect that the RPS end points do not scale perfectly, take the time to measure all the in between values as well. We are aborting the test, so we might as well spend the available time gathering critical debug information instead. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421124636.22554-1-chris@chris-wilson.co.uk
2020-04-21drm/i915/selftests: Show the pstate limits on any failure to reset minChris Wilson1-0/+2
We want to see the pstate limits whenever we fail to set the minimum frequency as that may help for debugging. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420203040.8984-1-chris@chris-wilson.co.uk