summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/msm/adreno
AgeCommit message (Collapse)AuthorFilesLines
2023-01-20Merge tag 'drm-msm-fixes-2023-01-16' of ↵Dave Airlie2-2/+7
https://gitlab.freedesktop.org/drm/msm into drm-fixes msm-fixes for v6.3-rc5 Two GPU fixes which were meant to be part of the previous pull request, but I'd forgotten to fetch from gitlab after the MR was merged so that git tag was applied to the wrong commit. - kexec shutdown fix - fix potential double free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGskguoVsz2wqAK2k+f32LwcVY5JC6+e2RwLqZswz3RY2Q@mail.gmail.com
2023-01-13Merge tag 'drm-msm-fixes-2023-01-12' of ↵Dave Airlie4-12/+21
https://gitlab.freedesktop.org/drm/msm into drm-fixes msm-fixes for v6.3-rc4 Display Fixes: - Fix the documentation for dpu_encoder_phys_wb_init() and dpu_encoder_phys_wb_setup_fb() APIs to address doc warnings - Remove vcca-supply and vdds-supply as mandatory for 14nm PHY and 10nm PHY DT schemas respectively as they are not present on some SOCs using these PHYs - Add the dsi-phy-regulator-ldo-mode to dsi-phy-28nm.yaml as it was missed out during txt to yaml migration - Remove operating-points-v2 and power-domain as a required property for the DSI controller as thats not the case for every SOC - Fix the description from display escape clock to display core clock in the dsi controller yaml - Fix the memory leak for mdp1-mem path for the cases when we return early after failing to get mdp0-mem ICC paths for msm - Fix error handling path in msm_hdmi_dev_probe() to release the phy ref count when devm_pm_runtime_enable() fails - Fix the dp_aux_isr() routine to make sure it doesnt incorrectly signal the aux transaction as complete if the ISR was not an AUX isr. This fixes a big hitter stability bug on chromebooks. - Add protection against null pointer dereference when there is no kms object as in the case of headless adreno GPU in the shutdown path. GPU Fixes: - a5xx: fix quirks to actually be a bitmask and not overwrite each other - a6xx: fix gx halt sequence to avoid 1000ms hang on some devices - kexec shutdown fix - fix potential double free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGv7=in_MHW3kdkhqh7ZFoVCmnikmr29YYHCXR=7aOEneg@mail.gmail.com
2023-01-11drm/msm/gpu: Fix potential double-freeRob Clark1-0/+4
If userspace was calling the MSM_SET_PARAM ioctl on multiple threads to set the COMM or CMDLINE param, it could trigger a race causing the previous value to be kfree'd multiple times. Fix this by serializing on the gpu lock. Signed-off-by: Rob Clark <robdclark@chromium.org> Fixes: d4726d770068 ("drm/msm: Add a way to override processes comm/cmdline") Patchwork: https://patchwork.freedesktop.org/patch/517778/ Link: https://lore.kernel.org/r/20230110212903.1925878-1-robdclark@gmail.com
2023-01-11adreno: Shutdown the GPU properlyJoel Fernandes (Google)1-2/+3
During kexec on ARM device, we notice that device_shutdown() only calls pm_runtime_force_suspend() while shutting down the GPU. This means the GPU kthread is still running and further, there maybe active submits. This causes all kinds of issues during a kexec reboot: Warning from shutdown path: [ 292.509662] WARNING: CPU: 0 PID: 6304 at [...] adreno_runtime_suspend+0x3c/0x44 [ 292.509863] Hardware name: Google Lazor (rev3 - 8) with LTE (DT) [ 292.509872] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 292.509881] pc : adreno_runtime_suspend+0x3c/0x44 [ 292.509891] lr : pm_generic_runtime_suspend+0x30/0x44 [ 292.509905] sp : ffffffc014473bf0 [...] [ 292.510043] Call trace: [ 292.510051] adreno_runtime_suspend+0x3c/0x44 [ 292.510061] pm_generic_runtime_suspend+0x30/0x44 [ 292.510071] pm_runtime_force_suspend+0x54/0xc8 [ 292.510081] adreno_shutdown+0x1c/0x28 [ 292.510090] platform_shutdown+0x2c/0x38 [ 292.510104] device_shutdown+0x158/0x210 [ 292.510119] kernel_restart_prepare+0x40/0x4c And here from GPU kthread, an SError OOPs: [ 192.648789] el1h_64_error+0x7c/0x80 [ 192.648812] el1_interrupt+0x20/0x58 [ 192.648833] el1h_64_irq_handler+0x18/0x24 [ 192.648854] el1h_64_irq+0x7c/0x80 [ 192.648873] local_daif_inherit+0x10/0x18 [ 192.648900] el1h_64_sync_handler+0x48/0xb4 [ 192.648921] el1h_64_sync+0x7c/0x80 [ 192.648941] a6xx_gmu_set_oob+0xbc/0x1fc [ 192.648968] a6xx_hw_init+0x44/0xe38 [ 192.648991] msm_gpu_hw_init+0x48/0x80 [ 192.649013] msm_gpu_submit+0x5c/0x1a8 [ 192.649034] msm_job_run+0xb0/0x11c [ 192.649058] drm_sched_main+0x170/0x434 [ 192.649086] kthread+0x134/0x300 [ 192.649114] ret_from_fork+0x10/0x20 Fix by calling adreno_system_suspend() in the device_shutdown() path. [ Applied Rob Clark feedback on fixing adreno_unbind() similarly, also tested as above. ] Cc: Rob Clark <robdclark@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ricardo Ribalda <ribalda@chromium.org> Cc: Ross Zwisler <zwisler@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reviewed-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/517633/ Link: https://lore.kernel.org/r/20230109222547.1368644-1-joel@joelfernandes.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-01-05drm/msm/a6xx: Avoid gx gbit halt during rpm suspendAkhil P Oommen3-6/+17
As per the downstream driver, gx gbif halt is required only during recovery sequence. So lets avoid it during regular rpm suspend. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/515279/ Link: https://lore.kernel.org/r/20221216223253.1.Ice9c47bfeb1fddb8dc377a3491a043a3ee7fca7d@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-01-05drm/msm/adreno: Make adreno quirks not overwrite each otherKonrad Dybcio1-6/+4
So far the adreno quirks have all been assigned with an OR operator, which is problematic, because they were assigned consecutive integer values, which makes checking them with an AND operator kind of no bueno.. Switch to using BIT(n) so that only the quirks that the programmer chose are taken into account when evaluating info->quirks & ADRENO_QUIRK_... Fixes: 370063ee427a ("drm/msm/adreno: Add A540 support") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/516456/ Link: https://lore.kernel.org/r/20230102100201.77286-1-konrad.dybcio@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-11-30Merge tag 'drm-msm-next-2022-11-28' of ↵Dave Airlie5-54/+67
https://gitlab.freedesktop.org/drm/msm into drm-next msm-next for v6.2 (the gpu/gem bits) - Remove exclusive-fence hack that caused over-synchronization - Fix speed-bin detection vs. probe-defer - Enable clamp_to_idle on 7c3 - Improved hangcheck detection Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvT1h_S4d=YRgphgR8i7aMaxQaNW8mru7QaoUo9uiUk2A@mail.gmail.com
2022-11-17drm/msm: Hangcheck progress detectionRob Clark1-0/+34
If the hangcheck timer expires, check if the fw's position in the cmdstream has advanced (changed) since last timer expiration, and allow it up to three additional "extensions" to it's alotted time. The intention is to continue to catch "shader stuck in a loop" type hangs quickly, but allow more time for things that are actually making forward progress. Because we need to sample the CP state twice to detect if there has not been progress, this also cuts the the timer's duration in half. v2: Fix typo (REG_A6XX_CP_CSQ_IB2_STAT), add comment v3: Only halve hangcheck timer duration for generations which support progress detection (hdanton); removed unused a5xx progress (without knowing how to adjust for data buffered in ROQ it is too likely to report a false negative) v4: Comment updates to better describe the total hangcheck duration when progress detection is applied Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Tested-by: Chia-I Wu <olvaffe@gmail.com> # dEQP-GLES2.functional.flush_finish.wait Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/511584/ Link: https://lore.kernel.org/r/20221114193049.1533391-3-robdclark@gmail.com
2022-11-17drm/msm/adreno: Simplify read64/write64 helpersRob Clark5-40/+21
The _HI reg is always following the _LO reg, so no need to pass these offsets seprately. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/511581/ Link: https://lore.kernel.org/r/20221114193049.1533391-2-robdclark@gmail.com
2022-11-17drm/msm: Enable clamp_to_idle for 7c3Rob Clark1-7/+7
This was overlooked. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/511693/ Link: https://lore.kernel.org/r/20221115155535.1615278-1-robdclark@gmail.com
2022-11-17drm/msm/a6xx: Fix speed-bin detection vs probe-deferRob Clark1-7/+5
If we get an error (other than -ENOENT) we need to propagate that up the stack. Otherwise if the nvmem driver hasn't probed yet, we'll end up end up claiming that we support all the OPPs which is not likely to be true (and on some generations impossible to be true, ie. if there are conflicting OPPs). v2: Update commit msg, gc unused label, etc v3: Add previously missing \n's Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/511690/ Link: https://lore.kernel.org/r/20221115154637.1613968-1-robdclark@gmail.com
2022-11-03drm/msm: remove duplicated code from a6xx_create_address_spaceDmitry Baryshkov6-33/+20
The function a6xx_create_address_space() is mostly a copy of adreno_iommu_create_address_space() with added quirk setting. Rework these two functions to be a thin wrappers around a common helper. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/509614/ Link: https://lore.kernel.org/r/20221102175449.452283-3-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-11-03drm/msm: move domain allocation into msm_iommu_new()Dmitry Baryshkov4-37/+25
After the msm_iommu instance is created, the IOMMU domain is completely handled inside the msm_iommu code. Move the iommu_domain_alloc() call into the msm_iommu_new() to simplify callers code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/509615/ Link: https://lore.kernel.org/r/20221102175449.452283-2-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-10-14drm/msm/a6xx: Remove state objects from list before freeingRob Clark1-1/+3
Technically it worked as it was before, only because it was using the _safe version of the iterator. But it is sloppy practice to leave dangling pointers. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/507017/ Link: https://lore.kernel.org/r/20221013225520.371226-4-robdclark@gmail.com
2022-10-14drm/msm/a6xx: Skip snapshotting unused GMU buffersRob Clark1-0/+3
Some buffers are unused on certain sub-generations of a6xx. So just skip them. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/507013/ Link: https://lore.kernel.org/r/20221013225520.371226-3-robdclark@gmail.com
2022-10-14drm/msm/a6xx: Fix kvzalloc vs state_kcalloc usageRob Clark2-2/+16
adreno_show_object() is a trap! It will re-allocate the pointer it is passed on first call, when the data is ascii85 encoded, using kvmalloc/ kvfree(). Which means the data *passed* to it must be kvmalloc'd, ie. we cannot use the state_kcalloc() helper. This partially reverts commit ec8f1813bf8d ("drm/msm/a6xx: Replace kcalloc() with kvzalloc()"), but adds the missing kvfree() to fix the memory leak that was present previously. And adds a warning comment. Fixes: ec8f1813bf8d ("drm/msm/a6xx: Replace kcalloc() with kvzalloc()") Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/20 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/507014/ Link: https://lore.kernel.org/r/20221013225520.371226-2-robdclark@gmail.com
2022-09-30drm/msm/gpu: Fix crash during system suspend after unbindAkhil P Oommen1-1/+9
In adreno_unbind, we should clean up gpu device's drvdata to avoid accessing a stale pointer during system suspend. Also, check for NULL ptr in both system suspend/resume callbacks. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/505075/ Link: https://lore.kernel.org/r/20220928124830.2.I5ee0ac073ccdeb81961e5ec0cce5f741a7207a71@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-09-30drm/msm/a6xx: Replace kcalloc() with kvzalloc()Akhil P Oommen1-9/+3
In order to reduce chance of allocation failure while capturing a6xx gpu state, use kvzalloc() instead of kcalloc() in state_kcalloc(). Indirectly, this patch helps to fix leaking memory allocated for gmu_debug object. Fixes: b859f9b009b (drm/msm/gpu: Snapshot GMU debug buffer) Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/505074/ Link: https://lore.kernel.org/r/20220928124830.1.I8ea24a8d586b4978823b848adde000f92f74d5c2@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-09-01drm/msm/a6xx: Handle GMU prepare-slumber hfi failureAkhil P Oommen1-1/+5
When prepare-slumber hfi fails, we should follow a6xx_gmu_force_off() sequence. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/498401/ Link: https://lore.kernel.org/r/20220819015030.v5.7.I54815c7c36b80d4725cd054e536365250454452f@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm/a6xx: Improve gpu recovery sequenceAkhil P Oommen3-30/+58
We can do a few more things to improve our chance at a successful gpu recovery, especially during a hangcheck timeout: 1. Halt CP and GMU core 2. Do RBBM GBIF HALT sequence 3. Do a soft reset of GPU core Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/498400/ Link: https://lore.kernel.org/r/20220819015030.v5.6.Idf2ba51078e87ae7ceb75cc77a5bd4ff2bd31eab@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm/a6xx: Ensure CX collapse during gpu recoveryAkhil P Oommen1-0/+4
Because there could be transient votes from other drivers/tz/hyp which may keep the cx gdsc enabled, we should poll until cx gdsc collapses. We can use the reset framework to poll for cx gdsc collapse from gpucc clk driver. This feature requires support from the platform's gpucc driver. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Patchwork: https://patchwork.freedesktop.org/patch/498397/ Link: https://lore.kernel.org/r/20220819015030.v5.5.I176567525af2b9439a7e485d0ca130528666a55c@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm: Fix cx collapse issue during recoveryAkhil P Oommen1-3/+29
There are some hardware logic under CX domain. For a successful recovery, we should ensure cx headswitch collapses to ensure all the stale states are cleard out. This is especially true to for a6xx family where we can GMU co-processor. Currently, cx doesn't collapse due to a devlink between gpu and its smmu. So the *struct gpu device* needs to be runtime suspended to ensure that the iommu driver removes its vote on cx gdsc. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/498398/ Link: https://lore.kernel.org/r/20220819015030.v5.4.I4ac27a0b34ea796ce0f938bb509e257516bc6f57@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm: De-open-code some CP_EVENT_WRITERob Clark3-3/+3
Replace some open coding to improve readability. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/499272/ Link: https://lore.kernel.org/r/20220821155441.1092134-1-robdclark@gmail.com
2022-07-06drm/msm/adreno: Defer enabling runpm until hw_init()Rob Clark2-1/+6
To avoid preventing the display from coming up before the rootfs is mounted, without resorting to packing fw in the initrd, the GPU has this limbo state where the device is probed, but we aren't ready to start sending commands to it. This is particularly problematic for a6xx, since the GMU (which requires fw to be loaded) is the one that is controlling the power/clk/icc votes. So defer enabling runpm until we are ready to call gpu->hw_init(), as that is a point where we know we have all the needed fw and are ready to start sending commands to the coproc's. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/489337/ Link: https://lore.kernel.org/r/20220613182036.2567963-1-robdclark@gmail.com
2022-07-06drm/msm/adreno: Do not propagate void return valuesGeert Uytterhoeven3-4/+4
With sparse ("make C=2"), lots of error: return expression in void function messages are seen. Fix this by removing the return statements to propagate void return values. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Patchwork: https://patchwork.freedesktop.org/patch/492529/ Link: https://lore.kernel.org/r/0083bc7e23753c19902580b902582ae499b44dbf.1657113388.git.geert@linux-m68k.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm/gpu: Add GEM debug label to devcoreRob Clark1-0/+1
When trying to understand an iova fault devcore, once you figure out which buffer we accessed beyond the end of, it is useful to see the buffer's debug label. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/491910/ Link: https://lore.kernel.org/r/20220629211919.563585-3-robdclark@gmail.com
2022-07-06drm/msm: Fix %d vs %uRob Clark1-5/+5
In debugging fence rollover, I noticed that GPU state capture and devcore dumps were showing me negative fence numbers. Let's fix that and some related signed vs unsigned confusion. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/489621/ Link: https://lore.kernel.org/r/20220615163532.3013035-1-robdclark@gmail.com
2022-07-06drm/msm/adreno: Allow larger address space sizeRob Clark4-1/+24
The restriction to 4G was strictly to work around 64b math bug in some versions of SQE firmware. This appears to be fixed in a650+ SQE fw, so allow a larger address space size on these devices. Also, add a modparam override for debugging and igt. v2: Send the right version of the patch (ie. the one that actually compiles) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/487601/ Link: https://lore.kernel.org/r/20220529180428.2577832-1-robdclark@gmail.com
2022-07-06drm/msm/adreno: Fix up formattingKonrad Dybcio1-9/+8
Leading spaces are not something checkpatch likes, and it says so when they are present. Use tabs consistently to indent function body and unwrap a 83-char-long line, as 100 is cool nowadays. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487592/ Link: https://lore.kernel.org/r/20220528160353.157870-4-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm/a6xx: Add speedbin support for A619 GPUKonrad Dybcio1-0/+19
There are various SKUs of A619, ranging from 565 MHz to 850 MHz, depending on the bin. Add support for distinguishing them, so that proper frequency ranges can be applied, depending on the HW. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487590/ Link: https://lore.kernel.org/r/20220528160353.157870-3-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm/adreno: Add A619 supportKonrad Dybcio5-6/+169
Add support for the Adreno 619 GPU, as found in Snapdragon 690 (SM6350), 480 (SM4350) and 750G (SM7225). Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487588/ Link: https://lore.kernel.org/r/20220528160353.157870-2-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm/adreno: Remove dead codeKonrad Dybcio1-2/+0
This BUG_ON will never be reached, and there is a comment 20 above explaining why. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487586/ Link: https://lore.kernel.org/r/20220528160353.157870-1-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm: Avoid unclocked GMU register access in 6xx gpu_busyDouglas Anderson4-24/+12
From testing on sc7180-trogdor devices, reading the GMU registers needs the GMU clocks to be enabled. Those clocks get turned on in a6xx_gmu_resume(). Confusingly enough, that function is called as a result of the runtime_pm of the GPU "struct device", not the GMU "struct device". Unfortunately the current a6xx_gpu_busy() grabs a reference to the GMU's "struct device". The fact that we were grabbing the wrong reference was easily seen to cause crashes that happen if we change the GPU's pm_runtime usage to not use autosuspend. It's also believed to cause some long tail GPU crashes even with autosuspend. We could look at changing it so that we do pm_runtime_get_if_in_use() on the GPU's "struct device", but then we run into a different problem. pm_runtime_get_if_in_use() will return 0 for the GPU's "struct device" the whole time when we're in the "autosuspend delay". That is, when we drop the last reference to the GPU but we're waiting a period before actually suspending then we'll think the GPU is off. One reason that's bad is that if the GPU didn't actually turn off then the cycle counter doesn't lose state and that throws off all of our calculations. Let's change the code to keep track of the suspend state of devfreq. msm_devfreq_suspend() is always called before we actually suspend the GPU and msm_devfreq_resume() after we resume it. This means we can use the suspended state to know if we're powered or not. NOTE: one might wonder when exactly our status function is called when devfreq is supposed to be disabled. The stack crawl I captured was: msm_devfreq_get_dev_status devfreq_simple_ondemand_func devfreq_update_target qos_notifier_call qos_max_notifier_call blocking_notifier_call_chain pm_qos_update_target freq_qos_apply apply_constraint __dev_pm_qos_update_request dev_pm_qos_update_request msm_devfreq_idle_work Fixes: eadf79286a4b ("drm/msm: Check for powered down HW in the devfreq callbacks") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/489124/ Link: https://lore.kernel.org/r/20220610124639.v4.1.Ie846c5352bc307ee4248d7cab998ab3016b85d06@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-28Merge tag 'drm-msm-fixes-2022-06-28' into msm-next-stagingRob Clark1-4/+10
Merge v5.19 fixes to avoid merge conflicts Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-18drm/msm: Don't overwrite hw fence in hw_initRob Clark1-3/+8
Prior to the last commit, this could result in setting the GPU written fence value back to an older value, if we had missed updating completed_fence prior to suspend. This was mostly harmless as the GPU would eventually overwrite it again with the correct value. But we should just not do this. Instead just leave a sanity check that the fence looks plausible (in case the GPU scribbled on memory). Reported-by: Steev Klimaszewski <steev@kali.org> Fixes: 95d1deb02a9c ("drm/msm/gem: Add fenced vma unpin") Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Steev Klimaszewski <steev@kali.org> Patchwork: https://patchwork.freedesktop.org/patch/490138/ Link: https://lore.kernel.org/r/20220618161120.3451993-2-robdclark@gmail.com
2022-06-07drm/msm: Fix double pm_runtime_disable() callMaximilian Luz1-1/+2
Following commit 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}"), any call to adreno_unbind() will disable runtime PM twice, as indicated by the call trees below: adreno_unbind() -> pm_runtime_force_suspend() -> pm_runtime_disable() adreno_unbind() -> gpu->funcs->destroy() [= aNxx_destroy()] -> adreno_gpu_cleanup() -> pm_runtime_disable() Note that pm_runtime_force_suspend() is called right before gpu->funcs->destroy() and both functions are called unconditionally. With recent addition of the eDP AUX bus code, this problem manifests itself when the eDP panel cannot be found yet and probing is deferred. On the first probe attempt, we disable runtime PM twice as described above. This then causes any later probe attempt to fail with [drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -13 preventing the driver from loading. As there seem to be scenarios where the aNxx_destroy() functions are not called from adreno_unbind(), simply removing pm_runtime_disable() from inside adreno_unbind() does not seem to be the proper fix. This is what commit 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}") intended to fix. Therefore, instead check whether runtime PM is still enabled, and only disable it in that case. Fixes: 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Link: https://lore.kernel.org/r/20220606211305.189585-1-luzmaximilian@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-05-20Merge tag 'msm-next-5.19-fixes' of ↵Dave Airlie1-0/+1
https://gitlab.freedesktop.org/abhinavk/msm into drm-next 5.19 fixes for msm-next - Limiting WB modes to max sspp linewidth - Fixing the supported rotations to add 180 back for IGT - Fix to handle pm_runtime_get_sync() errors to avoid unclocked access in the bind() path for dpu driver - Fix the irq_free() without request issue which was a big-time hitter in the CI-runs. Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Dave Airlie <airlied@redhat.com> From: Abhinav Kumar <quic_abhinavk@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/b011d51d-d634-123e-bf5f-27219ee33151@quicinc.com
2022-05-18drm/msm/a6xx: Fix refcount leak in a6xx_gpu_initMiaoqian Lin1-0/+1
of_parse_phandle() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. a6xx_gmu_init() passes the node to of_find_device_by_node() and of_dma_configure(), of_find_device_by_node() will takes its reference, of_dma_configure() doesn't need the node after usage. Add missing of_node_put() to avoid refcount leak. Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Link: https://lore.kernel.org/r/20220512121955.56937-1-linmq006@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-05-11Merge tag 'drm-msm-next-2022-05-09' of ↵Dave Airlie5-32/+80
https://gitlab.freedesktop.org/drm/msm into drm-next - Fourcc modifier for tiled but not compressed layouts - Support for userspace allocated IOVA (GPU virtual address) - Devfreq clamp_to_idle fix - DPU: DSC (Display Stream Compression) support - DPU: inline rotation support on SC7280 - DPU: update DP timings to follow vendor recommendations - DP, DPU: add support for wide bus (on newer chipsets) - DP: eDP support - Merge DPU1 and MDP5 MDSS driver, make dpu/mdp device the master component - MDSS: optionally reset the IP block at the bootup to drop bootloader state - Properly register and unregister internal bridges in the DRM framework - Complete DPU IRQ cleanup - DP: conversion to use drm_bridge and drm_bridge_connector - eDP: drop old eDP parts again - DPU: writeback support - Misc small fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvJCr_1D8d0dgmyQC5HD4gmXeZw=bFV_CNCfceZbpMxRw@mail.gmail.com
2022-05-02drm/msm: Fix null pointer dereferences without iommuLuca Weiss1-1/+4
Check if 'aspace' is set before using it as it will stay null without IOMMU, such as on msm8974. Fixes: bc2112583a0b ("drm/msm/gpu: Track global faults per address-space") Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Link: https://lore.kernel.org/r/20220421203455.313523-1-luca@z3ntu.xyz Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm: simplify gpu_busy callbackChia-I Wu2-22/+12
Move tracking and busy time calculation to msm_devfreq_get_dev_status. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Cc: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220416003314.59211-2-olvaffe@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm: Add a way for userspace to allocate GPU iovaRob Clark1-0/+10
The motivation at this point is mainly native userspace mesa driver in a VM guest. The one remaining synchronous "hotpath" is buffer allocation, because guest needs to wait to know the bo's iova before it can start emitting cmdstream/state that references the new bo. By allocating the iova in the guest userspace, we no longer need to wait for a response from the host, but can just rely on the allocation request being processed before the cmdstream submission. Allocation failures (OoM, etc) would just be treated as context-lost (ie. GL_GUILTY_CONTEXT_RESET) or subsequent allocations (or readpix, etc) can raise GL_OUT_OF_MEMORY. v2: Fix inuse check v3: Change mismatched iova case to -EBUSY Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Link: https://lore.kernel.org/r/20220411215849.297838-11-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm/gem: Drop PAGE_SHIFT for address space mmRob Clark1-1/+1
Get rid of all the unnecessary conversion between address/size and page offsets. It just confuses things. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220411215849.297838-6-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm/gpu: Drop duplicate fence counterRob Clark3-4/+4
The ring seqno counter duplicates the fence-context last_fence counter. They end up getting incremented in lock-step, on the same scheduler thread, but the split just makes things less obvious. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220411215849.297838-3-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm: Add a way to override processes comm/cmdlineRob Clark1-3/+40
In the cause of using the GPU via virtgpu, the host side process is really a sort of proxy, and not terribly interesting from the PoV of crash/fault logging. Add a way to override these per process so that we can see the guest process's name. v2: Handle kmalloc failure, add comment to explain kstrdup returns NULL if passed NULL [Dan Carpenter] Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220317165144.222101-4-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm: Add support for pointer paramsRob Clark2-4/+12
The 64b value field is already suffient to hold a pointer instead of immediate, but we also need a length field. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220317165144.222101-2-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-11drm/msm/gpu: Avoid -Wunused-function with !CONFIG_PM_SLEEPNathan Chancellor1-5/+2
When building with CONFIG_PM=y and CONFIG_PM_SLEEP=n (such as ARCH=riscv allmodconfig), the following warnings/errors occur: drivers/gpu/drm/msm/adreno/adreno_device.c:679:12: error: 'adreno_system_resume' defined but not used [-Werror=unused-function] 679 | static int adreno_system_resume(struct device *dev) | ^~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/msm/adreno/adreno_device.c:655:12: error: 'adreno_system_suspend' defined but not used [-Werror=unused-function] 655 | static int adreno_system_suspend(struct device *dev) | ^~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors These functions are only used in SET_SYSTEM_SLEEP_PM_OPS(), which evaluates to empty when CONFIG_PM_SLEEP is not set, making these functions unused. To resolve this, use the SYSTEM_SLEEP_PM_OPS() and RUNTIME_PM_OPS() macros, which were introduced in commit 1a3c7bb08826 ("PM: core: Add new *_PM_OPS macros, deprecate old ones"). They are designed to avoid these compiler warnings while still guarding their use on CONFIG_PM{,_SLEEP}=y. Fixes: 7e4167c9e021 ("drm/msm/gpu: Park scheduler threads for system suspend") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20220411181249.2758344-1-nathan@kernel.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-11drm/msm: Fix range size vs end confusionRob Clark1-1/+1
The fourth param is size, rather than range_end. Note that we could increase the address space size if we had a way to prevent buffers from spanning a 4G split, mostly just to avoid fw bugs with 64b math. Fixes: 84c31ee16f90 ("drm/msm/a6xx: Add support for per-instance pagetables") Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220407202836.1211268-1-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-03-24drm/msm/gpu: Remove mutex from wait_event conditionRob Clark1-10/+1
The mutex wasn't really protecting anything before. Before the previous patch we could still be racing with the scheduler's kthread, as that is not necessarily frozen yet. Now that we've parked the sched threads, the only race is with jobs retiring, and that is harmless, ie. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220310234611.424743-4-robdclark@gmail.com
2022-03-24drm/msm/gpu: Park scheduler threads for system suspendRob Clark1-4/+64
In the system suspend path, we don't want to be racing with the scheduler kthreads pushing additional queued up jobs to the hw queue (ringbuffer). So park them first. While we are at it, move the wait for active jobs to complete into the new system- suspend path. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220310234611.424743-3-robdclark@gmail.com